Spark Sql Join 全解

JOIN类型

  • inner 默认连接,连接左右都匹配上的
  • cross 笛卡尓积
  • outer, full, full_outer 左右的结果全部列出,没匹配上的NULL代替
  • left, left_outer 左边的全要,没匹配上的NULL代替
  • right, right_outer 右边的全要,没匹配上的NULL代替
  • left_semi 键在右边出现,只包括左边的,就是inner去掉右边行
  • left_anti 键在右边出现没有出现,只保留左侧的

参考代码

import org.apache.spark.sql.SparkSession

/**
  * 
  * date 2020/7/5
  */
object TestSparkSql extends App {


  val spark = SparkSession
    .builder()
    .appName("Spark SQL basic example")
    .master("local[2]")
    .getOrCreate()

  import spark.implicits._

  val table1 = Seq((1, "beijing"), (2, "chongqing"), (3, "shanghai")).toDF("id", "base").as("table1")
  val table2 = Seq((1, "北京"), (2, "重庆"), (4, "成都")).toDF("id", "name").as("table2")

 


  //默认join
  println("------------------默认 join style (inner)----------------")
  table1.join(table2, $"table1.id" === $"table2.id").show()


  //inner
  println("------------------inner style----------------")
  table1.join(table2, $"table1.id" === $"table2.id", "inner").show()
  
  
  println("------------------outer style----------------")
  table1.join(table2, $"table1.id" === $"table2.id", "outer").show()
  
  //cross
  println("------------------cross style----------------")
  table1.crossJoin(table2).show()

  //full
  println("------------------full style----------------")
  table1.join(table2, $"table1.id" === $"table2.id", "full").show()


  println("------------------full_outer style----------------")
  table1.join(table2, $"table1.id" === $"table2.id", "full_outer").show()


  println("------------------left----------------")
  table1.join(table2, $"table1.id" === $"table2.id", "left").show()

  println("------------------left_outer----------------")
  table1.join(table2, $"table1.id" === $"table2.id", "left_outer").show()

  println("------------------right----------------")
  table1.join(table2, $"table1.id" === $"table2.id", "right").show()

  println("------------------right_outer----------------")
  table1.join(table2, $"table1.id" === $"table2.id", "right_outer").show()


  println("------------------left_semi----------------")
  table1.join(table2, $"table1.id" === $"table2.id", "left_semi").show()

  println("------------------left_anti----------------")
  table1.join(table2, $"table1.id" === $"table2.id", "left_anti").show()


}

结果

------------------默认 join style----------------
+---+---------+---+----+
| id|     base| id|name|
+---+---------+---+----+
|  1|  beijing|  1|  北京|
|  2|chongqing|  2|  重庆|
+---+---------+---+----+

------------------inner style----------------
+---+---------+---+----+
| id|     base| id|name|
+---+---------+---+----+
|  1|  beijing|  1|  北京|
|  2|chongqing|  2|  重庆|
+---+---------+---+----+

------------------cross style----------------

+---+---------+---+----+
| id|     base| id|name|
+---+---------+---+----+
|  1|  beijing|  1|  北京|
|  1|  beijing|  2|  重庆|
|  1|  beijing|  4|  成都|
|  2|chongqing|  1|  北京|
|  2|chongqing|  2|  重庆|
|  2|chongqing|  4|  成都|
|  3| shanghai|  1|  北京|
|  3| shanghai|  2|  重庆|
|  3| shanghai|  4|  成都|
+---+---------+---+----+


------------------full style----------------

+----+---------+----+----+
|  id|     base|  id|name|
+----+---------+----+----+
|   1|  beijing|   1|  北京|
|   3| shanghai|null|null|
|null|     null|   4|  成都|
|   2|chongqing|   2|  重庆|
+----+---------+----+----+

------------------full_outer style----------------

+----+---------+----+----+
|  id|     base|  id|name|
+----+---------+----+----+
|   1|  beijing|   1|  北京|
|   3| shanghai|null|null|
|null|     null|   4|  成都|
|   2|chongqing|   2|  重庆|
+----+---------+----+----+

------------------left----------------
+---+---------+----+----+
| id|     base|  id|name|
+---+---------+----+----+
|  1|  beijing|   1|  北京|
|  2|chongqing|   2|  重庆|
|  3| shanghai|null|null|
+---+---------+----+----+

------------------left_outer----------------
+---+---------+----+----+
| id|     base|  id|name|
+---+---------+----+----+
|  1|  beijing|   1|  北京|
|  2|chongqing|   2|  重庆|
|  3| shanghai|null|null|
+---+---------+----+----+

------------------right----------------

+----+---------+---+----+
|  id|     base| id|name|
+----+---------+---+----+
|   1|  beijing|  1|  北京|
|   2|chongqing|  2|  重庆|
|null|     null|  4|  成都|
+----+---------+---+----+

------------------right_outer----------------

+----+---------+---+----+
|  id|     base| id|name|
+----+---------+---+----+
|   1|  beijing|  1|  北京|
|   2|chongqing|  2|  重庆|
|null|     null|  4|  成都|
+----+---------+---+----+

------------------left_semi----------------
+---+---------+
| id|     base|
+---+---------+
|  1|  beijing|
|  2|chongqing|
+---+---------+

------------------left_anti----------------
+---+--------+
| id|    base|
+---+--------+
|  3|shanghai|
+---+--------+


發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章