Spark SQL Joins: A Complete Guide

Join types

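  • inner: the default join type; keeps only rows whose keys match on both sides
  • cross: the Cartesian product of the two tables
  • outer, full, full_outer: keeps all rows from both sides; columns with no match are filled with NULL
  • left, left_outer: keeps every row from the left side; right-side columns with no match are filled with NULL
  • right, right_outer: keeps every row from the right side; left-side columns with no match are filled with NULL
  • left_semi: keeps the left rows whose key appears on the right and outputs only the left-side columns; effectively an inner join with the right-side columns dropped
  • left_anti: keeps the left rows whose key does not appear on the right and outputs only the left-side columns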
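To make the list above concrete, here is a minimal sketch of the same semantics using plain Scala collections, with no Spark involved. The object and method names (`JoinSemanticsSketch`, `inner`, `leftOuter`, `leftSemi`, `leftAnti`) are made up for illustration, and `None` stands in for NULL. It only models which rows come out, not how Spark actually executes a join.

```scala
// Plain Scala collections only; no Spark required.
// A sketch of the join semantics, not of Spark's execution strategy.
object JoinSemanticsSketch {
  val left: Seq[(Int, String)]  = Seq((1, "beijing"), (2, "chongqing"), (3, "shanghai"))
  val right: Seq[(Int, String)] = Seq((1, "北京"), (2, "重慶"), (4, "成都"))

  // Keys that exist on the right side.
  private val rightKeys: Set[Int] = right.map(_._1).toSet

  // inner: one output row per matching (left, right) pair.
  def inner: Seq[(Int, String, String)] =
    for ((lid, base) <- left; (rid, name) <- right if lid == rid) yield (lid, base, name)

  // left_outer: every left row; None plays the role of NULL when nothing matches.
  def leftOuter: Seq[(Int, String, Option[String])] =
    left.flatMap { case (lid, base) =>
      val matches = right.collect { case (rid, name) if rid == lid => name }
      if (matches.isEmpty) Seq((lid, base, Option.empty[String]))
      else matches.map(name => (lid, base, Option(name)))
    }

  // left_semi: left rows whose key exists on the right; left columns only.
  def leftSemi: Seq[(Int, String)] =
    left.filter { case (id, _) => rightKeys.contains(id) }

  // left_anti: left rows whose key does not exist on the right.
  def leftAnti: Seq[(Int, String)] =
    left.filterNot { case (id, _) => rightKeys.contains(id) }

  def main(args: Array[String]): Unit = {
    println(inner)
    println(leftOuter)
    println(leftSemi)
    println(leftAnti)
  }
}
```

Note that `leftSemi` is a filter on the left side, so a left row appears at most once even if several right rows share its key, whereas `inner` would emit one row per matching pair.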

Reference code

import org.apache.spark.sql.SparkSession

/**
  * 
  * date 2020/7/5
  */
object TestSparkSql extends App {


  val spark = SparkSession
    .builder()
    .appName("Spark SQL basic example")
    .master("local[2]")
    .getOrCreate()

  import spark.implicits._

  val table1 = Seq((1, "beijing"), (2, "chongqing"), (3, "shanghai")).toDF("id", "base").as("table1")
  val table2 = Seq((1, "北京"), (2, "重慶"), (4, "成都")).toDF("id", "name").as("table2")

 


  // default join (inner)
  println("------------------default join style (inner)----------------")
  table1.join(table2, $"table1.id" === $"table2.id").show()


  //inner
  println("------------------inner style----------------")
  table1.join(table2, $"table1.id" === $"table2.id", "inner").show()
  
  
  println("------------------outer style----------------")
  table1.join(table2, $"table1.id" === $"table2.id", "outer").show()
  
  //cross
  println("------------------cross style----------------")
  table1.crossJoin(table2).show()

  //full
  println("------------------full style----------------")
  table1.join(table2, $"table1.id" === $"table2.id", "full").show()


  println("------------------full_outer style----------------")
  table1.join(table2, $"table1.id" === $"table2.id", "full_outer").show()


  println("------------------left----------------")
  table1.join(table2, $"table1.id" === $"table2.id", "left").show()

  println("------------------left_outer----------------")
  table1.join(table2, $"table1.id" === $"table2.id", "left_outer").show()

  println("------------------right----------------")
  table1.join(table2, $"table1.id" === $"table2.id", "right").show()

  println("------------------right_outer----------------")
  table1.join(table2, $"table1.id" === $"table2.id", "right_outer").show()


  println("------------------left_semi----------------")
  table1.join(table2, $"table1.id" === $"table2.id", "left_semi").show()

  println("------------------left_anti----------------")
  table1.join(table2, $"table1.id" === $"table2.id", "left_anti").show()


}
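Because the listing above joins on a column expression, both `id` columns survive in the output of the inner, outer, left and right joins. As a hedged variation, `join` also has an overload that takes the key names as a `Seq[String]`, which merges the key into a single column. The object name `UsingColumnsJoinSketch` and the helper `joinOnId` are hypothetical names for this sketch; the data is the same as above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object UsingColumnsJoinSketch {

  // Left outer join on the shared "id" column, passed by name.
  // The result has the columns id, base, name (one id column, not two).
  def joinOnId(spark: SparkSession): DataFrame = {
    import spark.implicits._

    val table1 = Seq((1, "beijing"), (2, "chongqing"), (3, "shanghai")).toDF("id", "base")
    val table2 = Seq((1, "北京"), (2, "重慶"), (4, "成都")).toDF("id", "name")

    table1.join(table2, Seq("id"), "left_outer")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .appName("usingColumns join sketch")
      .master("local[2]")
      .getOrCreate()

    val joined = joinOnId(spark)
    joined.printSchema()
    joined.orderBy("id").show()

    spark.stop()
  }
}
```

The expression form used in the main listing is still the one to reach for when the key columns have different names on each side or the condition is not a plain equality.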

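Results
(The output for "outer" is not listed; "outer" is an alias of "full" and "full_outer", so it returns the same rows.)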

------------------default join style (inner)----------------
+---+---------+---+----+
| id|     base| id|name|
+---+---------+---+----+
|  1|  beijing|  1|  北京|
|  2|chongqing|  2|  重慶|
+---+---------+---+----+

------------------inner style----------------
+---+---------+---+----+
| id|     base| id|name|
+---+---------+---+----+
|  1|  beijing|  1|  北京|
|  2|chongqing|  2|  重慶|
+---+---------+---+----+

------------------cross style----------------

+---+---------+---+----+
| id|     base| id|name|
+---+---------+---+----+
|  1|  beijing|  1|  北京|
|  1|  beijing|  2|  重慶|
|  1|  beijing|  4|  成都|
|  2|chongqing|  1|  北京|
|  2|chongqing|  2|  重慶|
|  2|chongqing|  4|  成都|
|  3| shanghai|  1|  北京|
|  3| shanghai|  2|  重慶|
|  3| shanghai|  4|  成都|
+---+---------+---+----+


------------------full style----------------

+----+---------+----+----+
|  id|     base|  id|name|
+----+---------+----+----+
|   1|  beijing|   1|  北京|
|   3| shanghai|null|null|
|null|     null|   4|  成都|
|   2|chongqing|   2|  重慶|
+----+---------+----+----+

------------------full_outer style----------------

+----+---------+----+----+
|  id|     base|  id|name|
+----+---------+----+----+
|   1|  beijing|   1|  北京|
|   3| shanghai|null|null|
|null|     null|   4|  成都|
|   2|chongqing|   2|  重慶|
+----+---------+----+----+

------------------left----------------
+---+---------+----+----+
| id|     base|  id|name|
+---+---------+----+----+
|  1|  beijing|   1|  北京|
|  2|chongqing|   2|  重慶|
|  3| shanghai|null|null|
+---+---------+----+----+

------------------left_outer----------------
+---+---------+----+----+
| id|     base|  id|name|
+---+---------+----+----+
|  1|  beijing|   1|  北京|
|  2|chongqing|   2|  重慶|
|  3| shanghai|null|null|
+---+---------+----+----+

------------------right----------------

+----+---------+---+----+
|  id|     base| id|name|
+----+---------+---+----+
|   1|  beijing|  1|  北京|
|   2|chongqing|  2|  重慶|
|null|     null|  4|  成都|
+----+---------+---+----+

------------------right_outer----------------

+----+---------+---+----+
|  id|     base| id|name|
+----+---------+---+----+
|   1|  beijing|  1|  北京|
|   2|chongqing|  2|  重慶|
|null|     null|  4|  成都|
+----+---------+---+----+

------------------left_semi----------------
+---+---------+
| id|     base|
+---+---------+
|  1|  beijing|
|  2|chongqing|
+---+---------+

------------------left_anti----------------
+---+--------+
| id|    base|
+---+--------+
|  3|shanghai|
+---+--------+
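
The semi and anti joins shown last can also be written as SQL text rather than through the DataFrame API. The following is a minimal sketch, assuming the two DataFrames are registered as temporary views named `table1` and `table2`; the object name `SqlJoinSketch` and the helper `semiAndAnti` are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object SqlJoinSketch {

  // Returns (left semi result, left anti result) computed through Spark SQL.
  def semiAndAnti(spark: SparkSession): (DataFrame, DataFrame) = {
    import spark.implicits._

    Seq((1, "beijing"), (2, "chongqing"), (3, "shanghai"))
      .toDF("id", "base")
      .createOrReplaceTempView("table1")
    Seq((1, "北京"), (2, "重慶"), (4, "成都"))
      .toDF("id", "name")
      .createOrReplaceTempView("table2")

    // SELECT * yields only table1's columns for semi and anti joins.
    val semi = spark.sql(
      "SELECT * FROM table1 LEFT SEMI JOIN table2 ON table1.id = table2.id")
    val anti = spark.sql(
      "SELECT * FROM table1 LEFT ANTI JOIN table2 ON table1.id = table2.id")

    (semi, anti)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .appName("SQL join sketch")
      .master("local[2]")
      .getOrCreate()

    val (semi, anti) = semiAndAnti(spark)
    semi.orderBy("id").show()
    anti.orderBy("id").show()

    spark.stop()
  }
}
```

Which form to use is mostly a matter of taste: the SQL text is convenient when the query already exists as SQL, while the DataFrame call keeps the join type and condition checked as ordinary Scala values.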

