Spark RangeDependency (Range Dependency)
- Represents a one-to-one dependency between ranges of partitions in the parent and child RDDs.
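To make the mapping concrete, here is a minimal sketch of the partition-index arithmetic behind `RangeDependency.getParents` (simplified from Spark's source; the class name `RangeDep` here is a stand-in, not Spark's actual class):

```scala
// Sketch of RangeDependency's partition mapping (simplified).
// A RangeDependency maps the contiguous child-partition range
// [outStart, outStart + length) one-to-one onto the contiguous
// parent-partition range [inStart, inStart + length).
class RangeDep(inStart: Int, outStart: Int, length: Int) {
  // Returns the parent partition index for a given child partition,
  // or an empty Seq if the child partition is outside this range.
  def getParents(partitionId: Int): Seq[Int] =
    if (partitionId >= outStart && partitionId < outStart + length)
      List(partitionId - outStart + inStart)
    else
      Nil
}
```

For example, if a union's second parent has 2 partitions starting at child position 2, then child partition 3 maps back to parent partition 1: `new RangeDep(0, 2, 2).getParents(3)` yields `List(1)`.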
More resources
- Spark source code analysis technical share (bilibili video series): https://www.bilibili.com/video/av37442139/
- github: https://github.com/opensourceteams/spark-scala-maven
- CSDN (video series, watch online): https://blog.csdn.net/thinktothings/article/details/84726769
Video demo
- https://youtu.be/_4DeWWPQubc (YouTube video)
- https://www.bilibili.com/video/av37442139/?p=2 (bilibili video)
Input data
c.txt:

```
a bc
a
```

a.txt:

```
a b
c a
```
Scala driver program
```scala
package com.opensource.bigdata.spark.local.rdd.operation.dependency.narrow.n_02_RangeDependency

import com.opensource.bigdata.spark.local.rdd.operation.base.BaseScalaSparkContext

object Run3 extends BaseScalaSparkContext {

  def main(args: Array[String]): Unit = {
    val sc = pre()
    // Read each input file as an RDD with 2 partitions
    val rdd1 = sc.textFile("/opt/data/2/c.txt", 2)
    val rdd2 = sc.textFile("/opt/data/2/a.txt", 2)
    // union builds a UnionRDD: each parent RDD contributes one
    // RangeDependency covering its slice of the child's partitions
    val rdd3 = rdd1.union(rdd2)
    println(rdd3.collect().mkString("\n"))
    sc.stop()
  }
}
```
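In the program above, `rdd1.union(rdd2)` produces a child RDD with 4 partitions: partitions 0-1 come from `rdd1` and partitions 2-3 from `rdd2`, each parent covered by its own RangeDependency. The following sketch (hypothetical helper names, simplified from how `UnionRDD` lays out its dependencies) shows how those ranges are computed:

```scala
// Simplified sketch of UnionRDD-style dependency layout.
// For each parent RDD, the child partitions [pos, pos + n) map
// one-to-one onto that parent's partitions [0, n).
case class RangeDepInfo(parentIndex: Int, inStart: Int, outStart: Int, length: Int)

def unionDeps(parentPartitionCounts: Seq[Int]): Seq[RangeDepInfo] = {
  var pos = 0
  parentPartitionCounts.zipWithIndex.map { case (n, i) =>
    val dep = RangeDepInfo(parentIndex = i, inStart = 0, outStart = pos, length = n)
    pos += n // next parent's range starts where this one ends
    dep
  }
}
```

With two parents of 2 partitions each, as in the program, `unionDeps(Seq(2, 2))` yields ranges `(inStart = 0, outStart = 0, length = 2)` for `rdd1` and `(inStart = 0, outStart = 2, length = 2)` for `rdd2`.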