Spark PruneDependency 依賴關係 Filter

Spark PruneDependency 依賴關係 Filter

  • Represents a dependency between the PartitionPruningRDD and its parent. In this case, the child RDD contains a subset of partitions of the parents'.

更多資源

youtub視頻演示

<iframe src="//player.bilibili.com/player.html?aid=37442139&cid=65822402&page=3" scrolling="no" border="0" frameborder="no" framespacing="0" allowfullscreen="true"> </iframe>

輸入數據

List(("a",2),("d",1),("b",8),("d",3)

處理程序scala

package com.opensource.bigdata.spark.local.rdd.operation.dependency.narrow.n_03_pruneDependency.n_03_filterByRange_filter

import com.opensource.bigdata.spark.local.rdd.operation.base.BaseScalaSparkContext

object Run extends BaseScalaSparkContext{

  def main(args: Array[String]): Unit = {

    val sc = pre()
    val rdd1 = sc.parallelize(List(("a",2),("d",1),("b",8),("d",3)),2)  //ParallelCollectionRDD
    val rdd2 =rdd1.filterByRange("a","b")  //MapParttionsRDD

    println("rdd \n" + rdd2.collect().mkString("\n"))

    sc.stop()
  }

}


數據處理圖

PruneDependency依賴關係圖

發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章