Blog: http://blog.csdn.net/yueqian_zhu/
The shuffle read flow, like the write flow, starts from the compute method (here, ShuffledRDD.compute):
override def compute(split: Partition, context: TaskContext): Iterator[(K, C)] = {
  val dep = dependencies.head.asInstanceOf[ShuffleDependency[K, V, C]]
  SparkEnv.get.shuffleManager.getReader(dep.shuffleHandle, split.index, split.index + 1, context)
    .read()
    .asInstanceOf[Iterator[(K, C)]]
}
At this point, whether the shuffleManager is SortShuffleManager or HashShuffleManager, getReader returns a HashShuffleReader.
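For context, here is roughly what that hook looks like on the ShuffleManager trait in Spark 1.x (a simplified excerpt; it is a private[spark] internal, quoted as a reading aid rather than as user-facing API):

private[spark] trait ShuffleManager {
  // Called on reduce tasks to get a reader covering partitions
  // [startPartition, endPartition); compute above asks for exactly one
  // partition via (split.index, split.index + 1).
  def getReader[K, C](
      handle: ShuffleHandle,
      startPartition: Int,
      endPartition: Int,
      context: TaskContext): ShuffleReader[K, C]
}

// In Spark 1.x both HashShuffleManager and SortShuffleManager implement it
// along the lines of:
//   new HashShuffleReader(handle.asInstanceOf[BaseShuffleHandle[K, _, C]],
//     startPartition, endPartition, context)

Next, its read method is called: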
/** Read the combined key-values for this reduce task */
override def read(): Iterator[Product2[K, C]] = {
  val ser = Serializer.getSerializer(dep.serializer)
  val iter = BlockStoreShuffleFetcher.fetch(handle.shuffleId, startPartition, context, ser)

  val aggregatedIter: Iterator[Product2[K, C]] = if (dep.aggregator.isDefined) {
    if (dep.mapSideCombine) {
      new InterruptibleIterator(context, dep.aggregator.get.combineCombinersByKey(iter, context))
    } else {
      new InterruptibleIterator(context, dep.aggregator.get.combineValuesByKey(iter, context))
    }
  } else {
    require(!dep.mapSideCombine, "Map-side combine without Aggregator specified!")
    // Convert the Product2s to pairs since this is what downstream RDDs currently expect
    iter.asInstanceOf[Iterator[Product2[K, C]]].map(pair => (pair._1, pair._2))
  }

  // Sort the output if there is a sort ordering defined.
  dep.keyOrdering match {
    case Some(keyOrd: Ordering[K]) =>
      // Create an ExternalSorter to sort the data. Note that if spark.shuffle.spill is disabled,
      // the ExternalSorter won't spill to disk.
      val sorter = new ExternalSorter[K, C, C](ordering = Some(keyOrd), serializer = Some(ser))
      sorter.insertAll(aggregatedIter)
      context.taskMetrics.incMemoryBytesSpilled(sorter.memoryBytesSpilled)
      context.taskMetrics.incDiskBytesSpilled(sorter.diskBytesSpilled)
      sorter.iterator
    case None =>
      aggregatedIter
  }
}
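As a side note on which operators exercise which branch (based on the Spark 1.x RDD API; the exact wiring can differ across versions): reduceByKey and combineByKey install an aggregator with mapSideCombine = true, groupByKey installs one with mapSideCombine = false, and sortByKey installs only a keyOrdering, roughly as in OrderedRDDFunctions:

// Simplified excerpt of how sortByKey wires keyOrdering into the ShuffledRDD
// (Spark 1.x OrderedRDDFunctions; the RangePartitioner picks range bounds so
// that the partitions are globally ordered):
val part = new RangePartitioner(numPartitions, self, ascending)
new ShuffledRDD[K, V, V](self, part)
  .setKeyOrdering(if (ascending) ordering else ordering.reverse)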
This method first calls fetch; a few notes on what fetch does:
1. As covered in the section on task execution, when a ShuffleMapTask finishes, the mapping from its shuffleId to its MapStatus is registered with the MapOutputTracker.
2. fetch first looks in the local mapStatuses cache for this shuffleId; on a hit it uses the local copy, otherwise it asks the master's MapOutputTracker, obtaining for each map output the block manager address and the length of this partition's file segment (a toy model of this lookup follows the list).
3. With the shuffleId and those addresses, it then reads the blocks, remotely or locally, over netty/nio and returns an iterator.
4. The data behind the returned iterator is not all in memory at once: blocks are fetched up to a configured cap on in-flight bytes (spark.reducer.maxMbInFlight in Spark 1.x), and the next block is only requested once there is room under the cap.
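To make step 2 concrete, here is a toy model of the lookup (my own stand-ins, not Spark source: BlockManagerId and ShuffleBlockId mimic the real block ids, and askMaster stands in for the round trip to the driver-side MapOutputTrackerMaster):

import scala.collection.mutable

case class BlockManagerId(host: String, port: Int)
case class ShuffleBlockId(shuffleId: Int, mapId: Int, reduceId: Int)

object FetchSketch {
  // Locally cached statuses: shuffleId -> per-map (location, size of this reduce partition)
  private val mapStatuses = mutable.Map[Int, Array[(BlockManagerId, Long)]]()

  // Stand-in for the network round trip to the master's tracker
  private def askMaster(shuffleId: Int): Array[(BlockManagerId, Long)] =
    Array((BlockManagerId("worker-1", 7337), 1024L),
          (BlockManagerId("worker-2", 7337), 2048L))

  def getServerStatuses(shuffleId: Int): Array[(BlockManagerId, Long)] =
    mapStatuses.getOrElseUpdate(shuffleId, askMaster(shuffleId)) // local hit, else ask master

  // Turn the statuses into concrete block ids to fetch, grouped by address so
  // that each remote block manager can be contacted in batches.
  def blocksByAddress(shuffleId: Int, reduceId: Int): Map[BlockManagerId, Seq[ShuffleBlockId]] =
    getServerStatuses(shuffleId).zipWithIndex
      .groupBy { case ((addr, _), _) => addr }
      .map { case (addr, arr) =>
        addr -> arr.toSeq.map { case (_, mapId) => ShuffleBlockId(shuffleId, mapId, reduceId) }
      }
}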
After fetch returns the iterator, mapSideCombine determines how the fetched records are merged: if the map side already combined (mapSideCombine = true), the read side merges partial combiners with combineCombinersByKey; otherwise it builds combiners from raw values with combineValuesByKey. The merge itself works like the write flow: when the in-memory map cannot hold everything, it spills to local disk. A minimal illustration of the two paths follows.
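This plain-Scala sketch shows only the semantics of the two combine paths; it keeps everything in memory, whereas Spark's Aggregator pushes the same merge functions through an ExternalAppendOnlyMap that can spill:

// In-memory illustration of Aggregator's two combine paths (not Spark code)
def combineValuesByKey[K, V, C](iter: Iterator[(K, V)],
                                createCombiner: V => C,
                                mergeValue: (C, V) => C): Map[K, C] =
  iter.foldLeft(Map.empty[K, C]) { case (m, (k, v)) =>
    m.updated(k, m.get(k) match {
      case Some(c) => mergeValue(c, v)   // key seen before: merge the new value in
      case None    => createCombiner(v)  // first value for this key
    })
  }

def combineCombinersByKey[K, C](iter: Iterator[(K, C)],
                                mergeCombiners: (C, C) => C): Map[K, C] =
  iter.foldLeft(Map.empty[K, C]) { case (m, (k, c)) =>
    m.updated(k, m.get(k).map(mergeCombiners(_, c)).getOrElse(c))
  }

// Word count: with mapSideCombine the read side receives partial counts and
// only needs mergeCombiners; without it, it receives raw values and also
// needs createCombiner / mergeValue.
val partials = Iterator(("a", 3), ("b", 1), ("a", 2))
val counts = combineCombinersByKey(partials, (x: Int, y: Int) => x + y) // Map(a -> 5, b -> 1)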
If keyOrdering is also defined, a new ExternalSorter performs an external sort, again through the same insertAll used in the shuffle write flow. A toy sketch of the external-sort idea (spill sorted runs, then merge) is below.
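To make that idea concrete, here is a self-contained toy (my own sketch, not Spark code): it buffers records, spills a sorted run to a temp file whenever the buffer fills, and merge-sorts all runs on iteration. Spark's ExternalSorter does the same thing far more carefully, with real memory accounting, serialization, and optional aggregation during the merge:

import java.io.{File, PrintWriter}
import scala.collection.mutable.ArrayBuffer
import scala.io.Source

// Toy external sort over Int records; spillThreshold plays the role of the
// memory limit that ExternalSorter tracks for real.
class ToyExternalSorter(spillThreshold: Int = 10000) {
  private val buffer = ArrayBuffer[Int]()
  private val spills = ArrayBuffer[File]()

  def insertAll(records: Iterator[Int]): Unit = records.foreach { r =>
    buffer += r
    if (buffer.size >= spillThreshold) spill()
  }

  // Write the current buffer to disk as one sorted run
  private def spill(): Unit = {
    val f = File.createTempFile("toy-spill", ".txt")
    f.deleteOnExit()
    val out = new PrintWriter(f)
    try buffer.sorted.foreach(out.println) finally out.close()
    spills += f
    buffer.clear()
  }

  // k-way merge of the on-disk runs plus the remaining in-memory run
  def iterator: Iterator[Int] = {
    val runs = spills.map(f => Source.fromFile(f).getLines().map(_.toInt)) :+
      buffer.sorted.iterator
    val heads = runs.map(_.buffered)
    Iterator.continually {
      val nonEmpty = heads.filter(_.hasNext)
      if (nonEmpty.isEmpty) None else Some(nonEmpty.minBy(_.head).next())
    }.takeWhile(_.isDefined).map(_.get)
  }
}

// Usage: insertAll may spill several runs, yet iterator still yields a
// globally sorted stream.
//   val sorter = new ToyExternalSorter(spillThreshold = 4)
//   sorter.insertAll(Iterator(9, 1, 7, 3, 8, 2, 6))
//   sorter.iterator.foreach(println) // 1 2 3 6 7 8 9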