Following the previous post, 1-SparkContext, let's take a first look at JobProgressListener.
The package containing JobProgressListener's Scala source file:
package org.apache.spark.ui.jobs
And the class comment at the top of the file:
/**
* :: DeveloperApi ::
* Tracks task-level information to be displayed in the UI.
*
* All access to the data structures in this class must be synchronized on the
* class, since the UI thread and the EventBus loop may otherwise be reading and
* updating the internal data structures concurrently.
*/
@DeveloperApi
class JobProgressListener(conf: SparkConf) extends SparkListener with Logging {
......
From this we can see that the class mainly serves the UI. Its rough flow can be inferred: after the ListenerBus receives job-related messages or events (down to the task level), it calls the relevant JobProgressListener APIs to update job state (covering jobs, stages, tasks, executor metrics, the BlockManager, and the application).
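The class comment above stresses that every access to the internal data structures must be synchronized on the class, because the UI thread and the EventBus loop may otherwise read and update them concurrently. A minimal sketch of that pattern (toy class and method names, not Spark's actual code):

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration of JobProgressListener's locking discipline: writers
// (the event-bus loop) and readers (the UI thread) both synchronize on
// the listener instance, so the map is never observed mid-update.
class ToyProgressListener {
    private final Map<Integer, String> jobStatus = new HashMap<>();

    // Called from the event-bus loop when a job starts.
    public synchronized void onJobStart(int jobId) {
        jobStatus.put(jobId, "RUNNING");
    }

    // Called from the event-bus loop when a job finishes.
    public synchronized void onJobEnd(int jobId) {
        jobStatus.put(jobId, "SUCCEEDED");
    }

    // Called from the "UI thread": reads under the same lock.
    public synchronized String status(int jobId) {
        return jobStatus.getOrDefault(jobId, "UNKNOWN");
    }
}
```

The key point is that a single monitor (the listener itself) guards every read and write, which is exactly what "synchronized on the class" in the comment demands.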
The interfaces it provides are as follows (copied from the source; see also the official API docs):
override def onJobStart(jobStart: SparkListenerJobStart): Unit
override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit
override def onStageCompleted(stageCompleted: SparkListenerStageCompleted): Unit
override def onStageSubmitted(stageSubmitted: SparkListenerStageSubmitted): Unit
override def onTaskStart(taskStart: SparkListenerTaskStart): Unit
override def onTaskGettingResult(taskGettingResult: SparkListenerTaskGettingResult)
override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit
def updateAggregateMetrics(
    stageData: StageUIData,
    execId: String,
    taskMetrics: TaskMetrics,
    oldMetrics: Option[TaskMetricsUIData])
override def onExecutorMetricsUpdate(executorMetricsUpdate: SparkListenerExecutorMetricsUpdate)
override def onEnvironmentUpdate(environmentUpdate: SparkListenerEnvironmentUpdate)
override def onBlockManagerAdded(blockManagerAdded: SparkListenerBlockManagerAdded)
override def onBlockManagerRemoved(blockManagerRemoved: SparkListenerBlockManagerRemoved)
override def onApplicationStart(appStarted: SparkListenerApplicationStart)
override def onApplicationEnd(appEnded: SparkListenerApplicationEnd)
Some questions that came up while reading this Scala source file, which I want to look into along the way:
- The Java annotation @DeveloperApi
package org.apache.spark.annotation;
import java.lang.annotation.*;
/**
* A lower-level, unstable API intended for developers.
*
* Developer API's might change or be removed in minor versions of Spark.
*
* NOTE: If there exists a Scaladoc comment that immediately precedes this annotation, the first
* line of the comment must be ":: DeveloperApi ::" with no trailing blank line. This is because
* of the known issue that Scaladoc displays only either the annotation or the comment, whichever
* comes first.
*/
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.FIELD, ElementType.METHOD, ElementType.PARAMETER,
        ElementType.CONSTRUCTOR, ElementType.LOCAL_VARIABLE, ElementType.PACKAGE})
public @interface DeveloperApi {}
There are three retention policies. With the RUNTIME policy, the annotation is recorded in the class file by the compiler and retained by the VM at run time, so the annotated target can be read via reflection:
package java.lang.annotation;
/**
* Annotation retention policy. The constants of this enumerated type
* describe the various policies for retaining annotations. They are used
* in conjunction with the {@link Retention} meta-annotation type to specify
* how long annotations are to be retained.
*
* @author Joshua Bloch
* @since 1.5
*/
public enum RetentionPolicy {
    /**
     * Annotations are to be discarded by the compiler.
     */
    SOURCE,

    /**
     * Annotations are to be recorded in the class file by the compiler
     * but need not be retained by the VM at run time. This is the default
     * behavior.
     */
    CLASS,

    /**
     * Annotations are to be recorded in the class file by the compiler and
     * retained by the VM at run time, so they may be read reflectively.
     *
     * @see java.lang.reflect.AnnotatedElement
     */
    RUNTIME
}
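To see what RUNTIME retention buys us, here is a small self-contained sketch (the annotation name UnstableApi is made up for illustration, it is not Spark's): because the annotation is retained by the VM, reflection can discover it at run time, which is exactly how a RUNTIME-retained marker like @DeveloperApi can be inspected programmatically.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// A hypothetical marker annotation, retained at run time like @DeveloperApi.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface UnstableApi {}

// A class carrying the marker.
@UnstableApi
class ExperimentalThing {}

class RetentionDemo {
    // Reflection sees the annotation only because of RUNTIME retention;
    // with SOURCE or CLASS retention this would always return false.
    static boolean isUnstable(Class<?> c) {
        return c.isAnnotationPresent(UnstableApi.class);
    }
}
```

Had the annotation been declared with RetentionPolicy.SOURCE or CLASS instead, it would be absent from the VM's view of the class and isAnnotationPresent would return false.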
- A point of confusion, noted here for now
In the while loop of the drop function in IterableLike.scala (scalaVersion := "2.10.3"), does the result of it.next need to be assigned back to it? Anyone who knows is welcome to comment.
override /*TraversableLike*/ def drop(n: Int): Repr = {
  val b = newBuilder
  val lo = math.max(0, n)
  b.sizeHint(this, -lo)
  var i = 0
  val it = iterator
  while (i < n && it.hasNext) {
    it.next
    i += 1
  }
  (b ++= it).result
}
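As far as I understand, no reassignment is needed: a Scala Iterator, like Java's, is a mutable cursor, and each call to next() advances its internal position as a side effect. The loop calls it.next purely for that side effect, discarding the first n elements, and (b ++= it) then collects whatever remains. The same pattern in Java (a sketch mimicking drop, not the actual library code):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class DropDemo {
    // Mimics IterableLike.drop: advance the iterator n times, discarding
    // each element, then collect the rest. `it` is never reassigned because
    // next() mutates the iterator's internal position.
    static List<Integer> drop(List<Integer> xs, int n) {
        Iterator<Integer> it = xs.iterator();
        int i = 0;
        while (i < n && it.hasNext()) {
            it.next();   // side effect only: moves the cursor forward
            i++;
        }
        List<Integer> rest = new ArrayList<>();
        it.forEachRemaining(rest::add);
        return rest;
    }
}
```

If next() returned a fresh iterator instead of mutating state, the Scala loop above would indeed have to rebind it, but that is not how Iterator works in either language.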