YARN executor launch context
env:
CLASSPATH -> {{PWD}}<CPS>{{PWD}}/__spark_conf__<CPS>{{PWD}}/__spark_libs__/*<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/*<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/lib/*<CPS>$HADOOP_HDFS_HOME/share/hadoop/hdfs/*<CPS>$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*<CPS>$HADOOP_YARN_HOME/share/hadoop/yarn/*<CPS>$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*
SPARK_YARN_STAGING_DIR -> file:/Users/User/.sparkStaging/application_1479802900835_0001
SPARK_USER -> User
SPARK_YARN_MODE -> true
command:
{{JAVA_HOME}}/bin/java \
-server \
-Xmx2048m \
-Djava.io.tmpdir={{PWD}}/tmp \
-Dspark.yarn.app.container.log.dir=<LOG_DIR> \
-XX:OnOutOfMemoryError='kill %p' \
org.apache.spark.executor.CoarseGrainedExecutorBackend \
--driver-url \
spark://[email protected]:53279 \
--executor-id \
<executorId> \
--hostname \
<hostname> \
--cores \
1 \
--app-id \
application_1479802900835_0001 \
--user-class-path \
file:$PWD/__app__.jar \
1><LOG_DIR>/stdout \
2><LOG_DIR>/stderr
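The `{{PWD}}` tokens and `<CPS>` above are placeholders that YARN substitutes when it materialises the container's environment; `<CPS>` becomes the platform classpath separator. A toy sketch of that substitution (the template and path here are made up, not taken from a real launch context):

```scala
// Toy illustration of placeholder expansion in a YARN launch context.
// YARN performs this substitution itself; this sketch only shows the idea.
object LaunchContextSketch {
  // ":" on Linux/macOS, ";" on Windows
  val cps: String = java.io.File.pathSeparator

  // Replace the two placeholder kinds seen in the CLASSPATH entry above.
  def expand(template: String, pwd: String): String =
    template.replace("{{PWD}}", pwd).replace("<CPS>", cps)
}
```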
ApplicationMaster
if (isClusterMode) {
  runDriver(securityMgr)
} else {
  runExecutorLauncher(securityMgr)
}
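The branch above hinges on how the AM decides it is in cluster mode. A minimal sketch of that decision, assuming (as in Spark's ApplicationMaster) that the presence of a user class signals cluster mode; `AmArguments` and `run` are simplified stand-ins, not the real API:

```scala
// Hedged sketch: cluster mode iff a user class was passed to the AM.
object AmModeSketch {
  // Hypothetical simplified arguments holder (the real one has many fields).
  final case class AmArguments(userClass: Option[String])

  def isClusterMode(args: AmArguments): Boolean = args.userClass.isDefined

  // Returns which branch the AM would take.
  def run(args: AmArguments): String =
    if (isClusterMode(args)) "runDriver"       // driver runs inside the AM
    else "runExecutorLauncher"                 // driver runs on the client
}
```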
runDriver (Cluster mode)
- startUserApplication(): starts the user class in a new thread
- runAMEndpoint: creates the AM endpoint and returns a reference to the driver endpoint
- registerAM:
  - registers the AM with the RM, returning a YarnAllocator
  - allocator.allocateResources()
- userClassThread.join()
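The steps above can be sketched as a sequence of calls; the method names mirror the notes, but the bodies are stand-ins, not Spark's real implementations:

```scala
import scala.collection.mutable.ListBuffer

// Hedged sketch of the runDriver control flow described in the notes.
object RunDriverSketch {
  val trace = ListBuffer.empty[String]

  // 1. start the user class (the driver program) in a new thread
  def startUserApplication(): Thread = {
    val t = new Thread(() => trace.synchronized { trace += "userClass" })
    t.setName("Driver")
    t.start()
    t
  }
  // 2. register the AM with the RM (returns a YarnAllocator in Spark)
  def registerAM(): Unit = trace.synchronized { trace += "registerAM" }
  // 3. request executor containers from the allocator
  def allocateResources(): Unit = trace.synchronized { trace += "allocate" }

  def runDriver(): Unit = {
    val userClassThread = startUserApplication()
    registerAM()
    allocateResources()
    userClassThread.join() // 4. block until the user class finishes
  }
}
```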
runExecutorLauncher (Client mode)
RpcEndpoint
- Lifecycle:
  constructor -> onStart -> receive* -> onStop
  The receive methods may be invoked concurrently (only a ThreadSafeRpcEndpoint guarantees serial delivery).
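A minimal sketch of that lifecycle contract (not Spark's actual RpcEndpoint trait); the driver function here delivers messages serially, whereas Spark may dispatch to a plain endpoint concurrently:

```scala
// Sketch of the lifecycle: constructor -> onStart -> receive* -> onStop.
trait EndpointSketch {
  def onStart(): Unit = {}
  def receive: PartialFunction[Any, Unit]
  def onStop(): Unit = {}
}

object LifecycleDemo {
  // Drives one endpoint through its lifecycle, recording the call order.
  def run(messages: Seq[Any]): List[String] = {
    val log = scala.collection.mutable.ListBuffer.empty[String]
    val ep = new EndpointSketch {
      override def onStart(): Unit = log += "onStart"
      def receive = { case msg => log += s"receive:$msg" }
      override def onStop(): Unit = log += "onStop"
    }
    ep.onStart()
    messages.foreach(ep.receive) // receive* (serial here for determinism)
    ep.onStop()
    log.toList
  }
}
```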
DriverEndpoint
- onStart: sends ReviveOffers to itself to drive subsequent work
- receive:
  - StatusUpdate
  - ReviveOffers => launch tasks
  - KillTask
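The dispatch above can be sketched as a pattern match; the case-class shapes and handler bodies are illustrative placeholders, not DriverEndpoint's real scheduling logic:

```scala
// Hedged sketch of DriverEndpoint's receive dispatch from the notes.
object DriverEndpointSketch {
  sealed trait Msg
  case class StatusUpdate(execId: String, taskId: Long, state: String) extends Msg
  case object ReviveOffers extends Msg
  case class KillTask(taskId: Long) extends Msg

  def handle(m: Msg): String = m match {
    case StatusUpdate(e, t, s) => s"update $e/$t -> $s"   // record state, free the core
    case ReviveOffers          => "make offers, launch tasks"
    case KillTask(t)           => s"kill $t"
  }
}
```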
CoarseGrainedExecutorBackend
- onStart: connects to the driver and sends RegisterExecutor
- receive:
  - RegisteredExecutor: registration acknowledged by the driver; create the Executor
  - RegisterExecutorFailed: exit the executor
  - LaunchTask: run the task on the executor
  - KillTask: kill the task on the executor
  - StopExecutor: sends a Shutdown message to itself
  - Shutdown: stop the executor
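The executor-side messages can be sketched the same way; the message names follow the notes, and the handler bodies are stand-ins for the real behaviour:

```scala
// Hedged sketch of the executor backend's receive dispatch from the notes.
object ExecutorBackendSketch {
  sealed trait Msg
  case object RegisteredExecutor extends Msg
  case class RegisterExecutorFailed(reason: String) extends Msg
  case class LaunchTask(taskId: Long) extends Msg
  case class KillTask(taskId: Long) extends Msg
  case object StopExecutor extends Msg
  case object Shutdown extends Msg

  def handle(m: Msg): String = m match {
    case RegisteredExecutor        => "create Executor"
    case RegisterExecutorFailed(r) => s"exit: $r"
    case LaunchTask(t)             => s"run task $t"
    case KillTask(t)               => s"kill task $t"
    case StopExecutor              => "self-send Shutdown" // just enqueues Shutdown
    case Shutdown                  => "stop executor"
  }
}
```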