The previous posts analyzed Spark's job trigger mechanism, the DAGScheduler, the TaskScheduler, the SchedulerBackend used for scheduler communication, and Executor startup in detail. After going through all that source code, the overall picture was still muddled: it was hard to see how the individual mechanisms actually play out, and the analyses available online are not much clearer. Hence this walkthrough of the Spark task submission flow. The source analyzed in this post is the Scala 2.12 build of Spark, and the analysis assumes the Standalone Cluster deploy mode.
Spark's Task Submission Process
1. Overview
This section gives an overall view of how Spark submits an application. A Spark application can be launched in Client mode or Cluster mode. The difference is that in Client mode the Driver starts on the node where the spark-submit command is executed, while in Cluster mode the Master picks a suitable Worker and starts the Driver there through DriverWrapper.
The overall submission flow is roughly as follows:
- Running spark-submit invokes the SparkSubmit class, which calls Client via reflection. Client communicates with the Master to submit the Driver and, after receiving a success reply, exits the JVM (the SparkSubmit process terminates).
- On receiving the submission, the Master picks a Worker that can satisfy the Driver's resource requirements and sends that Worker a launch-driver message. The Worker assembles a Linux command from the Driver description, uses it to start DriverWrapper, which in turn starts the Driver, and finally reports the Driver's execution state back to the Master.
- Once the Driver is running, the application is registered: during SparkContext initialization an AppClient is created, which contacts the Master to request application registration.
- The Master then schedules the application: its scheduling algorithm determines which Workers should host executors, and it sends those Workers launch-executor messages.
- Each such Worker assembles a Linux command and starts a CoarseGrainedExecutorBackend process, which registers its Executor with the Driver; on successful registration an Executor object is created inside CoarseGrainedExecutorBackend. Job execution then follows.
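The driver-submission handshake in these steps can be reduced to a toy sketch. The case-class names mirror Spark's real RPC messages, but ToyMaster and its logic are simplified stand-ins for illustration, not Spark's implementation:

```scala
// Simplified stand-ins for Spark's driver-submission RPC messages (illustration only).
sealed trait Message
case class RequestSubmitDriver(mainClass: String) extends Message
case class SubmitDriverResponse(success: Boolean, driverId: Option[String]) extends Message

// A toy "Master" that accepts a driver submission only while it is alive,
// mirroring the ALIVE-state check in the real Master.
class ToyMaster(alive: Boolean) {
  private var nextId = 0
  def receive(msg: Message): Message = msg match {
    case RequestSubmitDriver(_) if alive =>
      nextId += 1
      SubmitDriverResponse(success = true, Some(s"driver-$nextId"))
    case RequestSubmitDriver(_) =>
      SubmitDriverResponse(success = false, None)
  }
}
```

In the real protocol the reply travels back over RPC, and the client exits its JVM once `success` is true.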
2. Requesting Driver Creation
We usually submit a Spark application by running the spark-submit command from a shell; the script is invoked as follows:
./bin/spark-submit \
--class <main-class> \
--master <master-url> \
--deploy-mode <deploy-mode> \
--conf <key>=<value> \
... # other options
<application-jar> \
[application-arguments]
Some commonly used options are:
--class: the entry point of the application, i.e. the class containing the main function (e.g. org.apache.spark.examples.SparkPi)
--master: the master URL of the cluster (e.g. spark://23.195.26.187:7077)
--deploy-mode: whether to deploy the driver on a worker node (cluster) or locally as an external client (client) (default: client)
--conf: an arbitrary Spark configuration property in key=value format. For values that contain spaces, wrap "key=value" in quotes (see the examples).
application-jar: path to a bundled jar containing the application and all of its dependencies. The URL must be globally visible inside the cluster, e.g. an hdfs:// path or a file:// path that is present on every node.
application-arguments: arguments passed to the main method of the main class, if any.
A common deployment strategy is to submit the application from a gateway machine that is physically co-located with the worker machines (for instance, the master node of a standalone EC2 cluster). In that setup, client mode is appropriate: the driver is launched directly inside the spark-submit process, which acts as a client to the cluster, and the application's input and output are attached to the console. This mode is therefore particularly suited to applications that involve a REPL (such as the Spark shell).
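As a concrete illustration of cluster mode, a submission might look like this (the master host, port, and jar path are placeholders, not values taken from this post):

```shell
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://master-host:7077 \
  --deploy-mode cluster \
  --driver-memory 1g \
  --conf spark.driver.cores=1 \
  /path/to/spark-examples.jar 100
```

With --deploy-mode cluster, the driver for SparkPi is started on one of the cluster's workers rather than inside the local spark-submit process.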
Looking at the spark-submit script shows that it simply runs Spark's org.apache.spark.deploy.SparkSubmit class with the supplied arguments. We therefore start the analysis at SparkSubmit's main function, whose key code is:
override def main(args: Array[String]): Unit = {
val submit = new SparkSubmit() {
self =>..........
}
submit.doSubmit(args)
}
As the code shows, it first creates an instance of SparkSubmit and then calls doSubmit, passing along the arguments we set. The relevant code of doSubmit is:
def doSubmit(args: Array[String]): Unit = {
// Initialize logging if it hasn't been done yet. Keep track of whether logging needs to be reset before the application starts.
val uninitLog = initializeLogIfNecessary(true, silent = true)
// Parse the supplied arguments and wrap them in a SparkSubmitArguments object
val appArgs = parseArguments(args)
if (appArgs.verbose) {
logInfo(appArgs.toString)
}
appArgs.action match { // dispatch on the requested action type
case SparkSubmitAction.SUBMIT => submit(appArgs, uninitLog)
case SparkSubmitAction.KILL => kill(appArgs)
case SparkSubmitAction.REQUEST_STATUS => requestStatus(appArgs)
case SparkSubmitAction.PRINT_VERSION => printVersion()
}
}
The key part of the code above is that each action type triggers a different operation. Since we are submitting an application, SparkSubmitAction.SUBMIT matches and the submit method is called. Its key code is:
private def submit(args: SparkSubmitArguments, uninitLog: Boolean): Unit = {
def doRunMain(): Unit = {
if (args.proxyUser != null) { // check whether a proxy user was specified
val proxyUser = UserGroupInformation.createProxyUser(args.proxyUser,
UserGroupInformation.getCurrentUser())
try {
proxyUser.doAs(new PrivilegedExceptionAction[Unit]() { // run the program as the proxy user; permissions may be insufficient
override def run(): Unit = {
runMain(args, uninitLog)
}
})
} catch {
..........
}
} else {
runMain(args, uninitLog)
}
}
if (args.isStandaloneCluster && args.useRest) { // Standalone Cluster deploy mode with REST-based submission
try {
logInfo("Running Spark using the REST application submission protocol.")
doRunMain()
} catch {
................
}
// In all other modes, just run the main class as prepared
} else {
doRunMain()
}
}
As shown above, submit first checks the deploy mode, but every path ends up calling the nested doRunMain method. doRunMain first deals with user permissions: if a proxy user is set, the task runs under that user's identity; otherwise it runs normally. Both branches call runMain, whose key code is:
private def runMain(args: SparkSubmitArguments, uninitLog: Boolean): Unit = {
// Prepare the environment for submitting the application based on the supplied arguments
val (childArgs, childClasspath, sparkConf, childMainClass) = prepareSubmitEnvironment(args)
...................
val loader = getSubmitClassLoader(sparkConf)
// Add jars to the classpath
for (jar <- childClasspath) {
addJarToClasspath(jar, loader)
}
var mainClass: Class[_] = null
try {
// Load the application class via reflection
mainClass = Utils.classForName(childMainClass)
} catch {
.............
}
// Instantiate the class just loaded. The concrete class differs per deploy mode, but it is always a subclass of SparkApplication
val app: SparkApplication = if (classOf[SparkApplication].isAssignableFrom(mainClass)) {
mainClass.getConstructor().newInstance().asInstanceOf[SparkApplication]
} else {
new JavaMainApplication(mainClass)
}
................
try {
// Call start to launch the application
app.start(childArgs.toArray, sparkConf)
} catch {
.............
}
}
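The isAssignableFrom-then-newInstance pattern used in runMain can be demonstrated with plain JDK types; here java.util.List stands in for SparkApplication (a hypothetical helper for illustration, not Spark code):

```scala
// Mirrors runMain: if the reflectively loaded class implements the expected
// interface, instantiate it via its no-arg constructor; otherwise decline
// (Spark would instead wrap the class in a JavaMainApplication).
def instantiateIfList(mainClass: Class[_]): Option[java.util.List[_]] =
  if (classOf[java.util.List[_]].isAssignableFrom(mainClass))
    Some(mainClass.getConstructor().newInstance().asInstanceOf[java.util.List[_]])
  else None
```

The same two reflective calls, Class.getConstructor() and Constructor.newInstance(), are what create the ClientApp instance in the Standalone Cluster path.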
runMain first calls prepareSubmitEnvironment to obtain the parameters needed for submission, among them childMainClass, the main class to run; which class is chosen depends on the deploy mode. Since this post assumes Standalone Cluster mode, here is the assignment of childMainClass for that mode inside prepareSubmitEnvironment:
private[deploy] val REST_CLUSTER_SUBMIT_CLASS = classOf[RestSubmissionClientApp].getName()
private[deploy] val STANDALONE_CLUSTER_SUBMIT_CLASS = classOf[ClientApp].getName()
if (args.isStandaloneCluster) {
if (args.useRest) {
childMainClass = REST_CLUSTER_SUBMIT_CLASS
childArgs += (args.primaryResource, args.mainClass)
} else {
// In legacy standalone cluster mode, use Client as a wrapper around the user class
childMainClass = STANDALONE_CLUSTER_SUBMIT_CLASS
if (args.supervise) { childArgs += "--supervise" }
Option(args.driverMemory).foreach { m => childArgs += ("--memory", m) }
Option(args.driverCores).foreach { c => childArgs += ("--cores", c) }
childArgs += "launch"
childArgs += (args.master, args.primaryResource, args.mainClass)
}
if (args.childArgs != null) {
childArgs ++= args.childArgs
}
}
The code above shows that Standalone Cluster mode itself splits into two cases, with and without REST; this post covers the non-REST path, where childMainClass is ClientApp. Back in runMain: once the submission configuration is prepared, the application class is loaded via reflection, an instance is created, and its start method is invoked to launch the application. Here is the source of ClientApp's start method:
override def start(args: Array[String], conf: SparkConf): Unit = {
// Wrap the arguments in a ClientArguments object
val driverArgs = new ClientArguments(args)
// Set the RPC ask timeout
if (!conf.contains(RPC_ASK_TIMEOUT)) {
conf.set(RPC_ASK_TIMEOUT, "10s")
}
// Log level
Logger.getRootLogger.setLevel(driverArgs.logLevel)
// Create the RPC environment
val rpcEnv =
RpcEnv.create("driverClient", Utils.localHostName(), 0, conf, new SecurityManager(conf))
// Set up and obtain references to the Master's RPC endpoints
val masterEndpoints = driverArgs.masters.map(RpcAddress.fromSparkURL).
map(rpcEnv.setupEndpointRef(_, Master.ENDPOINT_NAME))
// Create and register the client's own endpoint, ClientEndpoint
rpcEnv.setupEndpoint("client", new ClientEndpoint(rpcEnv, driverArgs, masterEndpoints, conf))
// Wait for termination
rpcEnv.awaitTermination()
}
So ClientApp.start wraps the arguments in ClientArguments, creates the RPC environment, sets up references to the Master's RPC endpoint, and finally creates and registers the client-side endpoint ClientEndpoint. Once ClientEndpoint is created, its onStart method is invoked first:
override def onStart(): Unit = {
driverArgs.cmd match {
case "launch" =>
// Main class to execute (the wrapper that will start the driver)
val mainClass = "org.apache.spark.deploy.worker.DriverWrapper"
// Collect the configuration the Driver needs at launch
val classPathConf = config.DRIVER_CLASS_PATH.key
val classPathEntries = getProperty(classPathConf, conf).toSeq.flatMap { cp =>
cp.split(java.io.File.pathSeparator)
}
val libraryPathConf = config.DRIVER_LIBRARY_PATH.key
val libraryPathEntries = getProperty(libraryPathConf, conf).toSeq.flatMap { cp =>
cp.split(java.io.File.pathSeparator)
}
val extraJavaOptsConf = config.DRIVER_JAVA_OPTIONS.key
val extraJavaOpts = getProperty(extraJavaOptsConf, conf)
.map(Utils.splitCommandString).getOrElse(Seq.empty)
val sparkJavaOpts = Utils.sparkJavaOpts(conf)
val javaOpts = sparkJavaOpts ++ extraJavaOpts
// Assemble the Command used later to launch the Driver
val command = new Command(mainClass,
Seq("{{WORKER_URL}}", "{{USER_JAR}}", driverArgs.mainClass) ++ driverArgs.driverOptions,
sys.env, classPathEntries, libraryPathEntries, javaOpts)
val driverResourceReqs = ResourceUtils.parseResourceRequirements(conf,
config.SPARK_DRIVER_PREFIX)
// Wrap the configuration in a DriverDescription object
val driverDescription = new DriverDescription(
driverArgs.jarUrl,
driverArgs.memory,
driverArgs.cores,
driverArgs.supervise,
command,
driverResourceReqs)
// Send the message to the Master and forward the reply asynchronously to this endpoint
asyncSendToMasterAndForwardReply[SubmitDriverResponse](
// Ask the Master to submit the Driver
RequestSubmitDriver(driverDescription))
case "kill" =>
val driverId = driverArgs.driverId
asyncSendToMasterAndForwardReply[KillDriverResponse](RequestKillDriver(driverId))
}
}
As shown above, onStart sets up the main class and configuration for launching the Driver, then uses RPC to send the Master a RequestSubmitDriver message. This completes the request to create the Driver; the steps so far can be summarized in a sequence diagram. The Master-side code is analyzed in the next section.
3. Creating the Driver
The previous section traced the flow from the shell command to the Driver-creation request sent to the Master. In this section we analyze the Driver creation itself. The Master actually creates the Driver after receiving the RequestSubmitDriver message; the key code is:
override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
case RequestSubmitDriver(description) =>
if (state != RecoveryState.ALIVE) { // only a Master in ALIVE state may accept driver submissions
val msg = s"${Utils.BACKUP_STANDALONE_MASTER_PREFIX}: $state. " +
"Can only accept driver submissions in ALIVE state."
context.reply(SubmitDriverResponse(self, false, None, msg))
} else {
logInfo("Driver submitted " + description.command.mainClass)
// Wrap the Driver's configuration in a DriverInfo (creating the Driver logically)
val driver = createDriver(description)
// Persist the Driver
persistenceEngine.addDriver(driver)
// Add the new Driver to the queue awaiting resource allocation
waitingDrivers += driver
drivers.add(driver)
// Actually allocate resources
schedule()
// Reply to the client
context.reply(SubmitDriverResponse(self, true, Some(driver.id),
s"Driver successfully submitted as ${driver.id}"))
}
}
After receiving RequestSubmitDriver, the Master first checks its state: only an ALIVE Master may create Drivers. It then wraps the Driver configuration in a DriverInfo, which amounts to creating the Driver logically, persists it so that it can be recreated after a failure, and adds it to the queue of Drivers awaiting resources. Finally it calls schedule to allocate resources and returns the result to the client. The key resource-allocation code in schedule is:
private def schedule(): Unit = {
//Return immediately if the Master is not in ALIVE state
if (state != RecoveryState.ALIVE) {
return
}
//Randomly shuffle the workers so that too many drivers do not pile up on the same workers, keeping only workers in ALIVE state
val shuffledAliveWorkers = Random.shuffle(workers.toSeq.filter(_.state == WorkerState.ALIVE))
val numWorkersAlive = shuffledAliveWorkers.size
//Index of the worker a driver was last assigned to
var curPos = 0
/**
* We assign workers to each waiting driver in round-robin fashion. For each driver, we start
* from the last worker that was assigned a driver, and continue onwards until we have
* explored all alive workers.
* */
for (driver <- waitingDrivers.toList) {//iterate over a copy of the waiting-driver queue
var launched = false
var isClusterIdle = true
var numWorkersVisited = 0
while (numWorkersVisited < numWorkersAlive && !launched) {//walk the workers until the driver launches
val worker = shuffledAliveWorkers(curPos)
//whether this worker runs no drivers and no executors
isClusterIdle = worker.drivers.isEmpty && worker.executors.isEmpty
numWorkersVisited += 1
//check whether this worker's resources can accommodate the driver
if (canLaunchDriver(worker, driver.desc)) {
//acquire the resources the driver needs from this worker
val allocated = worker.acquireResources(driver.desc.resourceReqs)
//assign the acquired resources to the driver
driver.withResources(allocated)
//launch the driver
launchDriver(worker, driver)
//remove the driver from the waiting queue
waitingDrivers -= driver
//mark the launch as successful
launched = true
}
//advance the index, treating the worker list as circular
curPos = (curPos + 1) % numWorkersAlive
}
if (!launched && isClusterIdle) {
logWarning(s"Driver ${driver.id} requires more resource than any of Workers could have.")
}
}
//Launch executors on workers; not covered here
startExecutorsOnWorkers()
}
This code mirrors the resource allocation done for applications, which earlier posts covered in detail, and the inline comments explain each step, so we will not dwell on it. Once resources are allocated to the Driver, launchDriver is called to start it; its key code is:
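The round-robin walk inside schedule() can be isolated into a small self-contained sketch; ToyWorker and the free-core check below are simplified stand-ins for WorkerInfo and canLaunchDriver:

```scala
case class ToyWorker(id: String, freeCores: Int)

// Starting at curPos, walk the (already shuffled) alive workers as a circular
// list and return the first one whose free cores satisfy the driver's demand.
def pickWorker(workers: Seq[ToyWorker], curPos: Int, neededCores: Int): Option[ToyWorker] = {
  val n = workers.size
  var visited = 0
  var pos = curPos
  var chosen: Option[ToyWorker] = None
  while (visited < n && chosen.isEmpty) {
    val w = workers(pos)
    if (w.freeCores >= neededCores) chosen = Some(w)
    visited += 1
    pos = (pos + 1) % n // circular advance, as in Master.schedule
  }
  chosen
}
```

Starting each search from the position after the last assignment (rather than always from index 0) is what spreads consecutive drivers across different workers.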
private def launchDriver(worker: WorkerInfo, driver: DriverInfo): Unit = {
logInfo("Launching driver " + driver.id + " on worker " + worker.id)
// Record the driver on this worker
worker.addDriver(driver)
// Record the worker on the driver
driver.worker = Some(worker)
// Send the worker a request to launch the driver
worker.endpoint.send(LaunchDriver(driver.id, driver.desc, driver.resources))
// Set the driver state
driver.state = DriverState.RUNNING
}
launchDriver does not actually start the Driver: it merely records which worker the driver runs on and sets the driver's state, then sends the worker that was allocated the resources a LaunchDriver message. Let us now turn to the Worker side, starting with how it handles that message:
case LaunchDriver(driverId, driverDesc, resources_) =>
logInfo(s"Asked to launch driver $driverId")
// Create a DriverRunner instance
val driver = new DriverRunner(
conf,
driverId,
workDir,
sparkHome,
driverDesc.copy(command = Worker.maybeUpdateSSLSettings(driverDesc.command, conf)),
self,
workerUri,
securityMgr,
resources_)
// Record the driverId -> runner mapping
drivers(driverId) = driver
// Start the driver
driver.start()
// Update the resources used on this worker
coresUsed += driverDesc.cores
memoryUsed += driverDesc.mem
addResourcesUsed(resources_)
On receiving LaunchDriver, the worker first creates a DriverRunner object responsible for starting the driver, then calls its start method, and finally updates the worker's resource accounting. Here is DriverRunner's start method:
private[worker] def start() = {
// Start a thread that creates and manages the driver
new Thread("DriverRunner for " + driverId) {
override def run(): Unit = {
var shutdownHook: AnyRef = null
try {
// Shutdown hook used to kill the driver if the worker shuts down
shutdownHook = ShutdownHookManager.addShutdownHook { () =>
logInfo(s"Worker shutting down, killing driver $driverId")
kill()
}
// Prepare the jars the driver needs and run it
val exitCode = prepareAndRunDriver()
// Derive the final state from the exit code and whether the driver was killed
finalState = if (exitCode == 0) {
Some(DriverState.FINISHED)
} else if (killed) {
Some(DriverState.KILLED)
} else {
Some(DriverState.FAILED)
}
} catch {
case e: Exception =>
kill()
finalState = Some(DriverState.ERROR)
finalException = Some(e)
} finally {
if (shutdownHook != null) {
ShutdownHookManager.removeShutdownHook(shutdownHook)
}
}
// Notify the worker of the driver's final state and any exception
worker.send(DriverStateChanged(driverId, finalState.get, finalException))
}
}.start() // start the thread
}
DriverRunner.start spawns a thread that creates and manages the driver. Inside the thread's run method it registers a shutdown hook, then calls prepareAndRunDriver to fetch the jars the driver needs and run it. Here is prepareAndRunDriver:
private[worker] def prepareAndRunDriver(): Int = {
// Prepare the resources needed to create the driver
val driverDir = createWorkingDirectory()
val localJarFilename = downloadUserJar(driverDir)
val resourceFileOpt = prepareResourcesFile(SPARK_DRIVER_PREFIX, resources, driverDir)
def substituteVariables(argument: String): String = argument match {
case "{{WORKER_URL}}" => workerUrl
case "{{USER_JAR}}" => localJarFilename
case other => other
}
// Resource file for the driver; it will be used to load resources when the driver starts
val javaOpts = driverDesc.command.javaOpts ++ resourceFileOpt.map(f =>
Seq(s"-D${DRIVER_RESOURCES_FILE.key}=${f.getAbsolutePath}")).getOrElse(Seq.empty)
// Build the command used to launch the driver process
val builder = CommandUtils.buildProcessBuilder(driverDesc.command.copy(javaOpts = javaOpts),
securityManager, driverDesc.mem, sparkHome.getAbsolutePath, substituteVariables)
runDriver(builder, driverDir, driverDesc.supervise)
}
private def runDriver(builder: ProcessBuilder, baseDir: File, supervise: Boolean): Int = {
builder.directory(baseDir)
def initialize(process: Process): Unit = {
// Redirect stdout and stderr to files
val stdout = new File(baseDir, "stdout")
CommandUtils.redirectStream(process.getInputStream, stdout)
val stderr = new File(baseDir, "stderr")
val redactedCommand = Utils.redactCommandLineArgs(conf, builder.command.asScala)
.mkString("\"", "\" \"", "\"")
val header = "Launch Command: %s\n%s\n\n".format(redactedCommand, "=" * 40)
Files.append(header, stderr, StandardCharsets.UTF_8)
CommandUtils.redirectStream(process.getErrorStream, stderr)
}
runCommandWithRetry(ProcessBuilderLike(builder), initialize, supervise)
}
private[worker] def runCommandWithRetry(
command: ProcessBuilderLike, initialize: Process => Unit, supervise: Boolean): Int = {
var exitCode = -1
// Time to wait between submission retries.
var waitSeconds = 1
// A run of this many seconds resets the exponential back-off.
val successfulRunDuration = 5
var keepTrying = !killed
val redactedCommand = Utils.redactCommandLineArgs(conf, command.command)
.mkString("\"", "\" \"", "\"")
while (keepTrying) {
logInfo("Launch Command: " + redactedCommand)
synchronized {
if (killed) { return exitCode }
// Start the process, i.e. execute the command
process = Some(command.start())
initialize(process.get)
}
val processStart = clock.getTimeMillis()
// Wait for the process to exit and get its exit code
exitCode = process.get.waitFor()
// check if attempting another run
............
}
exitCode
}
prepareAndRunDriver first prepares everything the driver needs: it creates a working directory, downloads the user jar, and prepares the resource file; it then builds the command that will launch the application. With these in hand it calls runDriver, which mainly sets up the driver process's initialization (log redirection) before calling runCommandWithRetry to execute the launch command. As we saw earlier, the main class this command runs is "org.apache.spark.deploy.worker.DriverWrapper", so let us look at that class's main method:
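The retry logic of runCommandWithRetry can be sketched without real processes; runOnce below simulates command.start() plus waitFor(), and maxAttempts replaces the supervise/kill conditions (both are simplifications for illustration):

```scala
// Retry a failing "process" with exponential back-off between attempts.
// Returns the final exit code and the back-off delays that would have been slept.
def runWithRetry(runOnce: () => Int, maxAttempts: Int): (Int, Seq[Int]) = {
  var waitSeconds = 1                  // doubled after every failed attempt
  val waits = Seq.newBuilder[Int]
  var exitCode = -1
  var attempts = 0
  var keepTrying = true
  while (keepTrying) {
    exitCode = runOnce()               // stands in for process start + waitFor
    attempts += 1
    keepTrying = exitCode != 0 && attempts < maxAttempts
    if (keepTrying) {
      waits += waitSeconds             // the real code sleeps here
      waitSeconds *= 2
    }
  }
  (exitCode, waits.result())
}
```

In the real code, a run that lasts longer than successfulRunDuration seconds resets the back-off to 1 second, and retries only continue when supervise is set.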
def main(args: Array[String]): Unit = {
args.toList match {
case workerUrl :: userJar :: mainClass :: extraArgs =>
// Create the SparkConf
val conf = new SparkConf()
val host: String = Utils.localHostName()
val port: Int = sys.props.getOrElse(config.DRIVER_PORT.key, "0").toInt
// Create the RPC environment
val rpcEnv = RpcEnv.create("Driver", host, port, conf, new SecurityManager(conf))
logInfo(s"Driver address: ${rpcEnv.address}")
// Set up the WorkerWatcher
rpcEnv.setupEndpoint("workerWatcher", new WorkerWatcher(rpcEnv, workerUrl))
.................
// Load the application's main class via reflection
val clazz = Utils.classForName(mainClass)
// Obtain the application's main method
val mainMethod = clazz.getMethod("main", classOf[Array[String]])
// Invoke main with the remaining arguments
mainMethod.invoke(null, extraArgs.toArray[String])
rpcEnv.shutdown()
..........................
}
}
In DriverWrapper's main method, the Spark configuration is created first, then the RPC environment is set up along with a WorkerWatcher. The application's main class is then loaded via reflection and its main method is invoked. Our application code will typically create a SparkContext first, which in turn creates the DAGScheduler, TaskScheduler, and SchedulerBackend; a Job is triggered when an action is encountered. Those parts were covered in detail in earlier posts, so they are not repeated here.
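The getMethod/invoke call DriverWrapper uses on the user's main can be demonstrated against any JDK static method; here Integer.parseInt stands in for the application's main(Array[String]):

```scala
// DriverWrapper effectively does:
//   clazz.getMethod("main", classOf[Array[String]]).invoke(null, extraArgs.toArray)
// The same reflective pattern against a static JDK method:
val parseInt = classOf[Integer].getMethod("parseInt", classOf[String])
val value = parseInt.invoke(null, "42") // null receiver => static invocation
```

Passing null as the receiver is how java.lang.reflect.Method invokes a static method, which is why DriverWrapper requires the user class to expose a static main.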
Many questions came up while reading this source code; some have since been resolved, but a few remain open. For example:
- Why are two RPC environments created, one named driverClient and one named Driver? What is the difference between them?
- What exactly WorkerWatcher does has not been examined closely yet.