This section examines the Executor from the perspective of application submission, answering two questions: when exactly is the Executor started, and how does the Executor hand its results back to the application.

When Is the Executor Started?
After SparkContext starts, it calls the createTaskScheduler method; once the scheduler has been created, it invokes TaskScheduler's start method, which in practice is TaskSchedulerImpl.start. Inside TaskSchedulerImpl.start, the start method of SparkDeploySchedulerBackend (renamed StandaloneSchedulerBackend in Spark 2.0) is executed. SparkDeploySchedulerBackend.start wraps a Command and registers it with the Master, and the Master in turn asks a Worker to start the concrete Executor; the Command already carries the launch instruction, naming CoarseGrainedExecutorBackend as the entry class of the Executor process. SparkDeploySchedulerBackend.start then creates an AppClient with new. AppClient contains an inner class named ClientEndpoint; when the ClientEndpoint is created, the Command is passed in to specify that the entry class of the Executors started for the current application is CoarseGrainedExecutorBackend. ClientEndpoint extends ThreadSafeRpcEndpoint and communicates with the Master over the RPC mechanism. In ClientEndpoint's start method, a RegisterApplication request is sent to the Master via registerWithMaster. On receiving that request, the Master first records the application's information through registerApplication and then calls schedule to start Executors on Workers (see the chapter on Master-Driver and Worker-Executor internals for details).
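The chain of calls above can be made concrete with a few stub classes. These are hypothetical stand-ins, not Spark's real implementations; they only reproduce the order of events, down to the Command that names CoarseGrainedExecutorBackend as the Executor entry class:

```scala
// Stub classes sketching the startup chain:
// SparkContext -> TaskSchedulerImpl.start -> StandaloneSchedulerBackend.start
//   -> ClientEndpoint.registerWithMaster -> Master receives RegisterApplication
object StartupChainSketch {
  val calls = scala.collection.mutable.ArrayBuffer[String]()

  class Master {
    def receiveRegisterApplication(entryClass: String): Unit =
      calls += s"Master: RegisterApplication(command=$entryClass)"
  }

  class ClientEndpoint(master: Master, command: String) {
    def registerWithMaster(): Unit = {
      calls += "ClientEndpoint.registerWithMaster"
      master.receiveRegisterApplication(command)
    }
  }

  class StandaloneSchedulerBackend(master: Master) {
    def start(): Unit = {
      calls += "StandaloneSchedulerBackend.start"
      // the Command wraps the Executor's entry class
      new ClientEndpoint(master,
        "org.apache.spark.executor.CoarseGrainedExecutorBackend").registerWithMaster()
    }
  }

  class TaskSchedulerImpl(backend: StandaloneSchedulerBackend) {
    def start(): Unit = { calls += "TaskSchedulerImpl.start"; backend.start() }
  }

  def main(args: Array[String]): Unit = {
    // what SparkContext does after createTaskScheduler returns
    val scheduler = new TaskSchedulerImpl(new StandaloneSchedulerBackend(new Master))
    scheduler.start()
    calls.foreach(println)
  }
}
```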
override def receive: PartialFunction[Any, Unit] = {
  ......
  case RegisterApplication(description, driver) => {
    if (state == RecoveryState.STANDBY) {
      // ignore: a standby Master does not register applications
    } else {
      val app = createApplication(description, driver)
      registerApplication(app)
      persistenceEngine.addApplication(app)
      driver.send(RegisteredApplication(app.id, self))
      schedule()
    }
  }
  ......
}
On matching the received RegisterApplication request, the Master first checks whether it is in STANDBY (backup) state. If it is instead ALIVE, it calls the createApplication method to build an ApplicationInfo, registers that ApplicationInfo through registerApplication, persists it through persistenceEngine.addApplication, and finally sends a registration-success message back to the AppClient via driver.send.
The createApplication method takes an ApplicationDescription and an RpcEndpointRef and returns an ApplicationInfo. The source of Master's createApplication method:
private def createApplication(
    desc: ApplicationDescription,
    driver: RpcEndpointRef): ApplicationInfo = {
  val now = System.currentTimeMillis()
  val date = new Date(now)
  val appId = newApplicationId(date)
  new ApplicationInfo(now, appId, desc, date, driver, defaultCores)
}
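For illustration, newApplicationId derives the application id from the submission date plus a monotonically increasing counter. The sketch below assumes the `app-yyyyMMddHHmmss-NNNN` format used by Spark's Master; treat the exact pattern as an assumption:

```scala
import java.text.SimpleDateFormat
import java.util.Date

// A sketch of newApplicationId: submission timestamp plus a zero-padded counter.
object AppIdSketch {
  private val createDateFormat = new SimpleDateFormat("yyyyMMddHHmmss")
  private var nextAppNumber = 0  // incremented per registered application

  def newApplicationId(submitDate: Date): String = {
    val appId = "app-%s-%04d".format(createDateFormat.format(submitDate), nextAppNumber)
    nextAppNumber += 1
    appId
  }
}
```

For a submission on 2024-01-01 at midnight, the first id would read `app-20240101000000-0000`, the next `app-20240101000000-0001`, and so on.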
The resulting ApplicationInfo object is then passed to registerApplication, which completes the application's registration. The source of the registerApplication method:
private def registerApplication(app: ApplicationInfo): Unit = {
  val appAddress = app.driver.address  // the Driver's address, used for Master-Driver communication
  if (addressToApp.contains(appAddress)) {
    return  // this Driver address has already registered, so return directly
  }
  applicationMetricsSystem.registerSource(app.appSource)  // register with the metrics system
  apps += app                      // apps is a HashSet, which keeps entries unique
  idToApp(app.id) = app            // idToApp is a HashMap from application id to app
  endpointToApp(app.driver) = app  // endpointToApp is a HashMap from Driver endpoint to app
  addressToApp(appAddress) = app   // addressToApp maps the Driver address to app
  waitingApps += app               // queue the app for scheduling
}
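The bookkeeping above reduces to a standalone sketch (with a simplified stand-in AppInfo type, not Spark's ApplicationInfo) that shows why a duplicate registration from the same Driver address is a no-op:

```scala
import scala.collection.mutable

// Simplified mirror of the Master's registration maps.
object RegisterSketch {
  case class AppInfo(id: String, driverAddress: String)

  val apps = mutable.HashSet[AppInfo]()
  val idToApp = mutable.HashMap[String, AppInfo]()
  val addressToApp = mutable.HashMap[String, AppInfo]()
  val waitingApps = mutable.ArrayBuffer[AppInfo]()

  def registerApplication(app: AppInfo): Unit = {
    // a Driver address that is already present means this Driver registered before
    if (addressToApp.contains(app.driverAddress)) return
    apps += app
    idToApp(app.id) = app
    addressToApp(app.driverAddress) = app
    waitingApps += app  // eligible for the next schedule() pass
  }
}
```

Registering two applications that claim the same Driver address leaves only the first in waitingApps; the second is silently dropped, exactly as the early return in the real method does.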
After registration completes, the schedule method is called. It serves two purposes: first, Driver scheduling, dispatching the Drivers held in the waitingDrivers array to Workers that satisfy their requirements; second, starting Executors for applications on Worker nodes. schedule is invoked on every new Driver registration, every new application registration, and every resource change. It matches the applications currently waiting to be scheduled against the available resources and starts Executors on qualifying Worker nodes by calling startExecutorsOnWorkers (see the chapter on Master-Driver and Worker-Executor internals for details).
startExecutorsOnWorkers starts Executors by calling scheduleExecutorsOnWorkers, which supports two placement strategies: spreading Executors round-robin across Workers, or filling up each Worker in turn. The source of Master's scheduleExecutorsOnWorkers method:
private def scheduleExecutorsOnWorkers(
    app: ApplicationInfo,
    usableWorkers: Array[WorkerInfo],
    spreadOutApps: Boolean): Array[Int] = {
  val coresPerExecutor = app.desc.coresPerExecutor
  val minCoresPerExecutor = coresPerExecutor.getOrElse(1)
  val oneExecutorPerWorker = coresPerExecutor.isEmpty
  val memoryPerExecutor = app.desc.memoryPerExecutorMB
  val numUsable = usableWorkers.length
  val assignedCores = new Array[Int](numUsable)     // Number of cores to give to each worker
  val assignedExecutors = new Array[Int](numUsable) // Number of new executors on each worker
  var coresToAssign = math.min(app.coresLeft, usableWorkers.map(_.coresFree).sum)

  /** Return whether the specified worker can launch an executor for this app. */
  def canLaunchExecutor(pos: Int): Boolean = {
    val keepScheduling = coresToAssign >= minCoresPerExecutor
    val enoughCores = usableWorkers(pos).coresFree - assignedCores(pos) >= minCoresPerExecutor
    val launchingNewExecutor = !oneExecutorPerWorker || assignedExecutors(pos) == 0
    if (launchingNewExecutor) {
      val assignedMemory = assignedExecutors(pos) * memoryPerExecutor
      val enoughMemory = usableWorkers(pos).memoryFree - assignedMemory >= memoryPerExecutor
      val underLimit = assignedExecutors.sum + app.executors.size < app.executorLimit
      keepScheduling && enoughCores && enoughMemory && underLimit
    } else {
      ......
    }
  }
  ......
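The two strategies selected by spreadOutApps can be simulated in isolation. The sketch below keeps only the core-counting logic (memory and executor limits are omitted, and Worker is a stand-in case class, not Spark's WorkerInfo):

```scala
// Standalone simulation of the spread-out vs. fill-up placement strategies.
object ScheduleSketch {
  case class Worker(coresFree: Int)

  /** Returns the number of cores assigned to each worker. */
  def schedule(coresWanted: Int, workers: Array[Worker],
               minCoresPerExecutor: Int, spreadOut: Boolean): Array[Int] = {
    val assigned = new Array[Int](workers.length)
    var coresToAssign = coresWanted

    // a worker qualifies while cores remain to assign and it still has free cores
    def canLaunch(pos: Int): Boolean =
      coresToAssign >= minCoresPerExecutor &&
        workers(pos).coresFree - assigned(pos) >= minCoresPerExecutor

    var freeWorkers = workers.indices.filter(canLaunch)
    while (freeWorkers.nonEmpty) {
      freeWorkers.foreach { pos =>
        var keepScheduling = true
        while (keepScheduling && canLaunch(pos)) {
          coresToAssign -= minCoresPerExecutor
          assigned(pos) += minCoresPerExecutor
          // spread-out: move to the next worker after one allocation;
          // fill-up: keep allocating on this worker until it is exhausted
          if (spreadOut) keepScheduling = false
        }
      }
      freeWorkers = freeWorkers.filter(canLaunch)
    }
    assigned
  }
}
```

With 6 cores to place on three 4-core Workers, the spread-out strategy yields (2, 2, 2), while the fill-up strategy yields (4, 2, 0): the same total, distributed very differently.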
After scheduleExecutorsOnWorkers has assigned resources to the application in this logical sense, no resources have actually been allocated on the Worker nodes yet; the real allocation happens when allocateWorkerResourceToExecutors is called.
private def allocateWorkerResourceToExecutors(
    app: ApplicationInfo,
    ......
    val exec = app.addExecutor(worker, coresToAssign)
    launchExecutor(worker, exec)
    app.state = ApplicationState.RUNNING
}
The source of the launchExecutor method:
private def launchExecutor(worker: WorkerInfo, exec: ExecutorDesc): Unit = {
  logInfo("Launching executor " + exec.fullId + " on worker " + worker.id)
  worker.addExecutor(exec)
  worker.endpoint.send(LaunchExecutor(masterUrl,
    exec.application.id, exec.id, exec.application.desc, exec.cores, exec.memory))
  exec.application.driver.send(
    ExecutorAdded(exec.id, worker.id, worker.hostPort, exec.cores, exec.memory))
}
The Worker handles the LaunchExecutor message (sent via worker.endpoint.send) accordingly: on matching LaunchExecutor, it builds an ExecutorRunner object and calls its start method. ExecutorRunner's start method calls fetchAndRunExecutor, in which CommandUtils.buildProcessBuilder(appDesc.command, ...) receives the entry class org.apache.spark.executor.CoarseGrainedExecutorBackend; so when the ExecutorRunner starts on the Worker node, it launches a CoarseGrainedExecutorBackend process. In CoarseGrainedExecutorBackend's onStart method, a RegisterExecutor registration request is sent to the Driver. The Driver side handles this request in CoarseGrainedSchedulerBackend's receiveAndReply method, whose source is:
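What fetchAndRunExecutor ultimately does on the Worker can be pictured as assembling a java command line whose main class is CoarseGrainedExecutorBackend, then forking that process. The helper and the flag names below are illustrative assumptions, not a faithful copy of Spark's CommandUtils:

```scala
// Hypothetical sketch of the command line a Worker builds for an Executor process.
object LaunchSketch {
  def buildCommand(javaHome: String, classpath: String,
                   executorId: String, driverUrl: String): Seq[String] =
    Seq(s"$javaHome/bin/java", "-cp", classpath,
      "org.apache.spark.executor.CoarseGrainedExecutorBackend",  // the entry class
      "--executor-id", executorId,
      "--driver-url", driverUrl)  // flag names are assumptions for illustration
}
```

A java.lang.ProcessBuilder fed such a sequence would start the backend as a separate JVM, which is why the Executor survives independently of the Worker's own JVM threads.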
override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
  case RegisterExecutor(executorId, executorRef, hostPort, cores, logUrls) =>
    if (executorDataMap.contains(executorId)) {
      ......
    }
}
The Driver sends a RegisteredExecutor message to CoarseGrainedExecutorBackend. On receiving it, CoarseGrainedExecutorBackend creates a new Executor and from then on acts as the messenger between that Executor and the Driver. The source of the receive method in which CoarseGrainedExecutorBackend handles RegisteredExecutor:
override def receive: PartialFunction[Any, Unit] = {
  case RegisteredExecutor(hostname) =>
    logInfo("Successfully registered with driver")
    executor = new Executor(executorId, hostname, env, userClassPath, isLocal = false)
  ......
}
As the receive method shows, on receiving RegisteredExecutor, CoarseGrainedExecutorBackend creates an org.apache.spark.executor.Executor object; at this point the Executor's creation is complete.
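To recap, the whole handshake can be written down as a sequence of plain case-class messages. This is a standalone sketch; the real message types live in Spark's DeployMessages and CoarseGrainedClusterMessages, with richer payloads than shown here:

```scala
// The five messages that take an application from submission to a live Executor.
object HandshakeSketch {
  sealed trait Msg
  case class RegisterApplication(appDesc: String) extends Msg   // AppClient -> Master
  case class RegisteredApplication(appId: String) extends Msg   // Master -> AppClient
  case class LaunchExecutor(appId: String, execId: Int) extends Msg // Master -> Worker
  case class RegisterExecutor(execId: Int) extends Msg          // backend -> Driver
  case class RegisteredExecutor() extends Msg                   // Driver -> backend

  /** The ordering that ends with `new Executor(...)` on the Worker. */
  val flow: Seq[Msg] = Seq(
    RegisterApplication("app-desc"),
    RegisteredApplication("app-20240101000000-0000"),  // illustrative app id
    LaunchExecutor("app-20240101000000-0000", 0),
    RegisterExecutor(0),
    RegisteredExecutor())
}
```

Reading the flow top to bottom retraces this section: application registration on the Master, Executor launch on a Worker, and finally the Executor's own registration back with the Driver.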