SchedulerBackend touches on Netty internals that I have not fully studied yet, so for now this post covers only part of the topic; I will keep adding to it later.
1 What is SchedulerBackend?
First, let's look at how SchedulerBackend is used in Spark.
As shown in Source 1, SparkContext.scala holds a SchedulerBackend instance. The createTaskScheduler() method creates two instances at once, scheduler and backend, which are instances of the TaskScheduler and SchedulerBackend classes respectively.
The TaskScheduler implementation is the same under every deploy mode, always TaskSchedulerImpl, while the SchedulerBackend implementation varies with the deploy mode; the source below shows only one of them.
As shown in Source 2, TaskScheduler is started via its start() method, which under the hood actually calls the backend's start() method. Why does it call the backend's start() method? What does that method do? What exactly is SchedulerBackend for? Let's move on to Section 2.
// Source 1, from SparkContext.scala
// Create and start the scheduler
val (sched, ts) = SparkContext.createTaskScheduler(this, master, deployMode)
_schedulerBackend = sched
_taskScheduler = ts
_dagScheduler = new DAGScheduler(this)
_heartbeatReceiver.ask[Boolean](TaskSchedulerIsSet)
// start TaskScheduler after taskScheduler sets DAGScheduler reference in DAGScheduler's
// constructor
_taskScheduler.start()
/**
 * Create a task scheduler based on a given master URL.
 * Return a 2-tuple of the scheduler backend and the task scheduler.
 */
private def createTaskScheduler(
    sc: SparkContext,
    master: String,
    deployMode: String): (SchedulerBackend, TaskScheduler) = {
  import SparkMasterRegex._
  ...
  master match {
    case "local" =>
      val scheduler = new TaskSchedulerImpl(sc, MAX_LOCAL_TASK_FAILURES, isLocal = true)
      val backend = new LocalSchedulerBackend(sc.getConf, scheduler, 1)
      scheduler.initialize(backend)
      (backend, scheduler)
    ...
  }
  ...
}
// Source 2, from TaskSchedulerImpl.scala
override def start() {
  backend.start()
  ...
}
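To make the delegation in Sources 1 and 2 concrete, here is a stripped-down sketch. These are my own toy classes, not Spark's real ones: the scheduler manages no cluster resources itself; its start() simply hands off to whichever backend it was constructed with.

```scala
// Toy model of the scheduler/backend delegation (not real Spark code).
trait SchedulerBackend {
  def start(): Unit
}

class LocalSchedulerBackend extends SchedulerBackend {
  var started = false
  // A real backend would set up executor resources here.
  override def start(): Unit = { started = true }
}

class TaskSchedulerImpl(backend: SchedulerBackend) {
  // Delegates to the backend, as in TaskSchedulerImpl.scala's start().
  def start(): Unit = backend.start()
}

object Demo extends App {
  val backend = new LocalSchedulerBackend
  val scheduler = new TaskSchedulerImpl(backend)
  scheduler.start()
  println(s"backend started: ${backend.started}")  // backend started: true
}
```

This mirrors why TaskSchedulerImpl can stay identical across deploy modes: all mode-specific resource work lives behind the SchedulerBackend interface.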
2 The role of SchedulerBackend
Why does the scheduler call the backend's start() method? What does that method do? What exactly is SchedulerBackend for?
Let's start with the SchedulerBackend source. It is a trait, implemented by different subclasses, but all subclasses do essentially the same job: allocate computing resources (i.e., Executors) to the Tasks currently waiting for them, and launch those Tasks on the Executors, completing the resource-scheduling process.
Important methods such as start(), stop(), and reviveOffers() are not covered here; they will be introduced one by one later, as they come up.
// Source 3, from SchedulerBackend.scala
/**
 * A backend interface for scheduling systems that allows plugging in different ones under
 * TaskSchedulerImpl. We assume a Mesos-like model where the application gets resource offers as
 * machines become available and can launch tasks on them.
 */
private[spark] trait SchedulerBackend {
  private val appId = "spark-application-" + System.currentTimeMillis

  def start(): Unit
  def stop(): Unit
  def reviveOffers(): Unit
  def defaultParallelism(): Int

  /**
   * Requests that an executor kills a running task.
   *
   * @param taskId Id of the task.
   * @param executorId Id of the executor the task is running on.
   * @param interruptThread Whether the executor should interrupt the task thread.
   * @param reason The reason for the task kill.
   */
  def killTask(
      taskId: Long,
      executorId: String,
      interruptThread: Boolean,
      reason: String): Unit =
    throw new UnsupportedOperationException

  def isReady(): Boolean = true

  /**
   * Get an application ID associated with the job.
   *
   * @return An application ID
   */
  def applicationId(): String = appId

  /**
   * Get the attempt ID for this run, if the cluster manager supports multiple
   * attempts. Applications run in client mode will not have attempt IDs.
   *
   * @return The application attempt id, if available.
   */
  def applicationAttemptId(): Option[String] = None

  /**
   * Get the URLs for the driver logs. These URLs are used to display the links in the UI
   * Executors tab for the driver.
   * @return Map containing the log names and their respective URLs
   */
  def getDriverLogUrls: Option[Map[String, String]] = None

  /**
   * Get the max number of tasks that can be concurrent launched currently.
   * Note that please don't cache the value returned by this method, because the number can change
   * due to add/remove executors.
   *
   * @return The max number of tasks that can be concurrent launched currently.
   */
  def maxNumConcurrentTasks(): Int
}
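As a rough illustration of what a concrete subclass must supply, here is a toy backend against a simplified copy of the trait's abstract surface. SimpleBackend, its cores parameter, and its behavior are invented for this sketch; they are not how any real Spark backend is implemented.

```scala
// Simplified copy of the trait's key members (illustration only).
trait SchedulerBackend {
  private val appId = "spark-application-" + System.currentTimeMillis
  def start(): Unit
  def stop(): Unit
  def reviveOffers(): Unit
  def defaultParallelism(): Int
  def maxNumConcurrentTasks(): Int
  def isReady(): Boolean = true
  def applicationId(): String = appId
}

// Hypothetical backend that "manages" a fixed pool of local cores.
class SimpleBackend(cores: Int) extends SchedulerBackend {
  private var running = false
  override def start(): Unit = { running = true }
  override def stop(): Unit = { running = false }
  override def reviveOffers(): Unit = {
    // A real backend would offer free executor cores to the TaskScheduler
    // here, then launch whatever tasks the scheduler decides to run.
  }
  override def defaultParallelism(): Int = cores
  override def maxNumConcurrentTasks(): Int = if (running) cores else 0
}

object TraitDemo extends App {
  val b = new SimpleBackend(4)
  println(b.maxNumConcurrentTasks())  // 0: nothing can run before start()
  b.start()
  println(b.maxNumConcurrentTasks())  // 4
}
```

Note how maxNumConcurrentTasks() depends on the backend's current state, which is exactly why the real trait's scaladoc warns against caching its return value.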