Spark Core Source Code Analysis 6: Spark Job Submission

Blog: http://blog.csdn.net/yueqian_zhu/

This section mainly covers the logic of SparkContext.
Let's start with one of the simplest examples that ships with Spark:
import scala.math.random

import org.apache.spark.{SparkConf, SparkContext}

object SparkPi {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Spark Pi")
    val spark = new SparkContext(conf)
    val slices = if (args.length > 0) args(0).toInt else 2
    val n = math.min(100000L * slices, Int.MaxValue).toInt // avoid overflow
    val count = spark.parallelize(1 until n, slices).map { i =>
      val x = random * 2 - 1
      val y = random * 2 - 1
      if (x*x + y*y < 1) 1 else 0
    }.reduce(_ + _)
    println("Pi is roughly " + 4.0 * count / n)
    spark.stop()
  }
}
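The Spark programs we usually write follow much the same flow. Starting from this simple program, we will work through the internals step by step. Personally, I think this is where the real essence of Spark lies; the master and worker startup flows covered earlier are not very different from those of a typical distributed system.
First, a SparkConf is created, which loads Spark configuration settings.
Next, a SparkContext is created. preferredNodeLocationData may be specified when creating it, but is optional.
Creating a SparkContext is fairly involved, so we only cover the more important objects and methods.
1. Various SparkListener instances can be added to listenerBus. Whenever a SparkListenerEvent arrives, it is sent to every registered listener.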
// An asynchronous listener bus for Spark events
private[spark] val listenerBus = new LiveListenerBus
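The dispatch idea behind a listener bus can be sketched in plain Scala. This is only an illustration: Event, Listener, SimpleListenerBus and RecordingListener are hypothetical names, not Spark classes, and the sketch delivers events synchronously, whereas the comment above describes the real bus as asynchronous.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-ins for SparkListenerEvent and SparkListener.
sealed trait Event
case class JobStart(jobId: Int) extends Event
case class JobEnd(jobId: Int) extends Event

trait Listener {
  def onEvent(event: Event): Unit
}

// Simplified, synchronous bus: every posted event goes to every registered listener.
class SimpleListenerBus {
  private val listeners = ArrayBuffer.empty[Listener]

  def addListener(listener: Listener): Unit = {
    listeners += listener
    ()
  }

  def post(event: Event): Unit = listeners.foreach(_.onEvent(event))
}

// A listener that only records the events it has seen.
class RecordingListener extends Listener {
  val seen = ArrayBuffer.empty[Event]

  override def onEvent(event: Event): Unit = {
    seen += event
    ()
  }
}

object ListenerBusSketch {
  // Registers two listeners, posts two events, and returns what each listener saw.
  def demo(): (Seq[Event], Seq[Event]) = {
    val bus = new SimpleListenerBus
    val a = new RecordingListener
    val b = new RecordingListener
    bus.addListener(a)
    bus.addListener(b)
    bus.post(JobStart(0))
    bus.post(JobEnd(0))
    (a.seen.toList, b.seen.toList)
  }
}
```

Both listeners receive the same events in posting order, which is the behaviour item 1 describes.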
2. persistentRdds keeps track of the RDDs that have been persisted (cached)
// Keeps track of all persisted RDDs
private[spark] val persistentRdds = new TimeStampedWeakValueHashMap[Int, RDD[_]]
3. Create SparkEnv -> calls createDriverEnv
// Create the Spark execution environment (cache, map output tracker, etc)
_env = createSparkEnv(_conf, isLocal, listenerBus)
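Flow:
   1) Create the driver's ActorRef and wrap it in rpcEnv.
   2) Create mapOutputTracker. Its actual type is MapOutputTrackerMaster, and it tracks map output information. The object is registered with MapOutputTrackerMasterEndpoint. A note on what registration does: it returns mapOutputTracker.trackerEndpoint (of type ActorRef), and messages later sent to that ActorRef call back into the relevant methods of mapOutputTracker. For example, sending an AkkaMessage invokes receiveAndReply or receive on MapOutputTrackerMasterEndpoint.
   3) Create shuffleManager. org.apache.spark.shuffle.hash.HashShuffleManager was the default before Spark 1.2; from 1.2 onward the default is org.apache.spark.shuffle.sort.SortShuffleManager.
   4) Create shuffleMemoryManager.
   5) Create blockTransferService. The default is netty; this is the service used to fetch blocks during shuffle.
   6) Create blockManagerMaster, which records which Worker each BlockId is stored on.
   7) Create blockManager, which provides the actual read and write interface.
   8) Create cacheManager. It depends on blockManager: when an RDD is computed, data is obtained through CacheManager and the computed results are stored through CacheManager.
   9) Create broadcastManager.
   10) Create httpFileServer. Both the Driver and the Executors may depend on third-party packages at runtime. The Driver case is simple: spark-submit specifies at submission time where the dependent jar files are read from. Executors are launched by workers, and a worker has to download the jar files an Executor needs at startup. To solve this, the Driver starts an HttpFileServer at startup to hold the third-party jars, and the workers fetch them from the HttpFileServer.
   11) Create outputCommitCoordinator.
   12) Create executorMemoryManager.
   All of the objects above are wrapped together into SparkEnv.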
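4. Create _metadataCleaner, which periodically cleans up metadata
5. Create executorEnvs, which holds Executor-related settings (the environment variables passed to Executors)
6. Create _heartbeatReceiver, which receives heartbeats from Executors and also starts a timer to detect Executors that have expired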
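The heartbeat-plus-expiry idea in item 6 can be sketched as follows. This is a minimal model under stated assumptions, not Spark's HeartbeatReceiver: HeartbeatTracker, its method names, and the timeout value used in any example are all hypothetical, and the periodic timer is replaced by an explicit call that takes the current time as a parameter.

```scala
import scala.collection.mutable

// Hypothetical model of heartbeat tracking; names are illustrative only.
class HeartbeatTracker(timeoutMs: Long) {
  // executorId -> timestamp (ms) of the last heartbeat received
  private val lastSeen = mutable.HashMap.empty[String, Long]

  // Called whenever a heartbeat from an executor arrives.
  def heartbeat(executorId: String, nowMs: Long): Unit =
    lastSeen.update(executorId, nowMs)

  // Called by a periodic timer: removes and returns the executors that have
  // been silent for longer than the timeout.
  def expireDeadExecutors(nowMs: Long): Set[String] = {
    val dead = lastSeen.collect { case (id, t) if nowMs - t > timeoutMs => id }.toSet
    dead.foreach(id => lastSeen.remove(id))
    dead
  }

  // Executors that are still being tracked.
  def alive: Set[String] = lastSeen.keySet.toSet
}
```

Passing the clock in as an argument keeps the sketch deterministic; a real receiver would read the system time inside its timer callback.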
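7. Call createTaskScheduler to create _taskScheduler and _schedulerBackend
  1) The logic branches on the master setting. We use standalone mode (a master URL beginning with spark://) as the example.
  2) The task scheduler actually created is TaskSchedulerImpl, and the backend is SparkDeploySchedulerBackend, which extends CoarseGrainedSchedulerBackend. CoarseGrainedSchedulerBackend is a coarse-grained resource scheduling class built on Akka Actors. For the whole lifetime of a Spark job it tracks and holds the Executor resources registered with it, handles Executor registration and status updates, responds to scheduler requests, and starts task scheduling based on the Executors currently available. In short, the scheduler and the backend divide the work between them and together complete the task scheduling flow.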
case SPARK_REGEX(sparkUrl) =>
        val scheduler = new TaskSchedulerImpl(sc) // task-level scheduling
        val masterUrls = sparkUrl.split(",").map("spark://" + _)
        val backend = new SparkDeploySchedulerBackend(scheduler, sc, masterUrls)
        scheduler.initialize(backend)
        (backend, scheduler)
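How a standalone master string turns into a list of master URLs can be tried out in isolation. The sketch below assumes a regex of the form spark://(.*) for the pattern match; MasterUrlSketch and masterUrls are hypothetical names used only for illustration.

```scala
object MasterUrlSketch {
  // Assumed pattern: matches a standalone master string and captures
  // everything after the "spark://" prefix.
  private val SparkRegex = """spark://(.*)""".r

  // Returns one "spark://host:port" URL per listed master,
  // or Nil when the master string is not a standalone URL.
  def masterUrls(master: String): Seq[String] = master match {
    case SparkRegex(sparkUrl) => sparkUrl.split(",").map("spark://" + _).toList
    case _ => Nil
  }
}
```

A comma-separated list such as spark://host1:7077,host2:7077 therefore yields one URL per master, which is why the backend receives an array of masters rather than a single address.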
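  3) Scheduler initialization
     A note on the role of Pool: a SparkContext may have several runnable task sets at the same time that do not depend on one another. How these task sets are scheduled relative to each other is decided by the pool. The default is FIFO; a Fair scheduler is also available.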
def initialize(backend: SchedulerBackend) {
  this.backend = backend
  // temporarily set rootPool name to empty
  rootPool = new Pool("", schedulingMode, 0, 0)
  schedulableBuilder = {
    schedulingMode match {
      case SchedulingMode.FIFO =>
        new FIFOSchedulableBuilder(rootPool)
      case SchedulingMode.FAIR =>
        new FairSchedulableBuilder(rootPool, conf)
    }
  }
  schedulableBuilder.buildPools()
}
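To make the FIFO mode more concrete, here is a small sketch of a FIFO-style ordering between task sets. It assumes that each task set carries a priority (taken here to be the id of the job that produced it) and a stage id, and that lower values run first. TaskSetInfo and FifoOrderSketch are hypothetical names, and the FAIR mode is not modelled.

```scala
// Hypothetical stand-in for a schedulable task set.
case class TaskSetInfo(name: String, priority: Int, stageId: Int)

object FifoOrderSketch {
  // FIFO-style comparison: the lower priority value goes first
  // (assumed to be the job id); ties are broken by the lower stage id.
  def fifoLessThan(a: TaskSetInfo, b: TaskSetInfo): Boolean =
    if (a.priority != b.priority) a.priority < b.priority
    else a.stageId < b.stageId

  // Returns the task set names in the order a FIFO pool would offer them.
  def order(taskSets: Seq[TaskSetInfo]): Seq[String] =
    taskSets.sortWith(fifoLessThan).map(_.name)
}
```

Under these assumptions, task sets from an earlier job always come before those of a later job, and within one job the earlier stage comes first.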
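8. Create _dagScheduler, which splits our program into stages and builds task sets with dependencies between them. DAGScheduler starts an internal event loop that processes received events one after another.
9. Call _taskScheduler.start() -> backend.start(). This creates driverEndpoint, which handles communication with the outside, and builds the environment needed to run Executors: the app name, the cores and memory each Executor needs, the classpath, jars, and arguments. The class to run is org.apache.spark.executor.CoarseGrainedExecutorBackend. All of this is packaged into an ApplicationDescription. The ApplicationDescription, the masters, and so on are then wrapped in an AppClient, which is the entry point through which the App is submitted to the masters.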
override def start() {
  super.start()
  // ... (omitted)
  val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend",
    args, sc.executorEnvs, classPathEntries ++ testingClassPath, libraryPathEntries, javaOpts)
  val appUIAddress = sc.ui.map(_.appUIAddress).getOrElse("")
  val coresPerExecutor = conf.getOption("spark.executor.cores").map(_.toInt)
  val appDesc = new ApplicationDescription(sc.appName, maxCores, sc.executorMemory,
    command, appUIAddress, sc.eventLogDir, sc.eventLogCodec, coresPerExecutor)
  client = new AppClient(sc.env.actorSystem, masters, appDesc, this, conf)
  client.start()
  waitForRegistration()
}
Looking inside client.start(): it creates an ActorRef backed by a ClientActor object. Follow preStart() -> registerWithMaster, which calls tryRegisterAllMasters:
def tryRegisterAllMasters() {
      for (masterAkkaUrl <- masterAkkaUrls) {
        logInfo("Connecting to master " + masterAkkaUrl + "...")
        val actor = context.actorSelection(masterAkkaUrl)
        actor ! RegisterApplication(appDescription)
      }
    }
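As you can see, this simply sends a RegisterApplication message to the actorRef of each master.
How does the master handle this message?
When the active master receives it, it saves the app's details, creates an appId, persists the app, replies with a RegisteredApplication message, and then runs scheduling. The scheduling flow was covered in "Spark Core Source Code Analysis 2: Master Startup Flow".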
case RegisterApplication(description) => {
      if (state == RecoveryState.STANDBY) {
        // ignore, don't send response
      } else {
        logInfo("Registering app " + description.name)
        val app = createApplication(description, sender)
        registerApplication(app) // save the app's details in the master's in-memory data structures
        logInfo("Registered app " + description.name + " with ID " + app.id)
        persistenceEngine.addApplication(app) // persist the app so it can be rebuilt on master failover
        sender ! RegisteredApplication(app.id, masterUrl)
        schedule() // run scheduling
      }
    }
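The register-with-every-master handshake can be modelled without Akka. In this sketch, ToyMaster, Registered and registerWithAll are hypothetical names, only two recovery states are modelled, the "app-N" id format is made up for the example, and message passing is replaced by ordinary method calls.

```scala
import scala.collection.mutable

object MasterRegistrationSketch {
  sealed trait RecoveryState
  case object Standby extends RecoveryState
  case object Alive extends RecoveryState

  // Hypothetical reply type standing in for RegisteredApplication.
  case class Registered(appId: String, masterUrl: String)

  class ToyMaster(val masterUrl: String, val state: RecoveryState) {
    // appId -> app name, in registration order
    private val apps = mutable.LinkedHashMap.empty[String, String]
    private var nextAppNumber = 0

    // A standby master stays silent; an active one records the app and replies.
    def register(appName: String): Option[Registered] = state match {
      case Standby => None
      case Alive =>
        val appId = s"app-$nextAppNumber" // illustrative id format only
        nextAppNumber += 1
        apps.update(appId, appName)
        Some(Registered(appId, masterUrl))
    }

    def registeredApps: Seq[String] = apps.keys.toList
  }

  // The client contacts every master and collects whichever replies come back.
  def registerWithAll(masters: Seq[ToyMaster], appName: String): Seq[Registered] =
    masters.flatMap(_.register(appName))
}
```

Only the active master answers, so the single reply tells the client which master is in charge.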
When AppClient receives the RegisteredApplication message, it identifies the active master, marks the app as registered, and stores the AppId returned by the master.
case RegisteredApplication(appId_, masterUrl) =>
  appId = appId_
  registered = true
  changeMaster(masterUrl)
  listener.connected(appId)
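In "Spark Core Source Code Analysis 2: Master Startup Flow" we covered the master side of scheduling. At that point no app had registered yet, so no command to launch an Executor was sent to any worker. Now that an App has registered, the master calls launchExecutor(worker, exec), which sends a LaunchExecutor message to the worker. It also sends an ExecutorAdded message to the AppClient.
On receiving the message, the worker creates the working directory and an ExecutorRunner. Once started, the ExecutorRunner does its work in a separate thread and launches a process from the command packaged earlier. The main class is CoarseGrainedExecutorBackend; its arguments and other runtime settings are all contained in appDesc, which the driver passed along via the master. When this is done, the worker reports back to the master with an ExecutorStateChanged message.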
case LaunchExecutor(masterUrl, appId, execId, appDesc, cores_, memory_) =>
  if (masterUrl != activeMasterUrl) {
    logWarning("Invalid Master (" + masterUrl + ") attempted to launch executor.")
  } else {
    try {
      logInfo("Asked to launch executor %s/%d for %s".format(appId, execId, appDesc.name))

      // Create the executor's working directory
      val executorDir = new File(workDir, appId + "/" + execId)
      if (!executorDir.mkdirs()) {
        throw new IOException("Failed to create directory " + executorDir)
      }

      // Create local dirs for the executor. These are passed to the executor via the
      // SPARK_EXECUTOR_DIRS environment variable, and deleted by the Worker when the
      // application finishes.
      val appLocalDirs = appDirectories.get(appId).getOrElse {
        Utils.getOrCreateLocalRootDirs(conf).map { dir =>
          Utils.createDirectory(dir, namePrefix = "executor").getAbsolutePath()
        }.toSeq
      }
      appDirectories(appId) = appLocalDirs
      val manager = new ExecutorRunner(
        appId,
        execId,
        appDesc.copy(command = Worker.maybeUpdateSSLSettings(appDesc.command, conf)),
        cores_,
        memory_,
        self,
        workerId,
        host,
        webUi.boundPort,
        publicAddress,
        sparkHome,
        executorDir,
        akkaUrl,
        conf,
        appLocalDirs, ExecutorState.LOADING)
      executors(appId + "/" + execId) = manager
      manager.start()
      coresUsed += cores_
      memoryUsed += memory_
      master ! ExecutorStateChanged(appId, execId, manager.state, None, None)
    } catch {
      case e: Exception => {
        logError(s"Failed to launch executor $appId/$execId for ${appDesc.name}.", e)
        if (executors.contains(appId + "/" + execId)) {
          executors(appId + "/" + execId).kill()
          executors -= appId + "/" + execId
        }
        master ! ExecutorStateChanged(appId, execId, ExecutorState.FAILED,
          Some(e.toString), None)
      }
    }
  }
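On receiving this message, the master acts according to the Executor's state. So when are these messages sent?
  (1) After the CoarseGrainedExecutorBackend process exits, ExecutorStateChanged is sent to the master with state EXITED.
  (2) When AppClient receives the ExecutorAdded message, it sends ExecutorStateChanged to the master with state RUNNING.
  (3) When ExecutorRunner fails to start the process, ExecutorStateChanged is sent to the master with state FAILED.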
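The state-based branching can be illustrated with a small sketch. The state names are the ones mentioned in this section; treating EXITED and FAILED as "finished" and the actions described in the returned strings are assumptions made for illustration, not the master's actual code.

```scala
object ExecutorStateSketch {
  // State names as they appear in the text above.
  sealed trait ExecState
  case object LOADING extends ExecState
  case object RUNNING extends ExecState
  case object EXITED extends ExecState
  case object FAILED extends ExecState

  // Assumption: EXITED and FAILED mean the executor is gone.
  def isFinished(state: ExecState): Boolean = state match {
    case EXITED | FAILED => true
    case LOADING | RUNNING => false
  }

  // Describes what a master might do on receiving ExecutorStateChanged.
  def handle(execId: Int, state: ExecState): String =
    if (isFinished(state)) s"remove executor $execId and reschedule"
    else s"record executor $execId as $state"
}
```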
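The startup of the CoarseGrainedExecutorBackend process, that is, the startup of the Executor, is covered in the next section. The actual tasks run inside Executors, and assigned tasks can only run once the Executor process has started normally. For now we continue with the logic that follows _taskScheduler.start().
10. What follows is mainly the initialization of blockManager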
_applicationId = _taskScheduler.applicationId()
_applicationAttemptId = taskScheduler.applicationAttemptId()
_conf.set("spark.app.id", _applicationId)
_env.blockManager.initialize(_applicationId)
def initialize(appId: String): Unit = {
    blockTransferService.init(this) // service for reading blocks
    shuffleClient.init(appId) // related to the shuffle service; does nothing here if that option is off

    blockManagerId = BlockManagerId(
      executorId, blockTransferService.hostName, blockTransferService.port) // metadata for this blockManager

    shuffleServerId = if (externalShuffleServiceEnabled) { // related to the shuffle service; not covered for now
      BlockManagerId(executorId, blockTransferService.hostName, externalShuffleServicePort)
    } else {
      blockManagerId
    }
    // Register with the driver, passing along our own ActorRef. On receipt, the driver
    // stores the blockManagerId together with that ActorRef in a hash map.
    master.registerBlockManager(blockManagerId, maxMemory, slaveEndpoint)

    // Register Executors' configuration with the local shuffle service, if one should exist.
    if (externalShuffleServiceEnabled && !blockManagerId.isDriver) {
      registerWithExternalShuffleServer()
    }
  }
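The registration step near the end of initialize can be pictured as a driver-side lookup table. The sketch below is a toy model: ToyBlockManagerId and ToyBlockManagerMaster are hypothetical names, the endpoint is represented by a plain string rather than an ActorRef, and the host, port and memory values in any usage are made up.

```scala
import scala.collection.mutable

object BlockManagerRegistrySketch {
  // Simplified stand-in for BlockManagerId (executor id, host, port).
  case class ToyBlockManagerId(executorId: String, host: String, port: Int)

  // Hypothetical driver-side registry: id -> (max memory in bytes, endpoint name).
  class ToyBlockManagerMaster {
    private val registry = mutable.HashMap.empty[ToyBlockManagerId, (Long, String)]

    // Stores the block manager's id together with how to reach it.
    def registerBlockManager(id: ToyBlockManagerId, maxMemory: Long, endpoint: String): Unit =
      registry.update(id, (maxMemory, endpoint))

    // Looks up the endpoint registered for a given block manager, if any.
    def endpointFor(id: ToyBlockManagerId): Option[String] =
      registry.get(id).map(_._2)

    def size: Int = registry.size
  }
}
```

Keeping the endpoint next to the id is what lets the driver later contact a specific block manager when it needs to.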

