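Spark Core Source Code Analysis 6: Submitting a Spark Job

Blog address: http://blog.csdn.net/yueqian_zhu/

This section mainly covers the logic inside SparkContext.

Let's start with one of the simplest examples that ships with Spark: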

import scala.math.random
import org.apache.spark._

object SparkPi {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Spark Pi")
    val spark = new SparkContext(conf)
    val slices = if (args.length > 0) args(0).toInt else 2
    val n = math.min(100000L * slices, Int.MaxValue).toInt // avoid overflow
    val count = spark.parallelize(1 until n, slices).map { i =>
      val x = random * 2 - 1
      val y = random * 2 - 1
      if (x*x + y*y < 1) 1 else 0
    }.reduce(_ + _)
    println("Pi is roughly " + 4.0 * count / n)
    spark.stop()
  }
}
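To see what the job computes independently of Spark, here is a minimal local sketch of the same Monte Carlo estimate. The object name LocalPi, the estimate method, and the fixed seed are illustrative assumptions and are not part of the Spark example; no cluster or SparkContext is involved.

```scala
import scala.util.Random

// Hypothetical local counterpart of SparkPi: same sampling idea, no Spark involved.
object LocalPi {
  // Samples n points uniformly in the square [-1, 1] x [-1, 1] and counts how many
  // fall inside the unit circle; the hit ratio approximates Pi / 4.
  def estimate(n: Int, seed: Long): Double = {
    val rng = new Random(seed)
    val count = (1 to n).iterator.map { _ =>
      val x = rng.nextDouble() * 2 - 1
      val y = rng.nextDouble() * 2 - 1
      if (x * x + y * y < 1) 1 else 0
    }.sum
    4.0 * count / n
  }

  def main(args: Array[String]): Unit =
    println("Pi is roughly " + estimate(200000, seed = 42L))
}
```

In the Spark version, the map over the range and the final reduce are what get split into tasks across the slices; the arithmetic per sample is the same.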

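The Spark programs we normally write follow much the same flow. Starting from this simple program, we will work through the internals step by step. Personally I think this is the real essence of Spark; the master and worker startup flows covered earlier are not much different from those of a typical distributed system.

First a SparkConf is created, which loads Spark's configuration.

Then the SparkContext is created. preferredNodeLocationData may optionally be specified at this point.

Creating a SparkContext is fairly involved, so we only cover the more important objects and methods.

1. Various SparkListener instances can be added to listenerBus. Whenever a SparkListenerEvent arrives, it is sent to every registered listener.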

// An asynchronous listener bus for Spark events
private[spark] val listenerBus = new LiveListenerBus
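The broadcast behaviour described in item 1 can be sketched as follows. This is a simplified, synchronous illustration only: the Event, Listener, and SimpleListenerBus names are hypothetical stand-ins, and the real LiveListenerBus is asynchronous (events are queued and delivered on a separate thread).

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical event and listener types, standing in for SparkListenerEvent / SparkListener.
sealed trait Event
final case class JobStarted(jobId: Int) extends Event
final case class JobEnded(jobId: Int) extends Event

trait Listener {
  def onEvent(event: Event): Unit
}

// Simplified bus: every posted event is delivered, in order, to every registered listener.
class SimpleListenerBus {
  private val listeners = ArrayBuffer.empty[Listener]

  def addListener(listener: Listener): Unit = synchronized { listeners += listener }

  def post(event: Event): Unit = {
    // Take a snapshot so listeners can be added while an event is being delivered.
    val snapshot = synchronized { listeners.toList }
    snapshot.foreach(_.onEvent(event))
  }
}

object ListenerBusDemo {
  // Registers two listeners, posts two events, and returns what each listener recorded.
  def run(): (List[Event], List[Event]) = {
    val bus = new SimpleListenerBus
    val seenA = ArrayBuffer.empty[Event]
    val seenB = ArrayBuffer.empty[Event]
    bus.addListener(new Listener { def onEvent(e: Event): Unit = { seenA += e } })
    bus.addListener(new Listener { def onEvent(e: Event): Unit = { seenB += e } })
    bus.post(JobStarted(0))
    bus.post(JobEnded(0))
    (seenA.toList, seenB.toList)
  }

  def main(args: Array[String]): Unit = {
    val (a, b) = run()
    println(s"listener A saw $a")
    println(s"listener B saw $b")
  }
}
```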

2. persistentRdds keeps track of all RDDs that have been persisted (cached).

// Keeps track of all persisted RDDs
private[spark] val persistentRdds = new TimeStampedWeakValueHashMap[Int, RDD[_]]

3. Create SparkEnv -> calls createDriverEnv

// Create the Spark execution environment (cache, map output tracker, etc)
    _env = createSparkEnv(_conf, isLocal, listenerBus)

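Flow:

    1) Create the driver's ActorRef and wrap it in rpcEnv.

    2) Create mapOutputTracker, whose concrete type is MapOutputTrackerMaster and which tracks map output information, and register it with MapOutputTrackerMasterEndpoint. A note on what registration does: it returns mapOutputTracker.trackerEndpoint (of type ActorRef), and messages later sent to that ActorRef call back into the corresponding methods of mapOutputTracker. For example, sending an AkkaMessage invokes the receiveAndReply or receive method of MapOutputTrackerMasterEndpoint.

    3) Create shuffleManager; the default is org.apache.spark.shuffle.hash.HashShuffleManager.

    4) Create shuffleMemoryManager.

    5) Create blockTransferService; the default is netty. This is the service used to fetch blocks during a shuffle.

    6) Create blockManagerMaster, which records which worker every BlockId is stored on.

    7) Create blockManager, which provides the actual read and write interface.

    8) Create cacheManager, which depends on blockManager. When an RDD is computed, it obtains data through CacheManager and stores the computed result through CacheManager.

    9) Create broadcastManager.

   10) Create httpFileServer. Both the driver and the executors may depend on third-party packages at run time. The driver side is simple: spark-submit specifies at submission time where the required jar files are read from. Executors are started by workers, and a worker needs to download the jar files the executor requires at startup. To solve this executor dependency problem, the driver starts an HttpFileServer at startup to hold the third-party jars, and workers then fetch them from the HttpFileServer.

   11) Create outputCommitCoordinator.

   12) Create executorMemoryManager.

   The objects above are then wrapped together into a SparkEnv.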

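4. Create _metadataCleaner, which periodically cleans up metadata.

5. Create executorEnvs, which holds executor-related settings.

6. _heartbeatReceiver receives heartbeats from executors. It also starts a timer that checks whether any executor has expired.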
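The heartbeat-plus-expiry idea in item 6 can be sketched as below. This is a minimal illustration under assumptions: HeartbeatTracker, its method names, and the timeout value are hypothetical and do not reproduce HeartbeatReceiver's code; time is passed in explicitly so the behaviour is deterministic.

```scala
import scala.collection.mutable

// Hypothetical tracker: remembers the last heartbeat time of each executor and
// removes executors whose last heartbeat is older than timeoutMs.
class HeartbeatTracker(timeoutMs: Long) {
  private val lastSeen = mutable.Map.empty[String, Long]

  // Called whenever a heartbeat from executorId arrives.
  def heartbeat(executorId: String, nowMs: Long): Unit = {
    lastSeen.update(executorId, nowMs)
  }

  // Meant to be called by a periodic timer; returns the executors removed as expired.
  def expireDeadHosts(nowMs: Long): Set[String] = {
    val expired = lastSeen.collect { case (id, t) if nowMs - t > timeoutMs => id }.toSet
    expired.foreach(id => lastSeen.remove(id))
    expired
  }

  def alive: Set[String] = lastSeen.keySet.toSet
}

object HeartbeatDemo {
  def main(args: Array[String]): Unit = {
    val tracker = new HeartbeatTracker(timeoutMs = 100L)
    tracker.heartbeat("exec-1", nowMs = 0L)
    tracker.heartbeat("exec-2", nowMs = 80L)
    // At t = 150, exec-1 has been silent for 150 ms (expired); exec-2 for 70 ms (alive).
    println("expired: " + tracker.expireDeadHosts(nowMs = 150L))
    println("alive:   " + tracker.alive)
  }
}
```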

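7. Call createTaskScheduler to create _taskScheduler and _schedulerBackend.

  1) The logic branches on the master URL. We use standalone mode (URLs starting with spark://) as the example.

  2) The task scheduler actually created is TaskSchedulerImpl and the backend is SparkDeploySchedulerBackend, which itself extends CoarseGrainedSchedulerBackend. CoarseGrainedSchedulerBackend is a coarse-grained resource scheduling class built on Akka actors. For the whole lifetime of a Spark job it tracks and holds the executor resources registered with it, accepts executor registrations and status updates, responds to scheduler requests, and starts the task scheduling flow based on the executors currently available. In short, the two cooperate, each with its own responsibilities, to carry out the whole task scheduling flow.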

case SPARK_REGEX(sparkUrl) =>
        val scheduler = new TaskSchedulerImpl(sc) // task-level scheduling
        val masterUrls = sparkUrl.split(",").map("spark://" + _)
        val backend = new SparkDeploySchedulerBackend(scheduler, sc, masterUrls)
        scheduler.initialize(backend)
        (backend, scheduler)
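To make the URL handling in that branch concrete, the sketch below shows how a comma-separated standalone master string is turned into one URL per master. The SparkRegex pattern is an assumption about the shape of SPARK_REGEX (it is not shown in this article), and the host names are made up.

```scala
object MasterUrlDemo {
  // Assumed shape of the pattern; defined here so the sketch is self-contained.
  val SparkRegex = """spark://(.*)""".r

  // Returns one "spark://host:port" URL per master, or an empty Seq for other URL schemes.
  def masterUrls(master: String): Seq[String] = master match {
    case SparkRegex(sparkUrl) => sparkUrl.split(",").map("spark://" + _).toSeq
    case _ => Seq.empty
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical pair of masters (for example one active and one standby).
    masterUrls("spark://host1:7077,host2:7077").foreach(println)
  }
}
```

Listing several masters this way is what lets the application later try to register with all of them.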

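  3) Initializing the scheduler

     A note on the role of Pool: a SparkContext may have several runnable task sets at the same time that do not depend on each other. How these task sets are scheduled relative to one another is decided by the pool. The default is FIFO; a Fair scheduler is also available.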

def initialize(backend: SchedulerBackend) {
  this.backend = backend
  // temporarily set rootPool name to empty
  rootPool = new Pool("", schedulingMode, 0, 0)
  schedulableBuilder = {
    schedulingMode match {
      case SchedulingMode.FIFO =>
        new FIFOSchedulableBuilder(rootPool)
      case SchedulingMode.FAIR =>
        new FairSchedulableBuilder(rootPool, conf)
    }
  }
  schedulableBuilder.buildPools()
}
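As a rough illustration of what FIFO ordering between task sets means, the sketch below sorts task sets by job priority and then by stage id. This is an assumption-labeled model: TaskSetInfo and fifoOrder are hypothetical names, and treating a lower priority value as "submitted earlier" reflects my understanding of the FIFO policy rather than code quoted in this article.

```scala
// Hypothetical description of a runnable task set.
final case class TaskSetInfo(name: String, priority: Int, stageId: Int)

object FifoOrderingDemo {
  // Lower priority value (earlier job) runs first; ties are broken by lower stage id.
  def fifoOrder(taskSets: Seq[TaskSetInfo]): Seq[TaskSetInfo] =
    taskSets.sortBy(ts => (ts.priority, ts.stageId))

  def main(args: Array[String]): Unit = {
    val pending = Seq(
      TaskSetInfo("job1-stage3", priority = 1, stageId = 3),
      TaskSetInfo("job0-stage2", priority = 0, stageId = 2),
      TaskSetInfo("job0-stage1", priority = 0, stageId = 1)
    )
    fifoOrder(pending).foreach(ts => println(ts.name))
  }
}
```

A fair policy would instead weigh how many tasks each pool is already running, which is why it needs the extra configuration passed to FairSchedulableBuilder.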

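8. Create _dagScheduler. It splits our program into stages and builds task sets with dependencies between them. Internally the DAGScheduler starts an event loop that processes incoming events one by one.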
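An event loop of this kind can be sketched with a blocking queue and a single consumer thread. The names below (SimpleEventLoop, SchedulerEvent, and the event case classes) are hypothetical and only illustrate the pattern; they are not the DAGScheduler's own classes.

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.collection.mutable.ArrayBuffer

// Hypothetical events; Stop is a sentinel that ends the loop.
sealed trait SchedulerEvent
final case class JobSubmitted(jobId: Int) extends SchedulerEvent
final case class StageCompleted(stageId: Int) extends SchedulerEvent
case object Stop extends SchedulerEvent

// One daemon thread takes events off the queue and handles them in arrival order.
class SimpleEventLoop(handle: SchedulerEvent => Unit) {
  private val queue = new LinkedBlockingQueue[SchedulerEvent]()
  private val thread = new Thread("event-loop") {
    override def run(): Unit = {
      var running = true
      while (running) {
        queue.take() match { // blocks until an event is available
          case Stop => running = false
          case event => handle(event)
        }
      }
    }
  }
  thread.setDaemon(true)

  def start(): Unit = thread.start()
  def post(event: SchedulerEvent): Unit = queue.put(event)
  def stopAndWait(): Unit = { post(Stop); thread.join() }
}

object EventLoopDemo {
  // Posts two events and returns them in the order the loop handled them.
  def run(): List[SchedulerEvent] = {
    val handled = ArrayBuffer.empty[SchedulerEvent]
    val loop = new SimpleEventLoop(e => handled.synchronized { handled += e })
    loop.start()
    loop.post(JobSubmitted(0))
    loop.post(StageCompleted(0))
    loop.stopAndWait()
    handled.synchronized { handled.toList }
  }

  def main(args: Array[String]): Unit = println(run())
}
```

Funnelling all events through one thread means the handler does not need its own locking for scheduler state, at the cost of handling events strictly one at a time.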

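9. Call _taskScheduler.start() -> backend.start(). This creates driverEndpoint, used for communicating with the outside, and builds the environment needed to run executors: the app name, the cores and memory required per executor, the classpath, jars, and arguments. The class to run is set to org.apache.spark.executor.CoarseGrainedExecutorBackend, and all of this is packaged as an ApplicationDescription. The ApplicationDescription, the masters, and so on are then wrapped in an AppClient, which is the entry point for submitting the app to the masters.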

override def start() {
  super.start()
  // ... (omitted)
  val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend",
    args, sc.executorEnvs, classPathEntries ++ testingClassPath, libraryPathEntries, javaOpts)
  val appUIAddress = sc.ui.map(_.appUIAddress).getOrElse("")
  val coresPerExecutor = conf.getOption("spark.executor.cores").map(_.toInt)
  val appDesc = new ApplicationDescription(sc.appName, maxCores, sc.executorMemory,
    command, appUIAddress, sc.eventLogDir, sc.eventLogCodec, coresPerExecutor)
  client = new AppClient(sc.env.actorSystem, masters, appDesc, this, conf)
  client.start()
  waitForRegistration()
}

Looking inside client.start(), it creates an ActorRef backed by a ClientActor. From there, follow preStart() -> registerWithMaster:

def tryRegisterAllMasters() {
      for (masterAkkaUrl <- masterAkkaUrls) {
        logInfo("Connecting to master " + masterAkkaUrl + "...")
        val actor = context.actorSelection(masterAkkaUrl)
        actor ! RegisterApplication(appDescription)
      }
    }

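As you can see, all it does is send a RegisterApplication message to the actor of each master.

Next, how does the master handle this message?

When the active master receives it, it saves the app's details, creates an appId, persists the app, replies with a RegisteredApplication message, and then runs scheduling. The scheduling flow was covered in "Spark Core Source Code Analysis 2: Master Startup Flow".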

case RegisterApplication(description) => {
      if (state == RecoveryState.STANDBY) {
        // ignore, don't send response
      } else {
        logInfo("Registering app " + description.name)
        val app = createApplication(description, sender)
        registerApplication(app) // save the app's details in the master's in-memory data structures
        logInfo("Registered app " + description.name + " with ID " + app.id)
        persistenceEngine.addApplication(app) // persist the app so it can be rebuilt on master failover
        sender ! RegisteredApplication(app.id, masterUrl)
        schedule() // run scheduling
      }
    }

When AppClient receives the RegisteredApplication message, it settles on the active master, marks the app as registered, and stores the appId returned by the master.

case RegisteredApplication(appId_, masterUrl) =>
  appId = appId_
  registered = true
  changeMaster(masterUrl)
  listener.connected(appId)
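The registration handshake above (send to every master, only the active one answers) can be modeled without actors as follows. Everything here is a simplified assumption for illustration: ToyMaster, AppDescription, Registered, and the app id format are hypothetical, and plain method calls stand in for the Akka messages.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for ApplicationDescription and RegisteredApplication.
final case class AppDescription(name: String)
final case class Registered(appId: String, masterUrl: String)

class ToyMaster(val url: String, val standby: Boolean) {
  private val apps = mutable.LinkedHashMap.empty[String, AppDescription]
  private var nextId = 0

  // A standby master ignores the request; the active master stores the app and replies.
  def register(desc: AppDescription): Option[Registered] =
    if (standby) None
    else {
      val appId = f"app-$nextId%04d" // made-up id format
      nextId += 1
      apps.update(appId, desc)
      Some(Registered(appId, url))
    }

  def registeredApps: Map[String, AppDescription] = apps.toMap
}

object RegistrationDemo {
  // Contact every master and keep the reply that comes back from the active one.
  def registerWithAll(masters: Seq[ToyMaster], desc: AppDescription): Option[Registered] =
    masters.flatMap(m => m.register(desc).toList).headOption

  def main(args: Array[String]): Unit = {
    val masters = Seq(
      new ToyMaster("spark://host1:7077", standby = true),
      new ToyMaster("spark://host2:7077", standby = false)
    )
    println(registerWithAll(masters, AppDescription("Spark Pi")))
  }
}
```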

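In "Spark Core Source Code Analysis 2: Master Startup Flow" we covered the master side of scheduling. At that point no app had registered yet, so no command to launch an executor was sent to any worker. Now that an app is registered, the master calls launchExecutor(worker, exec), which sends a LaunchExecutor message to the worker. It also sends an ExecutorAdded message to the AppClient.

On receiving the message, the worker creates the working directory and an ExecutorRunner. Once started, the ExecutorRunner handles the work on a separate thread and launches a process from the command packaged earlier. The main class is CoarseGrainedExecutorBackend; its run-time arguments and related information are all contained in appDesc, passed from the driver through the master. When this is done, the worker reports back to the master with an ExecutorStateChanged message.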

case LaunchExecutor(masterUrl, appId, execId, appDesc, cores_, memory_) =>
  if (masterUrl != activeMasterUrl) {
    logWarning("Invalid Master (" + masterUrl + ") attempted to launch executor.")
  } else {
    try {
      logInfo("Asked to launch executor %s/%d for %s".format(appId, execId, appDesc.name))

      // Create the executor's working directory
      val executorDir = new File(workDir, appId + "/" + execId)
      if (!executorDir.mkdirs()) {
        throw new IOException("Failed to create directory " + executorDir)
      }

      // Create local dirs for the executor. These are passed to the executor via the
      // SPARK_EXECUTOR_DIRS environment variable, and deleted by the Worker when the
      // application finishes.
      val appLocalDirs = appDirectories.get(appId).getOrElse {
        Utils.getOrCreateLocalRootDirs(conf).map { dir =>
          Utils.createDirectory(dir, namePrefix = "executor").getAbsolutePath()
        }.toSeq
      }
      appDirectories(appId) = appLocalDirs
      val manager = new ExecutorRunner(
        appId,
        execId,
        appDesc.copy(command = Worker.maybeUpdateSSLSettings(appDesc.command, conf)),
        cores_,
        memory_,
        self,
        workerId,
        host,
        webUi.boundPort,
        publicAddress,
        sparkHome,
        executorDir,
        akkaUrl,
        conf,
        appLocalDirs, ExecutorState.LOADING)
      executors(appId + "/" + execId) = manager
      manager.start()
      coresUsed += cores_
      memoryUsed += memory_
      master ! ExecutorStateChanged(appId, execId, manager.state, None, None)
    } catch {
      case e: Exception => {
        logError(s"Failed to launch executor $appId/$execId for ${appDesc.name}.", e)
        if (executors.contains(appId + "/" + execId)) {
          executors(appId + "/" + execId).kill()
          executors -= appId + "/" + execId
        }
        master ! ExecutorStateChanged(appId, execId, ExecutorState.FAILED,
          Some(e.toString), None)
      }
    }
  }

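When the master receives the message, it acts according to the executor's state. So when are these messages sent?

  (1) When the CoarseGrainedExecutorBackend process exits, ExecutorStateChanged is sent to the master with state EXITED.

  (2) When AppClient receives the ExecutorAdded message, it sends ExecutorStateChanged to the master with state RUNNING.

  (3) When ExecutorRunner fails to start the process, ExecutorStateChanged is sent to the master with state FAILED.

The startup of the CoarseGrainedExecutorBackend process, that is, the startup of the executor, is covered in the next section. The actual tasks run inside executors, and assigned tasks can only run after the executor process has started successfully. For now we continue with the logic after _taskScheduler.start().

10. What follows is mainly the initialization of blockManager.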

_applicationId = _taskScheduler.applicationId()
_applicationAttemptId = taskScheduler.applicationAttemptId()
_conf.set("spark.app.id", _applicationId)
_env.blockManager.initialize(_applicationId)

def initialize(appId: String): Unit = {
    blockTransferService.init(this) // service for fetching blocks
    shuffleClient.init(appId) // related to the shuffle service; does nothing if that option is off

    blockManagerId = BlockManagerId(
      executorId, blockTransferService.hostName, blockTransferService.port) // BlockManager metadata

    shuffleServerId = if (externalShuffleServiceEnabled) { // related to the shuffle service; not covered for now
      BlockManagerId(executorId, blockTransferService.hostName, externalShuffleServicePort)
    } else {
      blockManagerId
    }
    // Register with the driver, passing along our own ActorRef. The driver stores
    // blockManagerId and that ActorRef in a hash map.
    master.registerBlockManager(blockManagerId, maxMemory, slaveEndpoint)
    master.registerBlockManager(blockManagerId, maxMemory, slaveEndpoint)

    // Register Executors' configuration with the local shuffle service, if one should exist.
    if (externalShuffleServiceEnabled && !blockManagerId.isDriver) {
      registerWithExternalShuffleServer()
    }
  }
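Two details of initialize can be sketched in isolation: how shuffleServerId is chosen, and how the driver side might keep registered block managers in a hash map. The types below (ToyBlockManagerId, ToyBlockManagerMaster) and the endpoint-as-string simplification are hypothetical, and the port numbers are example values only.

```scala
import scala.collection.mutable

// Hypothetical stand-in for BlockManagerId.
final case class ToyBlockManagerId(executorId: String, host: String, port: Int)

object ShuffleServerIdDemo {
  // With the external shuffle service enabled, shuffle blocks are served from the same
  // host on the service's port; otherwise the block manager serves them itself.
  def shuffleServerId(own: ToyBlockManagerId, externalEnabled: Boolean,
      externalPort: Int): ToyBlockManagerId =
    if (externalEnabled) own.copy(port = externalPort) else own
}

// Hypothetical driver-side registry: block manager id -> name of its endpoint.
class ToyBlockManagerMaster {
  private val registered = mutable.HashMap.empty[ToyBlockManagerId, String]

  def registerBlockManager(id: ToyBlockManagerId, endpointName: String): Unit = {
    registered.update(id, endpointName)
  }

  def endpointFor(id: ToyBlockManagerId): Option[String] = registered.get(id)
}

object BlockManagerInitDemo {
  def main(args: Array[String]): Unit = {
    val own = ToyBlockManagerId("exec-1", "worker-a", 40001) // example values
    println(ShuffleServerIdDemo.shuffleServerId(own, externalEnabled = true, externalPort = 7337))
    val master = new ToyBlockManagerMaster
    master.registerBlockManager(own, "endpoint-exec-1")
    println(master.endpointFor(own))
  }
}
```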
