SparkCore - Executor Registration

Executor Registration

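  The registration flow is as follows. Once the CoarseGrainedExecutorBackend process starts, it immediately sends a RegisterExecutor message to the Driver. After the Driver registers the executor successfully, it replies with a RegisteredExecutor message. The backend then creates an Executor, the handle that manages task launches; each Task is wrapped in a TaskRunner (a Runnable) and submitted to a thread pool to run.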
  Let's go straight to the source code:

override def onStart() {
    logInfo("Connecting to driver: " + driverUrl)
    rpcEnv.asyncSetupEndpointRefByURI(driverUrl).flatMap { ref =>
      // This is a very fast action so we can use "ThreadUtils.sameThread"
      driver = Some(ref)
      // Register this executor with the Driver
      ref.ask[RegisterExecutorResponse](
        RegisterExecutor(executorId, self, hostPort, cores, extractLogUrls))
    }(ThreadUtils.sameThread).onComplete {
      // This is a very fast action so we can use "ThreadUtils.sameThread"
      // code omitted ....
    }(ThreadUtils.sameThread)
  }
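
  The ask-then-forward pattern in onStart can be illustrated with plain Scala Futures. This is only a minimal sketch, not Spark's code: ToyDriver, register, demo, and the simplified message classes are hypothetical stand-ins, and it runs on the global ExecutionContext rather than ThreadUtils.sameThread. The duplicate-ID check is an assumption made so that the failure branch has something to show.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success}

object RegistrationSketch {
  // Hypothetical, simplified stand-ins for the registration messages.
  sealed trait RegisterExecutorResponse
  final case class RegisteredExecutor(hostname: String) extends RegisterExecutorResponse
  final case class RegisterExecutorFailed(message: String) extends RegisterExecutorResponse
  final case class RegisterExecutor(executorId: String, hostPort: String, cores: Int)

  // Toy driver: stores each executor's registration once and answers asynchronously.
  class ToyDriver {
    private val registered =
      scala.collection.concurrent.TrieMap.empty[String, RegisterExecutor]

    def ask(msg: RegisterExecutor): Future[RegisterExecutorResponse] = Future {
      registered.putIfAbsent(msg.executorId, msg) match {
        case None    => RegisteredExecutor(msg.hostPort.takeWhile(_ != ':'))
        case Some(_) => RegisterExecutorFailed("Duplicate executor ID: " + msg.executorId)
      }
    }
  }

  // Toy backend side: resolve the driver, ask it to register us,
  // then pass whatever reply arrives to a handler (the role played by receive).
  def register(lookupDriver: () => Future[ToyDriver], msg: RegisterExecutor)(
      onReply: RegisterExecutorResponse => Unit): Future[RegisterExecutorResponse] = {
    val reply = lookupDriver().flatMap(driver => driver.ask(msg))
    reply.onComplete {
      case Success(response) => onReply(response)
      case Failure(e)        => println(s"Cannot register with driver: ${e.getMessage}")
    }
    reply
  }

  // Registers the same executor ID twice and returns both replies.
  def demo(): (RegisterExecutorResponse, RegisterExecutorResponse) = {
    val driver = new ToyDriver
    val msg = RegisterExecutor("exec-1", "host-a:7078", cores = 2)
    val lookup = () => Future.successful(driver)
    val first = Await.result(register(lookup, msg)(r => println(s"reply: $r")), 5.seconds)
    val second = Await.result(register(lookup, msg)(r => println(s"reply: $r")), 5.seconds)
    (first, second)
  }
}
```

  In this sketch the first registration succeeds and the second, which reuses the same executor ID, gets a failure reply. The point is only the shape of the exchange: one request, one typed response, delivered asynchronously.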

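  After the executor backend starts, it first registers with the Driver by sending a RegisterExecutor message. When the Driver receives the message, it saves the executor's registration information and then sends a RegisteredExecutor message back to the executor. The backend handles that reply in its receive method, as shown below: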

 override def receive: PartialFunction[Any, Unit] = {
    // After the Driver registers the executor successfully, it sends back a RegisteredExecutor message
    case RegisteredExecutor(hostname) =>
      logInfo("Successfully registered with driver")
      // Create the Executor object that serves as the execution handle; most of the backend's work is done through it
      executor = new Executor(executorId, hostname, env, userClassPath, isLocal = false)

    case RegisterExecutorFailed(message) =>
      logError("Slave registration failed: " + message)
      System.exit(1)

    // Launch a task
    case LaunchTask(data) =>
      if (executor == null) {
        logError("Received LaunchTask command but executor was null")
        System.exit(1)
      } else {
        // Deserialize the task description
        val taskDesc = ser.deserialize[TaskDescription](data.value)
        logInfo("Got assigned task " + taskDesc.taskId)
        // Call launchTask on the executor handle to run the task
        executor.launchTask(this, taskId = taskDesc.taskId, attemptNumber = taskDesc.attemptNumber,
          taskDesc.name, taskDesc.serializedTask)
      }
    // code omitted .................
  }

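  After receiving the RegisteredExecutor message from the Driver, the backend creates an Executor handle. Later, when it receives a LaunchTask message sent by the TaskScheduler inside the Driver, it is responsible for starting the task: it first deserializes the task and then calls launchTask to start it. Let's look at the launchTask() method.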

def launchTask(
      context: ExecutorBackend,
      taskId: Long,
      attemptNumber: Int,
      taskName: String,
      serializedTask: ByteBuffer): Unit = {
    // Create one TaskRunner per task; TaskRunner implements Java's Runnable interface
    val tr = new TaskRunner(context, taskId = taskId, attemptNumber = attemptNumber, taskName,
      serializedTask)
    // Record the TaskRunner in the in-memory map of running tasks
    runningTasks.put(taskId, tr)
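    // Hand the TaskRunner to the thread pool, which runs it on a worker thread.
    // Note: Spark creates this pool with ThreadUtils.newDaemonCachedThreadPool, so when no
    // idle thread is available a new one is started rather than the task being queued.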
    threadPool.execute(tr)
  }

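  The most important step in launchTask is creating the TaskRunner. It is a Runnable that wraps the task's information; it is added to the runningTasks map and then submitted to the thread pool to run.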
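
  The same three steps (wrap the task in a Runnable, record it in a map, submit it to a pool) can be tried out in isolation. The following is a minimal sketch using only java.util.concurrent, not Spark's implementation: ToyTaskRunner, ToyExecutor, and demo are hypothetical names, the task body is a plain closure instead of a serialized task, and removing the map entry in a finally block is an assumption about how the bookkeeping could be cleaned up.

```scala
import java.util.concurrent.{ConcurrentHashMap, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

object LaunchTaskSketch {
  // Hypothetical, simplified runner: wraps a task body in a Runnable.
  class ToyTaskRunner(
      val taskId: Long,
      body: () => Unit,
      running: ConcurrentHashMap[Long, ToyTaskRunner]) extends Runnable {
    override def run(): Unit =
      try body()
      finally running.remove(taskId) // drop the bookkeeping entry once the task ends
  }

  class ToyExecutor {
    // Tasks that have been launched and have not finished yet, keyed by task ID.
    val runningTasks = new ConcurrentHashMap[Long, ToyTaskRunner]()
    // A cached pool starts a new thread whenever no idle thread is available.
    private val threadPool = Executors.newCachedThreadPool()

    def launchTask(taskId: Long)(body: => Unit): Unit = {
      val tr = new ToyTaskRunner(taskId, () => body, runningTasks)
      runningTasks.put(taskId, tr) // record the runner first
      threadPool.execute(tr)       // then hand it to the pool
    }

    // Stop accepting tasks and wait for the launched ones to finish.
    def shutdown(): Boolean = {
      threadPool.shutdown()
      threadPool.awaitTermination(5, TimeUnit.SECONDS)
    }
  }

  // Launches three tasks; returns (tasks completed, entries left in runningTasks).
  def demo(): (Int, Int) = {
    val executor = new ToyExecutor
    val completed = new AtomicInteger(0)
    (1L to 3L).foreach { id =>
      executor.launchTask(id) { completed.incrementAndGet(); () }
    }
    executor.shutdown()
    (completed.get(), executor.runningTasks.size())
  }
}
```

  Keeping the runner in a map keyed by task ID means a later message that names a task can find its runner; the sketch leaves that part out and only shows the launch path.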
