Spark-executor
@(spark)[executor]
ExecutorExitCode
/**
* These are exit codes that executors should use to provide the master with information about
* executor failures assuming that cluster management framework can capture the exit codes (but
* perhaps not log files). The exit code constants here are chosen to be unlikely to conflict
* with "natural" exit statuses that may be caused by the JVM or user code. In particular,
* exit codes 128+ arise on some Unix-likes as a result of signals, and it appears that the
* OpenJDK JVM may use exit code 1 in some of its own "last chance" code.
*/
private[spark]
object ExecutorExitCode {
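  // (Body recalled from the Spark 1.x source, so treat exact names/values as approximate.)
  // Values start at 50 to stay clear of signal-related exit codes (128+) and the JVM's own exit code 1.
  val UNCAUGHT_EXCEPTION = 50               // the default uncaught exception handler was reached
  val UNCAUGHT_EXCEPTION_TWICE = 51         // the handler itself failed while logging the exception
  val OOM = 52                              // the uncaught exception was an OutOfMemoryError
  val DISK_STORE_FAILED_TO_CREATE_DIR = 53  // DiskStore could not create a local temporary directory
  // ... plus a def explainExitCode(exitCode: Int): String that maps a code back to a message
}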
ExecutorSource
This class is essentially just a collection of metrics.
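Concretely, ExecutorSource registers Dropwizard/Codahale gauges over things like the executor's task-launch thread pool and file-system usage. A minimal sketch of that pattern (the metric names here are representative examples, not the full list in ExecutorSource):

import java.util.concurrent.ThreadPoolExecutor
import com.codahale.metrics.{Gauge, MetricRegistry}

// Sketch: gauges that read live values off the executor's task-launch thread pool.
class ExecutorMetricsSketch(threadPool: ThreadPoolExecutor) {
  val metricRegistry = new MetricRegistry()

  metricRegistry.register(MetricRegistry.name("threadpool", "activeTasks"),
    new Gauge[Int] { override def getValue: Int = threadPool.getActiveCount })

  metricRegistry.register(MetricRegistry.name("threadpool", "completeTasks"),
    new Gauge[Long] { override def getValue: Long = threadPool.getCompletedTaskCount })
}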
CoarseGrainedExecutorBackend
The class CoarseGrainedExecutorBackend is actually an Actor, and it has a main function that does the following (a simplified sketch follows the list):
1. Start an actor system called fetcher and fetch the SparkConf from the driver
2. Shut fetcher down
3. Call createExecutorEnv, i.e. build the SparkEnv
4. Start the CoarseGrainedExecutorBackend actor, which
- registers itself with the driver
- waits for incoming messages
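A compressed sketch of that ordering, with every Spark internal replaced by a hypothetical stub (none of the helper names below are real Spark APIs; only the sequence matters):

// Illustrative only: hypothetical stand-ins showing the order of operations above.
object BackendStartupSketch {
  def main(args: Array[String]): Unit = {
    val fetcher = startActorSystem("driverPropsFetcher")  // 1. short-lived actor system
    val driverConf = fetchSparkConfFromDriver(fetcher)    // 1. ask the driver for its SparkConf
    shutdown(fetcher)                                      // 2. fetcher has served its purpose
    val env = createExecutorEnv(driverConf)                // 3. build the executor-side SparkEnv
    startBackendActor(env)                                 // 4. register with the driver, then wait
  }

  // Stubs so the sketch compiles; the real logic lives inside Spark.
  private def startActorSystem(name: String): AnyRef = new Object
  private def fetchSparkConfFromDriver(sys: AnyRef): Map[String, String] = Map.empty
  private def shutdown(sys: AnyRef): Unit = ()
  private def createExecutorEnv(conf: Map[String, String]): AnyRef = new Object
  private def startBackendActor(env: AnyRef): Unit = ()
}

The receive handler that the actor then runs is: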
override def receiveWithLogging = {
  case RegisteredExecutor =>
    logInfo("Successfully registered with driver")
    val (hostname, _) = Utils.parseHostPort(hostPort)
    executor = new Executor(executorId, hostname, env, userClassPath, isLocal = false)

  case RegisterExecutorFailed(message) =>
    logError("Slave registration failed: " + message)
    System.exit(1)

  case LaunchTask(data) =>
    if (executor == null) {
      logError("Received LaunchTask command but executor was null")
      System.exit(1)
    } else {
      val ser = env.closureSerializer.newInstance()
      val taskDesc = ser.deserialize[TaskDescription](data.value)
      logInfo("Got assigned task " + taskDesc.taskId)
      executor.launchTask(this, taskId = taskDesc.taskId, attemptNumber = taskDesc.attemptNumber,
        taskDesc.name, taskDesc.serializedTask)
    }

  case KillTask(taskId, _, interruptThread) =>
    if (executor == null) {
      logError("Received KillTask command but executor was null")
      System.exit(1)
    } else {
      executor.killTask(taskId, interruptThread)
    }

  case x: DisassociatedEvent =>
    if (x.remoteAddress == driver.anchorPath.address) {
      logError(s"Driver $x disassociated! Shutting down.")
      System.exit(1)
    } else {
      logWarning(s"Received irrelevant DisassociatedEvent $x")
    }

  case StopExecutor =>
    logInfo("Driver commanded a shutdown")
    executor.stop()
    context.stop(self)
    context.system.shutdown()
}
The message worth paying attention to is LaunchTask: it ends up calling executor.launchTask.
Executor
/**
* Spark executor used with Mesos, YARN, and the standalone scheduler.
* In coarse-grained mode, an existing actor system is provided.
*/
private[spark] class Executor(
    executorId: String,
    executorHostname: String,
    env: SparkEnv,
    userClassPath: Seq[URL] = Nil,
    isLocal: Boolean = false)
  extends Logging
Its most important function is:
def launchTask(
    context: ExecutorBackend,
    taskId: Long,
    attemptNumber: Int,
    taskName: String,
    serializedTask: ByteBuffer) {
  val tr = new TaskRunner(context, taskId = taskId, attemptNumber = attemptNumber, taskName,
    serializedTask)
  runningTasks.put(taskId, tr)
  threadPool.execute(tr)
}
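For context, the threadPool and runningTasks members used here behave roughly like the plain java.util.concurrent equivalents below (in the real Executor the pool is a daemon cached thread pool built through a Spark utility, with threads named "Executor task launch worker"; this is only an approximation):

import java.util.concurrent.{ConcurrentHashMap, Executors, ThreadFactory}

// Approximate stand-ins for Executor's threadPool and runningTasks members.
object ExecutorMembersSketch {
  val threadPool = Executors.newCachedThreadPool(new ThreadFactory {
    override def newThread(r: Runnable): Thread = {
      val t = new Thread(r)
      t.setDaemon(true) // daemon threads, so worker threads never keep the JVM alive
      t
    }
  })
  val runningTasks = new ConcurrentHashMap[Long, Runnable]() // taskId -> its TaskRunner
}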
This is clearly asynchronous execution on top of a thread pool. The logic of TaskRunner's run function is as follows (a compressed sketch appears after the list):
1. Call Task.deserializeWithDependencies to obtain the files and jars the task depends on
2. Depending on the cache state, either download each file or reuse the cached copy; whether a file needs refreshing is decided by its filename together with its timestamp
3. Deserialize the remaining bytes to obtain the actual Task
4. Call task.run to really execute the task
5. Depending on the size of the result, either return it directly or write it to the blockManager
6. Send the result back to the driver through the ExecutorBackend's statusUpdate
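A self-contained sketch of those six steps, with every Spark-specific piece replaced by a hypothetical stand-in passed in as a parameter (only the control flow is meant to match the real TaskRunner):

import java.nio.ByteBuffer

// Control-flow sketch of TaskRunner.run; all types and helpers here are hypothetical stand-ins.
object TaskRunnerFlowSketch {
  case class Deps(files: Map[String, Long], jars: Map[String, Long], taskBytes: ByteBuffer)
  trait Backend { def statusUpdate(taskId: Long, state: String, data: ByteBuffer): Unit }

  def run(
      taskId: Long,
      serializedTask: ByteBuffer,
      backend: Backend,
      deserializeDeps: ByteBuffer => Deps,              // step 1
      fetchMissingOrStale: Deps => Unit,                // step 2: keyed on (filename, timestamp)
      deserializeTask: ByteBuffer => () => Array[Byte], // step 3
      maxDirectResultBytes: Int,
      putInBlockManager: Array[Byte] => ByteBuffer): Unit = {
    val deps = deserializeDeps(serializedTask)    // 1. split off the file/jar dependency lists
    fetchMissingOrStale(deps)                     // 2. download only what the cache lacks
    val task = deserializeTask(deps.taskBytes)    // 3. the real Task object
    val result = task()                           // 4. run it
    val payload =                                 // 5. small results inline, big ones via BlockManager
      if (result.length <= maxDirectResultBytes) ByteBuffer.wrap(result)
      else putInBlockManager(result)
    backend.statusUpdate(taskId, "FINISHED", payload) // 6. report back to the driver
  }
}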
From the flow above, the following kinds of data move over the network during a task's lifetime:
1. The URLs of the taskFiles and taskJars
2. The taskFiles and taskJars themselves, which may not need to travel at all if they are already cached
3. The serialized bytes of the Task
4. The result: either a block address or a reasonably small result sent inline (see the size check sketched below)
In other words, a single task does not generate much network traffic.
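The size check behind point 4 in Spark 1.x, paraphrased (spark.driver.maxResultSize and spark.akka.frameSize are real configuration keys, but the helper below is only an illustration, not the literal source):

// Paraphrase of the result-size decision in Spark 1.x's TaskRunner.
object ResultSizeDecisionSketch {
  def describe(resultBytes: Long, maxResultSize: Long, akkaFrameSize: Long): String =
    if (maxResultSize > 0 && resultBytes > maxResultSize)
      "drop the result and report a failure to the driver"              // over spark.driver.maxResultSize
    else if (resultBytes > akkaFrameSize)
      "write the result to the BlockManager and send only the block id" // IndirectTaskResult
    else
      "send the serialized result bytes directly"                       // DirectTaskResult
}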
How a task actually executes, and how tasks are scheduled, live in the scheduler, which is covered separately.