Spark Dynamic Resource Allocation: the Resource Release Process and the BlockManager Cleanup Process

How YarnScheduler Releases Resources During Spark Dynamic Resource Allocation

Creation of SchedulerBackend, TaskScheduler, and ExecutorAllocationManager

val (_schedulerBackend, _taskScheduler) = SparkContext.createTaskScheduler(sc: SparkContext, master: String, deployMode: String): (SchedulerBackend, TaskScheduler) -- creates a task scheduler for the given master URL

All registered ExternalClusterManager implementations are loaded via ServiceLoader.load(classOf[ExternalClusterManager], loader).asScala.filter(_.canCreate(url)); see the org.apache.spark.scheduler.ExternalClusterManager file under the META-INF/services directory, where YarnClusterManager is registered. The canCreate(url) filter then yields the single YARN cluster manager (a paraphrased sketch of this lookup follows the listing below).

org.apache.spark.scheduler.ExternalClusterManager
  org.apache.spark.scheduler.cluster.YarnClusterManager
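
As a reference, here is a minimal sketch of this lookup, paraphrased from SparkContext.getClusterManager (simplified; ExternalClusterManager is private[spark], so this code only lives inside the org.apache.spark package, and details differ between Spark versions):

```scala
import java.util.ServiceLoader

import scala.collection.JavaConverters._

import org.apache.spark.SparkException
import org.apache.spark.scheduler.ExternalClusterManager

// Load every ExternalClusterManager registered under META-INF/services and keep the single
// one whose canCreate(url) accepts the given master URL ("yarn" -> YarnClusterManager).
def getClusterManager(url: String): Option[ExternalClusterManager] = {
  val loader = Thread.currentThread().getContextClassLoader
  val serviceLoaders =
    ServiceLoader.load(classOf[ExternalClusterManager], loader).asScala.filter(_.canCreate(url))
  if (serviceLoaders.size > 1) {
    throw new SparkException(s"Multiple external cluster managers registered for the url $url")
  }
  serviceLoaders.headOption
}
```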

YarnClusterManager then produces _schedulerBackend = YarnClusterSchedulerBackend (YarnClientSchedulerBackend in client deploy mode) and _taskScheduler = YarnScheduler. Their class hierarchies are as follows (a sketch of the deploy-mode selection follows the hierarchies):

YarnClientSchedulerBackend
  YarnSchedulerBackend : all of the YARN resource-management events live here: RegisterClusterManager, RemoveExecutor, RequestExecutors, KillExecutors, etc.
    CoarseGrainedSchedulerBackend
      ExecutorAllocationClient

YarnScheduler
  TaskSchedulerImpl
    TaskScheduler
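
For reference, a paraphrased sketch of how YarnClusterManager chooses the concrete scheduler and backend based on the deploy mode (simplified from the Spark YARN module, not the exact source; cluster mode actually builds a YarnClusterScheduler, a YarnScheduler subclass):

```scala
import org.apache.spark.SparkContext
import org.apache.spark.scheduler.{ExternalClusterManager, SchedulerBackend, TaskScheduler, TaskSchedulerImpl}
import org.apache.spark.scheduler.cluster.{YarnClientSchedulerBackend, YarnClusterSchedulerBackend, YarnScheduler}

// Paraphrased sketch of org.apache.spark.scheduler.cluster.YarnClusterManager.
private[spark] class YarnClusterManager extends ExternalClusterManager {

  // Only the "yarn" master URL is accepted, which is why the ServiceLoader lookup above
  // yields exactly one manager for a YARN deployment.
  override def canCreate(masterURL: String): Boolean = masterURL == "yarn"

  // Simplified: cluster mode really returns YarnClusterScheduler, which extends YarnScheduler.
  override def createTaskScheduler(sc: SparkContext, masterURL: String): TaskScheduler =
    new YarnScheduler(sc)

  override def createSchedulerBackend(
      sc: SparkContext,
      masterURL: String,
      scheduler: TaskScheduler): SchedulerBackend = {
    sc.deployMode match {
      case "cluster" =>
        new YarnClusterSchedulerBackend(scheduler.asInstanceOf[TaskSchedulerImpl], sc)
      case "client" =>
        new YarnClientSchedulerBackend(scheduler.asInstanceOf[TaskSchedulerImpl], sc)
      case other =>
        throw new IllegalArgumentException(s"Unsupported deploy mode: $other")
    }
  }

  override def initialize(scheduler: TaskScheduler, backend: SchedulerBackend): Unit =
    scheduler.asInstanceOf[TaskSchedulerImpl].initialize(backend)
}
```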

If dynamic resource allocation is enabled, SparkContext instantiates an ExecutorAllocationManager during initialization and starts it. Internally, the ExecutorAllocationManager drives the YarnClusterSchedulerBackend above (referred to as the client from here on) to do the actual scheduling (an example configuration follows the snippet below).

_executorAllocationManager = Some(new ExecutorAllocationManager(schedulerBackend.asInstanceOf[ExecutorAllocationClient], listenerBus, _conf, cleaner = cleaner))
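
For context, a hedged example of the configuration under which SparkContext takes this branch (the keys are standard spark.dynamicAllocation.* settings on recent Spark versions; the values are purely illustrative):

```scala
import org.apache.spark.SparkConf

// Illustrative settings only: dynamic allocation additionally requires either the external
// shuffle service or shuffle tracking so that executors can be removed safely.
val conf = new SparkConf()
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.dynamicAllocation.shuffleTracking.enabled", "true")
  .set("spark.dynamicAllocation.minExecutors", "0")
  .set("spark.dynamicAllocation.maxExecutors", "20")
  .set("spark.dynamicAllocation.executorIdleTimeout", "60s")
```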

ExecutorAllocationManager

Once the ExecutorAllocationManager service is started, it runs a background thread in a scheduling loop: executorMonitor pulls out the list of executors that have timed out, and removeExecutors() is called to release those executors (a paraphrased sketch of the loop follows this call trace).

def start(): Unit
  def schedule(): Unit
    val executorIdsToBeRemoved = executorMonitor.timedOutExecutors()
    def removeExecutors(executors: Seq[String])
      client.killExecutors(executorIdsToBeRemoved, adjustTargetNumExecutors = false, countFailures = false, force = false)
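
A paraphrased sketch of that loop, with names taken from the call trace above (executorMonitor and removeExecutors are members of ExecutorAllocationManager; exact signatures vary across Spark versions):

```scala
import java.util.concurrent.{Executors, TimeUnit}

// schedule(): ask the ExecutorMonitor which executors have been idle past the timeout and
// release them, without lowering the requested executor target.
def schedule(): Unit = {
  val executorIdsToBeRemoved = executorMonitor.timedOutExecutors()
  if (executorIdsToBeRemoved.nonEmpty) {
    // ends in client.killExecutors(..., adjustTargetNumExecutors = false, ...)
    removeExecutors(executorIdsToBeRemoved)
  }
}

// start(): register schedule() on a single-threaded scheduler, ticking every ~100 ms.
val executor = Executors.newSingleThreadScheduledExecutor()
val intervalMillis = 100L
executor.scheduleWithFixedDelay(() => schedule(), 0L, intervalMillis, TimeUnit.MILLISECONDS)
```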

The call above goes through the killExecutors method of client: ExecutorAllocationClient. That client is exactly the class CoarseGrainedSchedulerBackend extends ExecutorAllocationClient we saw earlier.

CoarseGrainedSchedulerBackend

def killExecutors()
  doKillExecutors(executorsToKill)
    YarnSchedulerBackend:
      def doKillExecutors(executorIds: Seq[String])
        yarnSchedulerEndpointRef.ask[Boolean](KillExecutors(executorIds))
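
The killExecutors side can be summarised with the following simplified paraphrase (not the exact source; executorDataMap and executorsPendingToRemove are the backend's internal bookkeeping maps, scheduler is the TaskSchedulerImpl, and the real method additionally adjusts the target executor count and returns asynchronously):

```scala
// Simplified paraphrase of CoarseGrainedSchedulerBackend.killExecutors.
def killExecutors(
    executorIds: Seq[String],
    adjustTargetNumExecutors: Boolean,
    countFailures: Boolean,
    force: Boolean): Seq[String] = synchronized {
  val (knownExecutors, unknownExecutors) = executorIds.partition(executorDataMap.contains)
  unknownExecutors.foreach(id => logWarning(s"Executor to kill $id does not exist!"))

  // Skip executors already being removed; skip busy executors unless force = true.
  val executorsToKill = knownExecutors
    .filterNot(executorsPendingToRemove.contains)
    .filter(id => force || !scheduler.isExecutorBusy(id))
  executorsToKill.foreach(id => executorsPendingToRemove(id) = !countFailures)

  // Hand off to the cluster-manager-specific hook; for YARN this is
  // YarnSchedulerBackend.doKillExecutors, which asks yarnSchedulerEndpointRef (see below).
  doKillExecutors(executorsToKill)
  executorsToKill
}
```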

o.a.s.scheduler.cluster.YarnSchedulerEndpoint

YarnSchedulerEndpoint simply forwards the KillExecutors request to the AMEndpoint (var amEndpoint: Option[RpcEndpointRef]).
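
The relevant branch of YarnSchedulerEndpoint.receiveAndReply looks roughly like this (paraphrased sketch; ThreadUtils.sameThread is Spark's same-thread ExecutionContext):

```scala
// Forward KillExecutors to the ApplicationMaster endpoint once the AM has registered itself
// via RegisterClusterManager; reply to the original caller with the AM's answer.
case k: KillExecutors =>
  amEndpoint match {
    case Some(am) =>
      am.ask[Boolean](k).andThen {
        case Success(killed) => context.reply(killed)
        case Failure(e)      => context.sendFailure(e)
      }(ThreadUtils.sameThread)
    case None =>
      logWarning("Attempted to kill executors before the AM has registered!")
      context.reply(false)
  }
```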

o.a.s.deploy.yarn.ApplicationMaster

ApplicationMaster then handles the KillExecutors event:

case KillExecutors(executorIds)
  YarnAllocator:
    def killExecutor(executorId: String)
      internalReleaseContainer(container)
        amClient.releaseAssignedContainer(container.getId()) -- proactively asks the RM to release the container
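
Putting the AM side together, a paraphrased sketch of the AMEndpoint branch and YarnAllocator.killExecutor (simplified; executorIdToContainer and releasedContainers are the allocator's bookkeeping structures):

```scala
// AMEndpoint.receiveAndReply: fan each executor id out to the YarnAllocator.
case KillExecutors(executorIds) =>
  Option(allocator) match {
    case Some(a) => executorIds.foreach(a.killExecutor)
    case None    => logWarning("Container allocator is not ready to kill executors yet.")
  }
  context.reply(true)

// YarnAllocator.killExecutor: map the executor id back to its YARN container and ask the
// ResourceManager (via the AMRMClient) to release that container.
def killExecutor(executorId: String): Unit = synchronized {
  executorIdToContainer.get(executorId) match {
    case Some(container) if !releasedContainers.contains(container.getId) =>
      internalReleaseContainer(container)   // amClient.releaseAssignedContainer(container.getId)
    case _ =>
      logWarning(s"Attempted to kill unknown executor $executorId!")
  }
}
```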

How BlockManager Cleans Up Broadcasts

Object relationships around BlockManager

SparkEnv
 - BlockManager
    - blockManagerMaster : holds two EndpointRefs, one for BlockManagerInfo RPC events and one for heartbeat events, and manages the list of BlockManagerInfo (a sketch of how these endpoints are wired up follows the endpoint names below)
      - driverEndpoint: BlockManagerMasterEndpoint : to manage all BlockManagerInfo it keeps an internal blockManagerInfo: Map[BlockManagerId, BlockManagerInfo]
        - BlockManagerId can be read as an identifier made up of four fields: executorId, host, port and topologyInfo
        - BlockManagerInfo : one entry exists for the BlockManager on the driver and on each executor; it tracks the current blocks: JHashMap[BlockId, BlockStatus] for memory accounting. When removeBlock is called the block is marked as removed, but the process memory in use (_remainingMem) is not released.
      - driverHeartbeatEndPoint: BlockManagerMasterHeartbeatEndpoint : routine heartbeat handling, not covered here

val DRIVER_ENDPOINT_NAME = "BlockManagerMaster"
val DRIVER_HEARTBEAT_ENDPOINT_NAME = "BlockManagerMasterHeartbeat"
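
A paraphrased sketch of how SparkEnv wires these two endpoints up: the driver actually hosts them under the names above, while an executor only looks up a reference by name (simplified from SparkEnv.create):

```scala
import org.apache.spark.rpc.{RpcEndpoint, RpcEndpointRef}
import org.apache.spark.util.RpcUtils

// Driver: register the real endpoint (e.g. BlockManagerMasterEndpoint) under the given name.
// Executor: only build an RpcEndpointRef that points at the driver's endpoint of that name.
def registerOrLookupEndpoint(name: String, endpointCreator: => RpcEndpoint): RpcEndpointRef = {
  if (isDriver) {
    rpcEnv.setupEndpoint(name, endpointCreator)
  } else {
    RpcUtils.makeDriverRef(name, conf, rpcEnv)
  }
}
```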

## Initialization and Registration of the Executor-side BlockManager

A BlockManagerInfo entry is registered when the Executor is instantiated: the executor-side BlockManager sends a `RegisterBlockManager` message to the driver endpoint.

```scala
CoarseGrainedExecutorBackend
EVENT: case RegisteredExecutor =>
  executor = new Executor(executorId, hostname, env, userClassPath, isLocal = false, resources = _resources)

Executor() constructor
  env.blockManager.initialize(conf.getAppId)  
    val id = BlockManagerId(executorId, blockTransferService.hostName, blockTransferService.port, None)
    val idFromMaster = master.registerBlockManager(id, ...* )
      val updatedId = driverEndpoint.askSync[BlockManagerId](RegisterBlockManager(id, localDirs, maxOnHeapMemSize, maxOffHeapMemSize, slaveEndpoint)) -- send the registration message to driverEndpoint
```

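On the driver side, BlockManagerMasterEndpoint answers this message by recording a BlockManagerInfo for the new executor. A simplified paraphrase (the real method also resolves topology info, checks for re-registration, and posts a SparkListenerBlockManagerAdded event):

```scala
// receiveAndReply branch: delegate to register() and return the (possibly enriched) id.
case RegisterBlockManager(id, localDirs, maxOnHeapMemSize, maxOffHeapMemSize, slaveEndpoint) =>
  context.reply(register(id, localDirs, maxOnHeapMemSize, maxOffHeapMemSize, slaveEndpoint))

// Record a BlockManagerInfo entry for the registering BlockManager.
private def register(
    id: BlockManagerId,
    localDirs: Array[String],
    maxOnHeapMemSize: Long,
    maxOffHeapMemSize: Long,
    slaveEndpoint: RpcEndpointRef): BlockManagerId = {
  if (!blockManagerInfo.contains(id)) {
    blockManagerIdByExecutor(id.executorId) = id
    blockManagerInfo(id) = new BlockManagerInfo(
      id, System.currentTimeMillis(), maxOnHeapMemSize, maxOffHeapMemSize, slaveEndpoint)
  }
  id
}
```
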
RemoveBroadcast Handling

When we need to manage blocks across the cluster, we just go through the master reference held by BlockManager, e.g. SparkEnv.get.blockManager.master.removeBroadcast(id, removeFromDriver, blocking).

BlockManagerMaster:
def removeBroadcast(broadcastId: Long, removeFromMaster: Boolean, blocking: Boolean): Unit -- the master object is only a facade; it forwards the actual request to driverEndpoint
val future = driverEndpoint.askSync[Future[Seq[Int]]](RemoveBroadcast(broadcastId, removeFromMaster))


driverEndpoint: BlockManagerMasterEndpoint
def removeBroadcast(broadcastId: Long, removeFromDriver: Boolean)
  -- build a new RemoveBroadcast message, which the driver sends to the BlockManager on each executor
  val removeMsg = RemoveBroadcast(broadcastId, removeFromDriver)
  requiredBlockManagers.map { bm => bm.slaveEndpoint.ask[Int](removeMsg) }  
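
A paraphrased sketch of that fan-out inside BlockManagerMasterEndpoint (simplified; the real code also handles failed replies and runs on the endpoint's implicit ExecutionContext):

```scala
import scala.concurrent.Future

// Build one RemoveBroadcast message and ask every registered BlockManager -- optionally
// skipping the driver's own -- to drop the broadcast's blocks; collect the replies.
private def removeBroadcast(broadcastId: Long, removeFromDriver: Boolean): Future[Seq[Int]] = {
  val removeMsg = RemoveBroadcast(broadcastId, removeFromDriver)
  val requiredBlockManagers = blockManagerInfo.values.filter { info =>
    removeFromDriver || !info.blockManagerId.isDriver
  }
  val futures = requiredBlockManagers.map { bm =>
    bm.slaveEndpoint.ask[Int](removeMsg)   // each BlockManagerSlaveEndpoint removes the blocks locally
  }.toSeq
  Future.sequence(futures)                 // needs the endpoint's implicit ExecutionContext
}
```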

A concrete example from real code

val bcNewCenters = data.context.broadcast(newCenters) -- create the broadcast variable

bcNewCenters.unpersist() -- call the unpersist() method
Broadcast.scala
  def unpersist(): Unit
    def unpersist(blocking: Boolean): Unit
TorrentBroadcast.scala
  def doUnpersist(blocking: Boolean): Unit
    def unpersist(id: Long, removeFromDriver: Boolean, blocking: Boolean): Unit
      SparkEnv.get.blockManager.master.removeBroadcast(id, removeFromDriver, blocking)
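
For completeness, a minimal end-to-end usage sketch (assumes a local SparkContext; identifiers like BroadcastCleanupDemo are illustrative, not from the original post):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object BroadcastCleanupDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[2]").setAppName("broadcast-cleanup"))

    val centers = Array(1.0, 2.0, 3.0)
    val bcCenters = sc.broadcast(centers)              // create the broadcast variable
    val total = sc.parallelize(1 to 10).map(_ + bcCenters.value.sum).sum()
    println(total)

    // unpersist() removes the cached copies on the executors (the RemoveBroadcast fan-out above);
    // destroy() would additionally remove the data on the driver.
    bcCenters.unpersist(blocking = true)
    sc.stop()
  }
}
```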