Flink on Yarn Startup Flow Analysis

This chapter gives a brief overview of the Flink on Yarn startup flow, to build a clearer picture of the JobManager & TaskManager.


FlinkOnYarn (architecture diagram)

The startup of a Flink Cluster on Yarn can be roughly divided into two phases:
  1. The Flink Client issues a request to launch a Flink Cluster on Yarn
  2. The Yarn RM accepts the request and directs an NM to allocate Containers that start the Flink Cluster

The Flink Client Issues the Request

  • Installing Flink: it only needs to be installed on a single node that can reach the Yarn & HDFS clusters
  • Startup script (command): ./bin/yarn-session.sh -n {num} -jm {num} -tm {num}
  • Running instance: the last command executed inside yarn-session.sh is: java … org.apache.flink.yarn.cli.FlinkYarnSessionCli
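The final command assembled by yarn-session.sh can be sketched roughly as follows. FLINK_HOME, the JVM options, and the classpath here are illustrative placeholders, not the script's actual values (the real script derives them from config.sh):

```shell
#!/bin/sh
# Rough sketch of the last command yarn-session.sh runs.
# FLINK_HOME, JVM options, and classpath are illustrative placeholders.
FLINK_HOME="${FLINK_HOME:-/opt/flink}"
MAIN_CLASS="org.apache.flink.yarn.cli.FlinkYarnSessionCli"
CMD="java -Xmx512m -classpath ${FLINK_HOME}/lib/* ${MAIN_CLASS}"
echo "$CMD"
```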

A brief description of what FlinkYarnSessionCli does:

  1. Load the relevant configuration based on FLINK_CONF_DIR & (YARN_CONF_DIR | HADOOP_CONF_DIR)
  2. Create a yarnClient and request an applicationId
  3. PUT the Jars & Conf required to run the Flink cluster onto HDFS
  4. Package the Env & Cmd needed to start the ApplicationMaster into a Request object, submit it via the yarnClient, and wait for the response
  5. Once startup is confirmed, persist the key information as a properties file on the local disk
Notes:
  • The HDFS path in step 3 defaults to /user/{user}/.flink/{applicationId}
    - If HDFS has no /user/{user} directory for that user, an exception is thrown
    - Because this step needs the applicationId, the applicationId must be requested via the yarnClient first
  • Only step 4 actually asks Yarn for resources to run the ApplicationMaster
    - The AM here is not an implementation of a Yarn interface class; it is the launch command & environment variables packed into the Context
  • The most important entry in the generated properties file is the applicationId, which is later used to locate the Cluster when submitting Flink Jobs
    - The properties file is persisted to /tmp/.yarn-properties-{user} by default
    - If multiple sessions are started on one node, mind this file location (not yet investigated)
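The role of the properties file in step 5 can be sketched with a small demo. The file content and the "applicationID" key name below are assumptions modeled on what FlinkYarnSessionCli typically writes, and a demo path is used so no real session file is touched:

```shell
#!/bin/sh
# Demo: recover the applicationId from a yarn-properties file.
# A fake file is written first; the key name "applicationID" is an assumption.
PROPS="${TMPDIR:-/tmp}/demo-yarn-properties-$USER"
printf 'applicationID=application_1500000000000_0001\n' > "$PROPS"
APP_ID=$(sed -n 's/^applicationID=//p' "$PROPS")
echo "cluster applicationId: $APP_ID"
```

This is how a later Job submission can find the running Cluster without the user passing an applicationId explicitly.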

Yarn Starts the AM

Two components, three phases:

  1. The RM accepts the request, finds an available NM, and has it launch a Container to run the AM
  2. The NM takes the assignment, downloads the Jars & Conf from HDFS to local storage based on the request metadata, and generates a local launcher script from the Cmd & Env
  3. Running the launcher script starts the ApplicationMaster (the source code shows that the Cmd sent by the Flink Client is: java … YarnSessionClusterEntrypoint)
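A heavily simplified sketch of what the NM-generated launcher in step 2 might contain; every value below is illustrative (real launch_container.sh files export far more variables and run from the container's working directory):

```shell
#!/bin/sh
# Simplified sketch of an NM-generated launch_container.sh (illustrative only).
export JAVA_HOME="${JAVA_HOME:-/usr/lib/jvm/java}"
export HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"
ENTRYPOINT="org.apache.flink.yarn.entrypoint.YarnSessionClusterEntrypoint"
# The real script ends by exec-ing the AM command with stdout/stderr
# redirected into the container's log directory:
echo "exec java ${ENTRYPOINT} 1>jobmanager.out 2>jobmanager.err"
```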

A brief description of what YarnSessionClusterEntrypoint does:

  1. Start the Akka-based RPC Service & Metric Registry Service
  2. Start the HA Service & Heartbeat Service
  3. Start the BLOB Server & ArchivedExecutionGraphStore (temporary local directories are created for storage)
  4. Start the Web Monitor Service (the job management UI)
  5. Start the JobManager service, which manages the TaskManager processes
Notes:
  • The directory in step 2 that holds the Jars & Conf and the launcher script is /data/hadoop/yarn/local/usercache/{user}/appcache/application_{applicationId}/container_{applicationId}_…, which contains:
    - launch_container.sh – launch command & environment variables
    - flink-conf.yaml & log config files – startup configuration & logging configuration
    - flink.jar & lib – runtime dependency Jars
  • Step 3 runs YarnSessionClusterEntrypoint, which starts the JobManager; the TaskManagers that follow are started and managed by the JobManager
    - In fact, in on-Yarn mode, TaskManager startup is deferred until a Flink Job is scheduled; and when no Job has been received for some time, the TaskManagers exit automatically and release their resources
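The deferred-TaskManager release behaviour described above is governed by a resource-manager timeout. The following flink-conf.yaml fragment shows the key as documented for Flink 1.5+; verify the name and default against your Flink version before relying on it:

```yaml
# flink-conf.yaml fragment (key name per Flink 1.5+ docs; check your version).
# Idle TaskManager containers are released after this timeout (milliseconds)
# when no job is using their slots:
resourcemanager.taskmanager-timeout: 30000
```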

Additional notes:

  • Log management
    • FlinkYarnSessionCli is launched by the Client and managed by Flink, so its logs are stored under log/ in the Flink installation directory
    • YarnSessionClusterEntrypoint is launched by Yarn and managed by Yarn, so its logs are stored under the Yarn-managed directory /data/hadoop/yarn/log/…
  • Process management
    • The FlinkYarnSessionCli process is managed by Flink; the YarnSessionClusterEntrypoint process is managed by Yarn
    • When stopping YarnSessionClusterEntrypoint without going through FlinkYarnSessionCli, use yarn application -kill …; however, this cannot clean up the resources owned by FlinkYarnSessionCli, such as /tmp/.yarn-properties-{user}
    • When yarn application -kill … is issued to stop the Cluster, the TaskManagers are stopped first, then the JobManager, but the cache on HDFS is not cleaned up
    • FlinkYarnSessionCli's interactive mode can clean up both /tmp/.yarn-properties-{user} and the HDFS cache in one go
  • Job submission
    • In this mode the Client looks up /tmp/.yarn-properties-{user} locally to obtain the applicationId and locate the Cluster, so Jobs are best submitted from the node where FlinkYarnSessionCli was started; otherwise the applicationId must be specified explicitly
  • Cluster installation
    • In on-Yarn mode, Flink only needs to be installed on a single node, because all subsequent processes fetch their Jars & Conf from HDFS at startup
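The cleanup caveats above can be sketched as a small script. The yarn/hdfs calls are guarded so the sketch stays harmless where no cluster is reachable, and a demo properties file stands in for the real one:

```shell
#!/bin/sh
# Demo: stop a session via "yarn application -kill" and clean up what
# FlinkYarnSessionCli would normally remove itself. Guarded calls and a
# demo properties file keep this runnable without a cluster.
PROPS="${TMPDIR:-/tmp}/demo-yarn-properties-$USER"
printf 'applicationID=application_1500000000000_0002\n' > "$PROPS"
APP_ID=$(sed -n 's/^applicationID=//p' "$PROPS")
{ command -v yarn >/dev/null 2>&1 && yarn application -kill "$APP_ID"; } || true
{ command -v hdfs >/dev/null 2>&1 && hdfs dfs -rm -r "/user/$USER/.flink/$APP_ID"; } || true
rm -f "$PROPS"   # leftover that the kill command cannot clean up
```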

Official Documentation

The YARN client needs to access the Hadoop configuration to connect to the YARN resource manager and to HDFS. It determines the Hadoop configuration using the following strategy:

  1. Test if YARN_CONF_DIR, HADOOP_CONF_DIR or HADOOP_CONF_PATH are set (in that order). If one of these variables are set, they are used to read the configuration.
  2. If the above strategy fails (this should not be the case in a correct YARN setup), the client is using the HADOOP_HOME environment variable. If it is set, the client tries to access $HADOOP_HOME/etc/hadoop (Hadoop 2) and $HADOOP_HOME/conf (Hadoop 1).

When starting a new Flink YARN session, the client first checks if the requested resources (containers and memory) are available. After that, it uploads a jar that contains Flink and the configuration to HDFS (step 1).

The next step of the client is to request (step 2) a YARN container to start the ApplicationMaster (step 3). Since the client registered the configuration and jar-file as a resource for the container, the NodeManager of YARN running on that particular machine will take care of preparing the container (e.g. downloading the files). Once that has finished, the ApplicationMaster (AM) is started.

The JobManager and AM are running in the same container. Once they successfully started, the AM knows the address of the JobManager (its own host). It is generating a new Flink configuration file for the TaskManagers (so that they can connect to the JobManager). The file is also uploaded to HDFS. Additionally, the AM container is also serving Flink’s web interface. All ports the YARN code is allocating are ephemeral ports. This allows users to execute multiple Flink YARN sessions in parallel.

After that, the AM starts allocating the containers for Flink’s TaskManagers, which will download the jar file and the modified configuration from the HDFS. Once these steps are completed, Flink is set up and ready to accept Jobs.
