[Flink Series 18] History Server Returns: How to Integrate It with Yarn

Original

2023-05-31 13:13

First, the official Flink documentation

This article applies to Flink 1.11+.

The History Server part requires at least Flink 1.16.

JobManager

The archiving of completed jobs happens on the JobManager, which uploads the archived job information to a file system directory. You can configure the directory to archive completed jobs in flink-conf.yaml by setting a directory via jobmanager.archive.fs.dir.

# Directory to upload completed job information
jobmanager.archive.fs.dir: hdfs:///completed-jobs

HistoryServer

The HistoryServer can be configured to monitor a comma-separated list of directories via historyserver.archive.fs.dir. The configured directories are regularly polled for new archives; the polling interval can be configured via historyserver.archive.fs.refresh-interval.

# Monitor the following directories for completed jobs
historyserver.archive.fs.dir: hdfs:///completed-jobs

# Refresh every 10 seconds
historyserver.archive.fs.refresh-interval: 10000

The contained archives are downloaded and cached in the local filesystem. The local directory for this is configured via historyserver.web.tmpdir.

Check out the configuration page for a complete list of configuration options.

Log Integration

Flink does not provide built-in methods for archiving logs of completed jobs. However, if you already have log archiving and browsing services, you can configure HistoryServer to integrate them (via historyserver.log.jobmanager.url-pattern and historyserver.log.taskmanager.url-pattern). In this way, you can directly link from HistoryServer WebUI to logs of the relevant JobManager / TaskManagers.

# HistoryServer will replace <jobid> with the relevant job id
historyserver.log.jobmanager.url-pattern: http://my.log-browsing.url/<jobid>

# HistoryServer will replace <jobid> and <tmid> with the relevant job id and taskmanager id
historyserver.log.taskmanager.url-pattern: http://my.log-browsing.url/<jobid>/<tmid>

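Integration approach

The Log Integration section shows that Flink's History UI offers two kinds of URL links, so Yarn logs can be reached directly without modifying any source code.

On an existing real-time computing platform, implementing a URL translator is therefore the cheapest and easiest-to-maintain approach.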
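
As a rough illustration of the first step of such a translator, the sketch below splits an incoming link into a job ID and an optional TaskManager ID, following the two URL patterns configured for the History Server. The names `LogRequest` and `parse_log_request` are hypothetical and not part of Flink; this is a sketch under those assumptions, not a complete service.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlparse


class LogRequest(NamedTuple):
    """Hypothetical container for the IDs the History Server puts in the link."""
    job_id: str
    tm_id: Optional[str]  # None means the link asks for the JobManager log


def parse_log_request(url: str) -> LogRequest:
    """Split /<jobid> or /<jobid>/<tmid> into its parts."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) == 1:
        return LogRequest(parts[0], None)
    if len(parts) == 2:
        return LogRequest(parts[0], parts[1])
    raise ValueError(f"unexpected log URL path: {url!r}")
```

A request without a TaskManager ID would then be routed to the JobManager lookup, and one with a TaskManager ID to the TaskManager lookup.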

這裏說明一下，以下解決方案僅供參考
這裏說明一下，以下解決方案僅供參考
這裏說明一下，以下解決方案僅供參考

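How to obtain the JobManager log link

For a link such as http://flink.slankka.com/<jobId>, the translator can use the jobId to look up the job's instance history on the platform, find the corresponding applicationId, then query the Yarn REST API and assemble the Yarn URL of the JobManager log.

The Yarn REST API endpoint is /ws/v1/cluster/apps/{appid}; the log URL is in the response at the JSONPath app/amContainerLogs.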
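
The Yarn lookup described above might be sketched as follows. The function names and the `rm_base` parameter (the ResourceManager web address) are assumptions made for illustration; the jobId-to-applicationId lookup is platform specific and is not shown.

```python
import json
from urllib.request import Request, urlopen


def yarn_app_endpoint(rm_base: str, app_id: str) -> str:
    """Build the Yarn REST URL for one application."""
    return f"{rm_base.rstrip('/')}/ws/v1/cluster/apps/{app_id}"


def extract_am_log_url(payload: dict) -> str:
    """Read the JobManager (ApplicationMaster) log URL at app/amContainerLogs."""
    return payload["app"]["amContainerLogs"]


def jobmanager_log_url(rm_base: str, app_id: str, timeout: float = 5.0) -> str:
    """Query Yarn and return the log URL (rm_base is an assumed parameter)."""
    req = Request(yarn_app_endpoint(rm_base, app_id),
                  headers={"Accept": "application/json"})
    with urlopen(req, timeout=timeout) as resp:
        return extract_am_log_url(json.load(resp))
```

Keeping the extraction separate from the HTTP call makes it easy to check against a saved response without a live cluster.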

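How to obtain the TaskManager log link

A link such as http://flink.slankka.com/<jobId>/<tmId> is handled somewhat differently:

Call the History UI REST API /jobs/{jobid}, read its vertices, and obtain a vertex ID.
Call the History UI REST API /jobs/{jobid}/vertices/{vertexid}/taskmanagers to obtain the TaskManager entries.
Use the taskmanager-id to obtain the NodeManager's short host name.
Append Yarn's full server domain to the short name.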
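
The steps above can be sketched as plain functions over the JSON already fetched from the two REST endpoints. The field names `vertices`, `id`, `taskmanagers`, `host`, and `taskmanager-id` reflect my understanding of the Flink REST responses and should be verified against your Flink version; the function names and the `yarn_domain` parameter are hypothetical.

```python
from typing import List, Optional


def vertex_ids(job_detail: dict) -> List[str]:
    """Collect vertex IDs from a /jobs/{jobid} response."""
    return [v["id"] for v in job_detail.get("vertices", [])]


def short_host_for_tm(vertex_tms: dict, tm_id: str) -> Optional[str]:
    """Find the entry whose taskmanager-id matches and return its host
    without the port (assumes host looks like "name:port")."""
    for tm in vertex_tms.get("taskmanagers", []):
        if tm.get("taskmanager-id") == tm_id:
            return tm["host"].split(":", 1)[0]
    return None


def node_fqdn(short_host: str, yarn_domain: str) -> str:
    """Append the cluster's domain (an assumed, site-specific value)."""
    return f"{short_host}.{yarn_domain}"
```

A translator would likely iterate over the vertices until one of them reports the requested TaskManager, since a given TaskManager need not run tasks of every vertex.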

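An example

The TaskManager host reported here is incomplete, but it matches the prefix of the Yarn node's domain name.
Concatenating the two therefore gives ddn130160.yarn.slankka.com.

An example of the final URL: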

http://hist.yarn.slankka.com:19888/jobhistory/logs/ddn130160.yarn.slankka.com:8041/container_e15_1665284980006_8340_01_000002/container_e15_1665284980006_8340_01_000002/slankka
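
For illustration, the URL above can be assembled from its parts as in the sketch below. The function and parameter names are hypothetical; reading the last path segment as the submitting user and 8041 as the NodeManager port is an assumption based on the example, not something the article states.

```python
def yarn_history_log_url(jhs_base: str, node: str, container_id: str,
                         user: str, nm_port: int = 8041) -> str:
    """Compose a JobHistory log link in the shape of the example URL.

    The container ID appears twice in the path, as in the example.
    """
    return (f"{jhs_base.rstrip('/')}/jobhistory/logs/"
            f"{node}:{nm_port}/{container_id}/{container_id}/{user}")


url = yarn_history_log_url(
    "http://hist.yarn.slankka.com:19888",
    "ddn130160.yarn.slankka.com",
    "container_e15_1665284980006_8340_01_000002",
    "slankka",
)
```

With these inputs, `url` reproduces the example address character for character.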
