Tuning Flink Checkpoints and Large State

Original

2021-06-22 11:43

{"type":"doc","content":[
{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Overview","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For a Flink application to run reliably at large scale, two conditions must be met:","attrs":{}}]},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The application needs to be able to take checkpoints reliably.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After a failure, the resources need to be sufficient to catch up with the input data streams.","attrs":{}}]}]}],"attrs":{}},
{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Monitoring State and Checkpoints","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The easiest way to monitor checkpoint behavior is through the Web UI. Two ","attrs":{}},{"type":"link","attrs":{"href":"https://xie.infoq.cn/article/08733fa1ab3c2f656826fa659","title":"","type":null},"content":[{"type":"text","text":"checkpoint metrics","attrs":{}}]},{"type":"text","text":" deserve particular attention:","attrs":{}}]},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The time until an Operator receives its first checkpoint barrier. When this time is consistently high, checkpoint barriers need a long time to travel from the Source to the Operator, which usually indicates that the system is operating under constant backpressure.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The alignment duration, defined as the time between receiving the first and the last checkpoint barrier. With aligned exactly-once checkpoints, on an Operator with multiple inputs, the channels that have already received a barrier are blocked from delivering further data until all remaining channels catch up and receive their barriers.","attrs":{}}]}]}],"attrs":{}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Ideally both values are low. Persistently high values mean that checkpoint barriers travel slowly through the job graph, usually because of backpressure (not enough resources to process the incoming records). The same condition also shows up as increased end-to-end latency of processed records.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Tuning Checkpointing","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Applications can configure checkpoints to be triggered at a fixed interval. When a checkpoint takes longer to complete than that interval, the next checkpoint is not triggered before the in-progress one completes (by default, the next checkpoint is triggered immediately once the ongoing checkpoint completes).","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When checkpoints frequently take longer than the interval, the system checkpoints continuously (a new one starts as soon as the previous one finishes). That can mean that Operators make too little progress between two checkpoints and that checkpointing consumes too many resources. The effect is smaller for streaming applications that use asynchronous checkpoints, but it can still hurt overall application performance.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To prevent this, applications can define a minimum pause between checkpoints, that is, the minimum time that must pass between the end of the latest checkpoint and the start of the next one:","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"codeblock","attrs":{"lang":"java"},"content":[{"type":"text","text":"StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();\n// milliseconds: minimum pause between the end of one checkpoint and the start of the next\nenv.getCheckpointConfig()\n    .setMinPauseBetweenCheckpoints(milliseconds);","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The figure below illustrates how this setting affects checkpointing and keeps checkpoints from running back to back.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/8a/8ac4f3d7f24868a9eb33e2aa5bb2cf70.png","alt":null,"title":"","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Applications can be configured to allow multiple checkpoints to be in progress at the same time. A manually triggered savepoint may also run concurrently with an ongoing checkpoint.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Tuning RocksDB","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Many large-scale Flink streaming applications keep their State in the RocksDB State Backend. It scales well beyond main memory and reliably stores large ","attrs":{}},{"type":"link","attrs":{"href":"https://xie.infoq.cn/article/d5eed8ecec7ec859c5be5bd93","title":"","type":null},"content":[{"type":"text","text":"keyed state","attrs":{}}]},{"type":"text","text":".","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"RocksDB performance varies with configuration. The following are some best practices for using the RocksDB State Backend.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Incremental Checkpoints","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When it comes to reducing checkpoint duration, enabling incremental checkpoints should be one of the first things to consider. Compared with full checkpoints, incremental checkpoints can shorten checkpointing time significantly, because they record only the changes made since the previous completed checkpoint.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Choosing Where to Store Timers","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Timers are stored in RocksDB by default. When a Job has only a few Timers, keeping them on the JVM heap can improve performance.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Use this feature with care: heap-based Timers may increase checkpointing time, and they cannot scale beyond memory.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Tuning RocksDB Memory","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The performance of the RocksDB State Backend depends to a large extent on how much memory it has available. Adding memory, or adjusting what the memory is used for, can improve performance considerably.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"By default, the RocksDB State Backend uses Flink's managed memory for RocksDB's buffers and caches (","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"state.backend.rocksdb.memory.managed: true","attrs":{}}],"attrs":{}},{"type":"text","text":"). The following steps may help with memory-related performance problems:","attrs":{}}]},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Increase the amount of managed memory. This usually improves the situation a lot, without adding the complexity of tuning low-level RocksDB options.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":"none"},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Especially with large container/process sizes, most of the total memory can usually go to RocksDB unless the application logic itself needs a lot of JVM heap. The default managed memory fraction (0.4) is conservative.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The number of write buffers in RocksDB depends on the number of States in the application. Each State corresponds to one ColumnFamily, which needs its own write buffers. Applications with many States therefore typically need more memory for the same performance.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"By setting ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"state.backend.rocksdb.memory.managed: false","attrs":{}}],"attrs":{}},{"type":"text","text":", you can compare the performance of RocksDB with managed memory against RocksDB with per-column-family memory.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":"none"},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Without managed memory, RocksDB allocates memory in proportion to the number of States in the application.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If the application has many States and you see frequent MemTable flushes (a write-side bottleneck) but cannot give it more memory, you can increase the ratio of memory that goes to the write buffers (","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"state.backend.rocksdb.memory.write-buffer-ratio","attrs":{}}],"attrs":{}},{"type":"text","text":").","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"An advanced option (for RocksDB experts) that can reduce the number of MemTable flushes in setups with many States is to tune RocksDB's ColumnFamily options through a RocksDBOptionsFactory:","attrs":{}}]}]}],"attrs":{}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"codeblock","attrs":{"lang":"java"},"content":[{"type":"text","text":"import java.util.Collection;\n\nimport org.apache.flink.configuration.ReadableConfig;\nimport org.apache.flink.contrib.streaming.state.ConfigurableRocksDBOptionsFactory;\nimport org.apache.flink.contrib.streaming.state.RocksDBOptionsFactory;\nimport org.rocksdb.ColumnFamilyOptions;\nimport org.rocksdb.DBOptions;\n\npublic class MyOptionsFactory implements ConfigurableRocksDBOptionsFactory {\n\n    @Override\n    public DBOptions createDBOptions(DBOptions currentOptions, Collection<AutoCloseable> handlesToClose) {\n        // Increase the maximum number of background flush threads when one Operator has many States,\n        // which means many ColumnFamilies in a single RocksDB instance.\n        return currentOptions.setMaxBackgroundFlushes(4);\n    }\n\n    @Override\n    public ColumnFamilyOptions createColumnOptions(\n            ColumnFamilyOptions currentOptions, Collection<AutoCloseable> handlesToClose) {\n        // Decrease the arena block size from the default 8 MB to 1 MB.\n        return currentOptions.setArenaBlockSize(1024 * 1024);\n    }\n\n    @Override\n    public RocksDBOptionsFactory configure(ReadableConfig configuration) {\n        // No extra settings are read from the configuration in this example.\n        return this;\n    }\n}","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Capacity Planning","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This section discusses how to decide how many resources a Flink job needs in order to run reliably. The basic rules of thumb for capacity planning are:","attrs":{}}]},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Normal operation should have enough capacity to avoid running under constant backpressure.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Provision some extra resources on top of what the program needs to run without backpressure in normal operation. They are needed to catch up with the input data that accumulated while the application was recovering. How much to add depends on how long recovery usually takes (which depends on the size of the State that must be loaded into the new TaskManagers on failover) and on how quickly the application is required to recover from a failure.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Temporary backpressure is usually acceptable during load spikes, during catch-up phases, or when an external system is temporarily slow to respond.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Certain operations (such as large windows) produce a spiky load on their downstream Operators: while the window is being built, the downstream Operators may be idle, and they only get work when the window is emitted. Planning the downstream parallelism has to take into account how much the windows emit and how fast such a spike must be processed.","attrs":{}}]}]}],"attrs":{}},
{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Compression","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Flink offers optional compression (default: off) for all checkpoints and savepoints. Currently, compression always uses the ","attrs":{}},{"type":"link","attrs":{"href":"https://github.com/xerial/snappy-java","title":"","type":null},"content":[{"type":"text","text":"snappy compression algorithm (version 1.1.4)","attrs":{}}]},{"type":"text","text":", with support for custom compression algorithms planned for the future. Compression works at the granularity of key-groups in keyed state: each key-group can be compressed and decompressed on its own, which is important for rescaling.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Compression can be enabled through the ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"ExecutionConfig","attrs":{}}],"attrs":{}},{"type":"text","text":":","attrs":{}}]},
{"type":"codeblock","attrs":{"lang":"java"},"content":[{"type":"text","text":"ExecutionConfig executionConfig = new ExecutionConfig();\nexecutionConfig.setUseSnapshotCompression(true);","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The compression option has no effect on incremental snapshots, because those use RocksDB's internal format, which always applies snappy compression out of the box.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Task-Local Recovery","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Motivation","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In Flink's checkpointing, each Task produces a snapshot of its State, which is then written to distributed storage. Each Task acknowledges the successful write to the JobManager by sending a handle that describes where the State is located in the distributed storage. The JobManager, in turn, collects the handles from all Tasks and bundles them into a checkpoint object.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"On recovery, the JobManager opens the latest checkpoint object and sends the handles back to the corresponding Tasks, which can then restore their State from the distributed storage. Using distributed storage for State has two important advantages. First, the storage is fault tolerant. Second, all State in the distributed storage is accessible to all nodes and can easily be redistributed (for example, for rescaling).","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"However, remote distributed storage also has a big disadvantage: all Tasks must read their State from a remote location over the network. In some cases, recovery can reschedule a Task onto the same TaskManager as in the previous run, yet it still has to read the remote State. For large State this can lead to long recovery times.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Approach","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Task-local State recovery targets exactly this problem. The main idea is the following: for every checkpoint, each Task not only writes its State snapshot to the distributed storage, but also keeps a secondary copy of the snapshot in storage that is local to the Task (for example, on local disk or in memory). The primary store for State must still be the distributed storage, because local storage does not guarantee durability under node failures and does not give other nodes access to redistribute the State.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Every Task that can be rescheduled to its previous location for recovery can restore its State from the secondary, local copy and avoid the cost of reading it remotely. Given that many failures are not node failures, and that node failures typically affect only one or very few nodes at a time, it is very likely that most Tasks return to their previous location during recovery and find their local State intact, which effectively shortens recovery time.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Note that creating and storing a secondary local copy can add some cost per checkpoint, depending on the chosen State Backend and checkpointing strategy. In most cases, the implementation simply duplicates the writes to the distributed storage into a local file.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/e8/e8f701752d1a057cc2fb64bc6a8a8088.png","alt":null,"title":"","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Relationship Between Primary and Secondary Copies","attrs":{}}]},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For checkpointing, the primary copy must succeed, and a failure to produce the secondary, local copy does not fail the checkpoint. If the primary copy cannot be created, the checkpoint fails even if the secondary copy was created successfully.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Only the primary copy is acknowledged and managed by the JobManager. Secondary copies are owned by the TaskManagers, and their life cycle can be independent of the primary copy. For example, it is possible to retain a history of the 3 latest checkpoints as primary copies and keep only the local copy of the latest checkpoint.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For recovery, Flink always attempts to restore from task-local State first if a matching secondary copy is available. If anything goes wrong while recovering from the secondary copy, Flink transparently retries from the primary copy. Recovery fails only if both the primary and the (optional) secondary copy fail.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A task-local copy may contain only part of the full State (for example, because an exception occurred while writing a local file). In that case, Flink first tries to restore the local part locally and restores the remaining State from the primary copy.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A task-local copy can have a different format than the primary copy.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If a TaskManager is lost, the local copies of all its Tasks are lost with it.","attrs":{}}]}]}],"attrs":{}},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Configuring Task-Local Recovery","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Task-local recovery is disabled by default. It can be enabled through Flink's configuration (the key ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"state.backend.local-recovery","attrs":{}}],"attrs":{}},{"type":"text","text":", set to true to enable or false, the default, to disable; the key is defined in ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"CheckpointingOptions.LOCAL_RECOVERY","attrs":{}}],"attrs":{}},{"type":"text","text":").","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Allocation-preserving scheduling","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Task-local recovery assumes allocation-preserving Task scheduling under failures, which works as follows: each Task remembers its previously allocated Slot and requests exactly the same Slot to restart in during recovery. If that Slot is not available, the Task requests a new, fresh Slot from the ResourceManager.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If a TaskManager is no longer available, the Tasks that were assigned to it have to run on other TaskManagers, but they will not push other Tasks that can recover in their original Slot out of position. With this strategy, as many Tasks as possible restart in their original Slot and restore their State locally.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"horizontalrule","attrs":{}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"1","attrs":{}},{"type":"text","text":"] ","attrs":{}},{"type":"link","attrs":{"href":"https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/state/large_state_tuning/","title":"","type":null},"content":[{"type":"text","text":"https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/state/large_state_tuning/ ","attrs":{}}]}]}
]}
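
The minimum-pause rule from the checkpoint tuning section can be illustrated with a small sketch that does not depend on Flink. It assumes a simplified model that is not Flink's actual scheduler: only one checkpoint runs at a time, and the next one starts at the later of (previous start + interval) and (previous end + minimum pause). The class `MinPauseSketch`, the method `triggerTimes`, and the sample durations are hypothetical names and values chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class MinPauseSketch {

    /**
     * Returns the start time (in ms) of each checkpoint under a simplified model:
     * one checkpoint at a time, and the next one starts at the later of
     * (previous start + interval) and (previous end + minimum pause).
     * This is an illustrative assumption, not Flink's real scheduling code.
     */
    public static List<Long> triggerTimes(long intervalMs, long minPauseMs, long[] durationsMs) {
        List<Long> starts = new ArrayList<>();
        long start = 0L;
        for (long duration : durationsMs) {
            starts.add(start);
            long end = start + duration;
            start = Math.max(start + intervalMs, end + minPauseMs);
        }
        return starts;
    }

    public static void main(String[] args) {
        // Hypothetical checkpoint durations; two of them exceed the 1000 ms interval.
        long[] durations = {400L, 1500L, 1500L, 300L};

        // No minimum pause: slow checkpoints are followed immediately by the next one.
        System.out.println(triggerTimes(1000L, 0L, durations));   // [0, 1000, 2500, 4000]

        // 500 ms minimum pause: at least 500 ms of plain processing after each checkpoint.
        System.out.println(triggerTimes(1000L, 500L, durations)); // [0, 1000, 3000, 5000]
    }
}
```

With a 1000 ms interval and these durations, a minimum pause of 0 lets the third and fourth checkpoints start the moment their predecessor ends, whereas a 500 ms pause leaves at least 500 ms of checkpoint-free processing after each one.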
