HDFS File Health Checks

Source: HDFS DataNode Scanners and Disk Checker Explained

What follows is a brief summary of parts of that article; see the English original for details.

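Basic concepts

A file consists of one or more blocks, and each block has one or more replicas.

Blocks are stored on the local disks of each machine. Every block file is accompanied by a blk_xxx.meta file, which holds the CRC checksums among other things.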
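
To make the role of the checksums concrete, here is a minimal sketch of chunk-wise CRC verification. It is not the HDFS on-disk format: the real .meta file also carries a header, and the checksum algorithm and chunk size are configurable. The 512-byte chunk size, the use of zlib.crc32, and the function names are all assumptions made for illustration.

```python
import zlib
from typing import List, Optional

CHUNK = 512  # bytes covered by each checksum; an assumed value for this sketch


def make_checksums(data: bytes, chunk: int = CHUNK) -> List[int]:
    """Return one CRC32 per fixed-size chunk of the block data."""
    return [zlib.crc32(data[i:i + chunk]) for i in range(0, len(data), chunk)]


def first_bad_chunk(data: bytes, checksums: List[int], chunk: int = CHUNK) -> Optional[int]:
    """Return the index of the first chunk that fails verification, or None if all pass."""
    actual = make_checksums(data, chunk)
    for idx, (got, want) in enumerate(zip(actual, checksums)):
        if got != want:
            return idx
    if len(actual) != len(checksums):
        # Data and checksum list cover a different number of chunks.
        return min(len(actual), len(checksums))
    return None
```

Because each checksum covers only one chunk, a mismatch pinpoints the damaged region instead of merely flagging the whole block.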

 

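The article sets out to answer the following questions

When does a DataNode check its blocks, and how is the check performed?

How does a DataNode keep its in-memory metadata consistent with what is on the local disk?

If a block read fails, is it because of a disk error, or because of some other intermittent error (for example, a network interruption)?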

 

Block Scanner & Volume Scanner

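Each DataNode has one block scanner. A block scanner contains multiple volume scanners, and each volume scanner scans one disk (volume) in its own thread. A volume scanner has to read all the data on its disk and verify every block; this is called a regular scan. Because the data is actually read, this is an I/O-heavy operation, so it is throttled by a rate limiter.

Besides regular scans, each volume scanner also maintains a list of suspicious blocks. A block is added to this list when an I/O error occurs while it is being served (whether to a client or to another DataNode) and the error is not a network error. The volume scanner checks these blocks first.

Each volume has a block cursor that records the scan progress, so a restarted DataNode can pick up where it left off.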
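
The scan order described above can be sketched as a toy model. This is not Hadoop's implementation: the class name, the in-memory list of block IDs, and the integer cursor are hypothetical and only illustrate "suspicious blocks first, otherwise resume from the cursor".

```python
from collections import deque
from typing import Deque, List, Optional


class VolumeScanOrder:
    """Toy model of how one volume scanner picks its next block (hypothetical class)."""

    def __init__(self, block_ids: List[str], cursor: int = 0) -> None:
        self.block_ids = block_ids            # blocks on this volume, in scan order
        self.cursor = cursor                  # index of the next block for the regular scan
        self.suspicious: Deque[str] = deque() # blocks waiting for a priority check

    def mark_suspicious(self, block_id: str) -> None:
        """Queue a block for a priority check, ignoring duplicates."""
        if block_id not in self.suspicious:
            self.suspicious.append(block_id)

    def next_block(self) -> Optional[str]:
        """Suspicious blocks come first; otherwise continue the regular scan."""
        if self.suspicious:
            return self.suspicious.popleft()
        if self.cursor >= len(self.block_ids):
            return None                       # regular scan of this volume is complete
        block_id = self.block_ids[self.cursor]
        self.cursor += 1
        return block_id
```

Saving the cursor value and passing it back into the constructor stands in for what lets a restarted DataNode continue rather than start over.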

 

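Two parameters in hdfs-site.xml control this

dfs.block.scanner.volume.bytes.per.second: the maximum number of bytes scanned per second. Default value is 1 MB (1048576 bytes). Setting this to 0 will disable the block scanner.

dfs.datanode.scan.period.hours: the interval between scans. If a scan finishes early, the scanner waits out the rest of the period; if it has not finished when the period ends, it keeps going until it is done. Default value is 3 weeks (504 hours). Setting this to 0 will use the default value. Setting this to a negative value will disable the block scanner.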
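
How the two settings interact is easiest to see with a little arithmetic. The sketch below assumes the scanner always reads at exactly the configured limit and that the volume is completely full of block data; both are simplifications, and the function names are made up for this example.

```python
def full_scan_hours(volume_bytes: int, bytes_per_second: int) -> float:
    """Hours needed to read every byte of a volume at the given rate limit."""
    if bytes_per_second <= 0:
        raise ValueError("a rate of 0 disables the block scanner; nothing to compute")
    return volume_bytes / bytes_per_second / 3600


def fits_in_period(volume_bytes: int,
                   bytes_per_second: int = 1048576,
                   period_hours: int = 504) -> bool:
    """True if a full pass over the volume completes within one scan period."""
    return full_scan_hours(volume_bytes, bytes_per_second) <= period_hours
```

With the defaults, 1 TiB of data takes roughly 291 hours and fits comfortably inside the 504-hour period, while 4 TiB takes roughly 1165 hours, so the scanner would never idle and each block would be verified less often than every three weeks.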

 

Directory Scanners

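Directory scanners check whether the DataNode's in-memory metadata matches what is actually stored on disk: mainly whether the block file exists, whether the meta file exists, and whether the file size matches the size recorded in memory.

If a block is marked as corrupted, this is reported to the NameNode through a block report, and the NameNode arranges for the block to be re-replicated from another healthy replica.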
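
The three checks listed above amount to a diff between two views of the same replicas. Here is a minimal sketch under the assumption that both views have already been loaded into dictionaries keyed by block ID; the Replica type and diff_replicas function are hypothetical and do not mirror Hadoop's classes.

```python
from typing import Dict, List, NamedTuple


class Replica(NamedTuple):
    length: int      # block file size in bytes
    has_meta: bool   # whether the blk_xxx.meta file is present


def diff_replicas(memory: Dict[str, Replica], disk: Dict[str, Replica]) -> List[str]:
    """List the mismatches between the in-memory replica map and the on-disk listing."""
    problems = []
    for block_id in sorted(memory.keys() | disk.keys()):
        mem, on_disk = memory.get(block_id), disk.get(block_id)
        if on_disk is None:
            problems.append(f"{block_id}: in memory but block file missing on disk")
        elif mem is None:
            problems.append(f"{block_id}: on disk but unknown in memory")
        elif not on_disk.has_meta:
            problems.append(f"{block_id}: meta file missing")
        elif mem.length != on_disk.length:
            problems.append(
                f"{block_id}: length {on_disk.length} on disk, {mem.length} in memory"
            )
    return problems
```

Note that this comparison only looks at file presence and size; it never reads block contents, which is the job of the volume scanner.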

 

Configuration parameters:

dfs.datanode.directoryscan.throttle.limit.ms.per.sec controls how many milliseconds per second a thread should run. Note that this limit is taken per thread, not an aggregated value for all threads. Default value is 1000, meaning the throttling is disabled. Only values between 1 and 1000 are valid.

dfs.datanode.directoryscan.threads controls the maximum number of threads a directory scanner can have in parallel. Default value is 1.

dfs.datanode.directoryscan.interval controls the interval, in seconds, that the directory scanner thread runs. Setting this to a negative value disables the directory scanner. Default value is 6 hours (21600 seconds).

 

Disk Checker

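The disk checker mainly checks whether the directory exists (or can be created), whether the location path is actually a directory, and whether the directory has read, write, and execute permissions.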
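
These checks are deliberately basic, and an equivalent can be written in a few lines. The following is a Python approximation of the same idea, not a port of Hadoop's Java code; check_dir is a name chosen for this sketch.

```python
import os


def check_dir(path: str) -> None:
    """Raise OSError if the path fails any of the basic directory checks."""
    # Succeeds if the directory already exists or can be created;
    # raises OSError (for example FileExistsError) if a file is in the way.
    os.makedirs(path, exist_ok=True)
    if not os.path.isdir(path):
        raise NotADirectoryError(f"{path} is not a directory")
    for mode, name in ((os.R_OK, "readable"), (os.W_OK, "writable"), (os.X_OK, "executable")):
        if not os.access(path, mode):
            raise PermissionError(f"{path} is not {name}")
```

Nothing here reads or writes block data, which is why such a check is cheap enough to run whenever an I/O error raises suspicion about a volume.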

 

When it runs:

While block scanners and directory scanners are activated on DataNode startup and scan periodically, the disk checker only runs on-demand, with the disk checker thread lazily created. Specifically, the disk checker only runs if an IOException is caught on the DataNode during regular I/O operations (e.g. closing a block or metadata file, directory scanners reporting an error, etc.).

 

How a bad volume is handled:

The reason to have the disk checker is that, if something goes wrong at the volume level, HDFS should detect it and stop trying to write to that volume. On the other hand, removing a volume is non-trivial and has wide impacts, because it will make all the blocks on that volume inaccessible, and HDFS has to handle all the under-replicated blocks due to the removal. Therefore, disk checker performs the most basic checks, with a very conservative logic to consider a failure.

 

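Summary:

Volume scanners verify the block data on disk, one scanner per disk; a DataNode is made up of multiple disks.

Directory scanners keep the in-memory metadata consistent with the data on disk.

The disk checker handles disk health at the volume level, the checks closest to the hardware.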

 

Reference: <http://fatkun.com/2017/07/hdfs-health-check.html>
