HDFS File Health Checks

Source: HDFS DataNode Scanners and Disk Checker Explained

What follows is a brief translation of selected passages only; see the original English article for details.

Basic concepts

A file consists of multiple blocks, and each block has one or more replicas.

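Blocks are stored on the local disks of each machine. Each block also has an accompanying blk_xxx.meta file, which holds the CRC checksum information, among other things.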
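To make the role of the checksum data concrete, here is a minimal sketch in Python. It assumes a 512-byte chunk size and plain CRC32 from zlib, and the function names are hypothetical; the actual binary layout of a blk_xxx.meta file is not reproduced here.

```python
import zlib
from typing import List, Optional

CHUNK_SIZE = 512  # assumed bytes per checksum chunk, for illustration only


def compute_checksums(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[int]:
    """Return one CRC32 value per chunk of the block data."""
    return [
        zlib.crc32(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def first_corrupt_chunk(data: bytes, expected: List[int],
                        chunk_size: int = CHUNK_SIZE) -> Optional[int]:
    """Return the index of the first chunk that fails verification, or None."""
    actual = compute_checksums(data, chunk_size)
    for index, (got, want) in enumerate(zip(actual, expected)):
        if got != want:
            return index
    if len(actual) != len(expected):
        # The data is shorter or longer than the stored checksums cover.
        return min(len(actual), len(expected))
    return None
```

Storing one checksum per chunk rather than one per block means a mismatch can be narrowed down to a small region of the block.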

 

This article sets out to answer the following questions:

When does a DataNode check its blocks, and how does it do so?

How does a DataNode keep its in-memory metadata consistent with what is on the local disk?

If a block read fails, is it because of a disk error, or because of some other intermittent error (for example, a network interruption)?

 

Block Scanner & Volume Scanner

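Each DataNode has one block scanner, and a block scanner contains multiple volume scanners, each of which scans one disk. The volume scanners run in separate threads. A volume scanner has to read all of the data on its disk and verify every block; this is called a regular scan. Because the data is actually read, this is an I/O-heavy operation, so a throttler limits the read rate.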
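The rate limiting described above can be sketched as follows. This is a simplified illustration in Python, not the DataNode's actual throttler: the class and function names are hypothetical, and the clock and sleep functions are injectable only so that the behaviour can be checked without waiting.

```python
import time


class Throttler:
    """Keeps the average read rate at or below bytes_per_second."""

    def __init__(self, bytes_per_second, clock=time.monotonic, sleep=time.sleep):
        self.bytes_per_second = bytes_per_second
        self.clock = clock
        self.sleep = sleep
        self.start = clock()
        self.total_bytes = 0

    def throttle(self, num_bytes):
        """Record num_bytes as read and sleep if we are ahead of the allowed rate."""
        self.total_bytes += num_bytes
        earliest_allowed = self.start + self.total_bytes / self.bytes_per_second
        delay = earliest_allowed - self.clock()
        if delay > 0:
            self.sleep(delay)
        return max(delay, 0.0)


def scan_volume(block_sizes, throttler):
    """Pretend to read every block on one volume, throttling after each read."""
    scanned = 0
    for size in block_sizes:
        # A real scanner would read the block and verify its checksums here.
        throttler.throttle(size)
        scanned += 1
    return scanned
```

With a limit of 1 MiB per second, scanning three 1 MiB blocks in this sketch takes about three seconds regardless of how fast the disk is.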

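Besides regular scans, each volume scanner also maintains a list of suspicious blocks. A block is added to this list when a read or write error occurs on it (whether the request came from a client or from another DataNode) and the error is not a network error. The volume scanner checks these blocks first.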

Each volume has a block cursor that records scan progress, so after a DataNode restart the scan can pick up where it left off.
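The interplay between the suspicious-block list and the cursor can be sketched like this. It is a minimal Python model under stated assumptions: the class name, the JSON format for the saved cursor, and the use of a list index as the cursor are all hypothetical and chosen for brevity.

```python
import json
from collections import deque


class VolumeScanState:
    """Tracks suspicious blocks and a resumable cursor for one volume."""

    def __init__(self, block_ids, cursor=0):
        self.block_ids = list(block_ids)  # all blocks on the volume, in scan order
        self.cursor = cursor              # index of the next block for the regular scan
        self.suspicious = deque()         # blocks to check before the regular scan

    def mark_suspicious(self, block_id, is_network_error=False):
        """Queue a block after an I/O error, ignoring network errors."""
        if not is_network_error and block_id not in self.suspicious:
            self.suspicious.append(block_id)

    def next_block(self):
        """Return the next block to verify, taking suspicious blocks first."""
        if self.suspicious:
            return self.suspicious.popleft()
        if not self.block_ids:
            return None
        block_id = self.block_ids[self.cursor]
        self.cursor = (self.cursor + 1) % len(self.block_ids)
        return block_id

    def save_cursor(self):
        """Serialize the cursor so that a restart can resume from it."""
        return json.dumps({"cursor": self.cursor})

    @classmethod
    def load(cls, block_ids, saved):
        """Rebuild the state after a restart from a previously saved cursor."""
        return cls(block_ids, cursor=json.loads(saved)["cursor"])
```

Note that checking a suspicious block does not move the cursor, so the regular scan continues from the same position afterwards.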

 

There are two relevant parameters in hdfs-site.xml:

dfs.block.scanner.volume.bytes.per.second: the maximum number of bytes scanned per second. Default value is 1M (1048576 bytes). Setting this to 0 will disable the block scanner.

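dfs.datanode.scan.period.hours: the interval between scans. If a scan finishes early, the scanner waits out the rest of the period; if it has not finished when the period ends, it simply keeps going until it is done. Default value is 3 weeks (504 hours). Setting this to 0 will use the default value. Setting this to a negative value will disable the block scanner.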
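The special values of these two parameters, and how the rate limit relates to the scan period, can be worked through in a few lines of Python. The function names are hypothetical, and the sketch only encodes the rules stated above; it does not model how other values (such as a negative rate) are treated.

```python
DEFAULT_SCAN_PERIOD_HOURS = 504          # 3 weeks, as stated above
DEFAULT_BYTES_PER_SECOND = 1024 * 1024   # 1M, as stated above


def block_scanner_enabled(bytes_per_second, scan_period_hours):
    """Apply the two disable rules: a rate of 0, or a negative scan period."""
    return bytes_per_second != 0 and scan_period_hours >= 0


def effective_scan_period_hours(scan_period_hours):
    """A configured value of 0 falls back to the default period."""
    if scan_period_hours == 0:
        return DEFAULT_SCAN_PERIOD_HOURS
    return scan_period_hours


def hours_to_scan(volume_bytes, bytes_per_second=DEFAULT_BYTES_PER_SECOND):
    """Lower bound on the time needed to read a whole volume at the throttled rate."""
    return volume_bytes / bytes_per_second / 3600
```

At the default rate, 1 TiB of block data needs at least 1048576 seconds, or about 291 hours, which fits inside the 504-hour period. 4 TiB needs about 1165 hours, so in that case the scan runs past the period and keeps going until it finishes.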

 

Directory Scanners

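The directory scanner checks whether the DataNode's in-memory metadata matches what is actually stored on disk. It mainly checks whether the block file exists, whether the meta file exists, and whether the file size matches the size recorded in memory.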
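The three checks above amount to a comparison between two views of the same blocks. Here is a minimal Python sketch of that comparison; the data structures and names are hypothetical, and a real directory scanner also repairs the in-memory state rather than only listing differences.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class OnDiskReplica:
    block_length: Optional[int]  # None if the block file is missing
    has_meta: bool               # whether the blk_xxx.meta file exists


def find_mismatches(in_memory: Dict[str, int],
                    on_disk: Dict[str, OnDiskReplica]) -> List[str]:
    """Compare in-memory block lengths with what was found on disk."""
    problems = []
    for block_id, expected_length in sorted(in_memory.items()):
        replica = on_disk.get(block_id)
        if replica is None or replica.block_length is None:
            problems.append(f"{block_id}: block file missing on disk")
        elif not replica.has_meta:
            problems.append(f"{block_id}: meta file missing on disk")
        elif replica.block_length != expected_length:
            problems.append(
                f"{block_id}: length {replica.block_length} != {expected_length}"
            )
    # Blocks that exist on disk but are unknown to the in-memory map.
    for block_id in sorted(set(on_disk) - set(in_memory)):
        problems.append(f"{block_id}: on disk but not in memory")
    return problems
```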

If a block is marked as corrupted, it is reported to the NameNode through a block report, and the NameNode arranges for it to be re-replicated from other healthy replicas.

 

Configuration parameters:

dfs.datanode.directoryscan.throttle.limit.ms.per.sec controls how many milliseconds per second a thread should run. Note that this limit is taken per thread, not an aggregated value for all threads. Default value is 1000, meaning the throttling is disabled. Only values between 1 and 1000 are valid.

dfs.datanode.directoryscan.threads controls the maximum number of threads a directory scanner can have in parallel. Default value is 1.

dfs.datanode.directoryscan.interval controls the interval, in seconds, that the directory scanner thread runs. Setting this to a negative value disables the directory scanner. Default value is 6 hours (21600 seconds).
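As a rough illustration of the per-thread throttle limit, the following Python sketch splits each second into a run portion and a pause portion. The function names are hypothetical, and raising an error for out-of-range values is a choice made for this sketch only; how the DataNode itself reacts to an invalid value is not covered here.

```python
def validate_throttle_limit(limit_ms_per_sec):
    """Accept only values from 1 to 1000, per the description above."""
    if not 1 <= limit_ms_per_sec <= 1000:
        raise ValueError("throttle limit must be between 1 and 1000 ms per second")
    return limit_ms_per_sec


def run_and_pause_ms(limit_ms_per_sec):
    """Split each second into (run, pause) milliseconds for one scanner thread."""
    limit = validate_throttle_limit(limit_ms_per_sec)
    return limit, 1000 - limit
```

With the default of 1000 the pause is 0 ms, which is why that value means throttling is disabled. Because the limit applies per thread, two threads at 250 ms each may together run for up to 500 ms of every second.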

 

Disk Checker

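The disk checker mainly checks whether the directory exists, whether subdirectories can be created, whether the location path is a directory, and whether the directory has read, write, and execute permissions.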
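These checks are simple enough to sketch with the Python standard library. The function below is a hypothetical stand-in, not the DataNode's disk checker: it tests existence, directory type, and the three permissions, then creates and removes a scratch subdirectory to confirm that subdirectories can be created.

```python
import os
import tempfile


def check_dir(path):
    """Return a list of problems found for one storage directory (empty if healthy)."""
    if not os.path.exists(path):
        return [f"{path}: does not exist"]
    if not os.path.isdir(path):
        return [f"{path}: is not a directory"]
    problems = []
    for flag, name in ((os.R_OK, "read"), (os.W_OK, "write"), (os.X_OK, "execute")):
        if not os.access(path, flag):
            problems.append(f"{path}: no {name} permission")
    if not problems:
        try:
            # Creating and removing a scratch subdirectory exercises a real write.
            os.rmdir(tempfile.mkdtemp(dir=path))
        except OSError as error:
            problems.append(f"{path}: cannot create subdirectory ({error})")
    return problems
```

A call such as check_dir("/data/1/dfs/dn") (an example path, not taken from the article) returns an empty list for a healthy directory.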

 

When the check runs:

While block scanners and directory scanners are activated on DataNode startup and scans periodically, the disk checker only runs on-demand, with the disk checker thread lazily created. Specifically, the disk checker only runs if an IOException is caught on the DataNode during regular I/O operations (e.g. closing a block or metadata file, directory scanners reporting an error, etc.).

 

How a bad volume is handled:

The reason to have the disk checker is that, if something goes wrong at the volume level, HDFS should detect it and stop trying to write to that volume. On the other hand, removing a volume is non-trivial and has wide impacts, because it will make all the blocks on that volume inaccessible, and HDFS has to handle all the under-replicated blocks due to the removal. Therefore, disk checker performs the most basic checks, with a very conservative logic to consider a failure.

 

Summary:

Volume scanners verify the block data on disk. A DataNode is made up of multiple disks, and each disk gets its own volume scanner.

Directory scanners keep the in-memory metadata consistent with the data on disk.

The disk checker handles disk health checks that lean toward the hardware side.

 

Reference: <http://fatkun.com/2017/07/hdfs-health-check.html>
