Hadoop 2.4.0: resolving a distcp checksum problem

hadoop distcp hftp://nn.xxx.xx.com:50070/user/nlp/warehouse/t_m_user_key_action    /user/nlp/warehouse/dw1

The following error appears:

   Caused by: java.io.IOException: Check-sum mismatch between hftp://xxx:50070/foo/yyy.yy and hdfs://dst:8020/foo/xxx.xx



Quote:

— Distcp using MRv2 (YARN) from a CDH3 cluster to a CDH4 cluster may fail with CRC mismatch errors

Running distcp on a CDH4 YARN cluster with a CDH3 hftp source will fail if the CRC checksum type being used is the CDH4 default (CRC32C). This is because the default checksum type was changed in CDH4 from the CDH3 default of CRC32.

Bug: HADOOP-8060
Severity: Medium
Anticipated Resolution: To be fixed in an upcoming release
Workaround: You can work around this issue by changing the CRC checksum type on the CDH4 cluster to the CDH3 default, CRC32. To do this set dfs.checksum.type to CRC32 in hdfs-site.xml.


Add the following to hdfs-site.xml:

   <property>
     <name>dfs.checksum.type</name>
     <value>CRC32</value>
   </property>
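
If changing hdfs-site.xml for the whole cluster is not desirable, the same property can probably be supplied for a single job through Hadoop's generic -D option, since dfs.checksum.type is read on the client side when files are written. The sketch below only prints such a command rather than running it. build_distcp_cmd is a hypothetical helper name, the paths are the ones from the command above, and whether the override takes effect on your version should be verified on a small directory first.

```shell
#!/bin/sh
# Hypothetical helper: prints a distcp command that overrides the checksum
# type for this one job instead of editing hdfs-site.xml.
# Assumption: the -D override reaches the copy tasks on your Hadoop version.
build_distcp_cmd() {
    src="$1"
    dst="$2"
    # -Dname=value sets a Hadoop configuration property for this invocation.
    printf 'hadoop distcp -Ddfs.checksum.type=CRC32 %s %s\n' "$src" "$dst"
}

# Print the command for the paths used in this post.
build_distcp_cmd \
    "hftp://nn.xxx.xx.com:50070/user/nlp/warehouse/t_m_user_key_action" \
    "/user/nlp/warehouse/dw1"
```

Copy the printed line into a shell on the destination cluster to run it. Files written to the destination before the change keep their original checksum type and may need to be copied again.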


Note: the cluster on which the command is run must have hosts entries for all nodes of the other cluster.
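
A quick way to check this requirement before starting the copy is to test whether each hostname of the other cluster resolves on the machine running distcp. This is a minimal sketch, assuming getent is available. check_hosts is a hypothetical helper, and nn.xxx.xx.com is the placeholder NameNode name from the command above. The DataNode hostnames should be added to the list as well, because hftp reads are redirected to the DataNodes.

```shell
#!/bin/sh
# Hypothetical helper: reports which of the given hostnames do not resolve
# on this machine (through /etc/hosts or DNS, depending on nsswitch.conf).
# Returns 0 if every name resolves, 1 otherwise.
check_hosts() {
    missing=0
    for h in "$@"; do
        if getent hosts "$h" >/dev/null 2>&1; then
            echo "ok       $h"
        else
            echo "MISSING  $h"
            missing=1
        fi
    done
    return "$missing"
}

# Placeholder name; replace with the other cluster's NameNode and DataNodes.
check_hosts nn.xxx.xx.com || echo "add the missing hosts to /etc/hosts"
```

The same check is worth running on every node of the cluster that executes the job, since the copy tasks run on the worker nodes rather than on the machine that submits the command.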
