Hadoop 0.1.0: How the NameNode Keeps Each Block at the Number of Replicas Specified in the Configuration File

     Reading through the Hadoop 0.1.0 code, I found that many things are handled quite cleverly, such as distributed file locking and the communication between DataNodes and the NameNode. This post describes the strategy the NameNode uses to maintain a block's replicas; the other topics will be written up later.

 

   Assume the replication factor is set to three in hadoop-site.xml. The NameNode will then maintain that number of replicas for every block.

 

   If a DataNode loses contact with the NameNode (no heartbeat) for sixty minutes, the NameNode (the metadata server) regards that DataNode as dead.
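A minimal sketch of this expiry check, with illustrative names (HeartbeatMonitor, EXPIRE_INTERVAL, lastHeartbeat are my own, not the 0.1.0 identifiers):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the NameNode's heartbeat-expiry bookkeeping.
public class HeartbeatMonitor {
    // The post says a DataNode is declared dead after sixty minutes
    // without a heartbeat; the constant name here is assumed.
    static final long EXPIRE_INTERVAL = 60L * 60 * 1000;

    // datanode name -> time of last heartbeat, in milliseconds
    final Map<String, Long> lastHeartbeat = new HashMap<>();

    void heartbeat(String datanode, long now) {
        lastHeartbeat.put(datanode, now);
    }

    // A node with no recorded heartbeat, or whose last heartbeat is
    // older than the expiry interval, is treated as dead.
    boolean isDead(String datanode, long now) {
        Long last = lastHeartbeat.get(datanode);
        return last == null || now - last > EXPIRE_INTERVAL;
    }
}
```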

 

(Inside the NameNode there is a HashMap<Block, TreeSet<DatanodeInfo>> that maps each Block to the DatanodeInfo objects holding a replica of it, and a HashMap<DataNode, TreeSet<Block>> that maps each DataNode to the Blocks it holds.)

 

When the dead DataNode is removed from HashMap<DataNode, TreeSet<Block>>, the NameNode obtains the set of Blocks it held, and for each of those Blocks it removes the corresponding DatanodeInfo entry from the TreeSet<DatanodeInfo> in HashMap<Block, TreeSet<DatanodeInfo>>.
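A sketch of the two maps and the dead-node cleanup. Types are simplified to String (datanode name) and Long (block id); the real code uses DatanodeInfo and Block objects, and the class and method names here are my own:

```java
import java.util.*;

// Sketch of the NameNode's two in-memory maps and dead-DataNode cleanup.
public class BlockMaps {
    // Block -> set of DataNodes holding a replica of it
    final Map<Long, TreeSet<String>> blockToNodes = new HashMap<>();
    // DataNode -> set of Blocks it holds
    final Map<String, TreeSet<Long>> nodeToBlocks = new HashMap<>();

    void addReplica(long block, String node) {
        blockToNodes.computeIfAbsent(block, b -> new TreeSet<>()).add(node);
        nodeToBlocks.computeIfAbsent(node, n -> new TreeSet<>()).add(block);
    }

    // Remove a dead DataNode: drop its entry from the node->blocks map,
    // then unmap it from every block it held. The affected blocks are
    // returned so the caller can re-check their replica counts.
    Set<Long> removeDatanode(String node) {
        Set<Long> blocks = nodeToBlocks.remove(node);
        if (blocks == null) return Collections.emptySet();
        for (long b : blocks) {
            TreeSet<String> holders = blockToNodes.get(b);
            if (holders != null) holders.remove(node);
        }
        return blocks;
    }
}
```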

 

During this processing, the NameNode checks the number of remaining DatanodeInfo objects for each affected Block. If that number is less than the replication factor configured in hadoop-site.xml, the Block is put into a TreeSet named 'neededReplications'.
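The under-replication check itself is simple; a sketch follows, assuming a replication factor of three and using a plain Long for the block id (only the 'neededReplications' name comes from the source):

```java
import java.util.TreeSet;

// Sketch of the under-replication check: each affected block is
// compared against the configured replication factor.
public class ReplicationChecker {
    final int desiredReplication = 3;             // from hadoop-site.xml
    final TreeSet<Long> neededReplications = new TreeSet<>();

    void check(long block, int liveReplicas) {
        if (liveReplicas < desiredReplication) {
            neededReplications.add(block);        // schedule re-replication
        }
    }
}
```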

 

 

When the NameNode later receives a heartbeat from another DataNode that holds one of those Blocks, it chooses a set of available DataNodes that can store new replicas, sends that list back to the DataNode, and commands it to copy the data of that Block to those targets.
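The target-selection step can be sketched as picking live nodes that do not already hold the block; the class and method names are illustrative, not the 0.1.0 API:

```java
import java.util.*;

// Sketch of choosing target DataNodes for re-replication: any live
// node that does not already hold the block is a candidate, up to
// the number of missing replicas.
public class TransferPlanner {
    static List<String> chooseTargets(Set<String> liveNodes,
                                      Set<String> holders,
                                      int needed) {
        List<String> targets = new ArrayList<>();
        for (String n : liveNodes) {
            if (targets.size() == needed) break;
            if (!holders.contains(n)) targets.add(n);
        }
        return targets;
    }
}
```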

 

After that, the target DataNodes report back to the NameNode which blocks they now hold, and the NameNode re-checks the replica count of each block.
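This reporting step can be sketched as follows, again with simplified types and illustrative names (the real block report carries Block objects over the DataNode protocol):

```java
import java.util.*;

// Sketch of block-report handling: a DataNode reports the blocks it
// holds, the NameNode refreshes its block->nodes map, and the
// resulting replica count per block is returned for re-checking.
public class BlockReportHandler {
    final Map<Long, TreeSet<String>> blockToNodes = new HashMap<>();

    Map<Long, Integer> processReport(String node, List<Long> report) {
        Map<Long, Integer> counts = new HashMap<>();
        for (long b : report) {
            TreeSet<String> holders =
                blockToNodes.computeIfAbsent(b, k -> new TreeSet<>());
            holders.add(node);
            counts.put(b, holders.size());   // replica count after update
        }
        return counts;
    }
}
```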

 

-----------------------------

Notes:

There are five methods in the protocol between the NameNode and the DataNode.

 

Through this protocol, the NameNode can command a DataNode to remove or transfer (copy) the data (files in its local file system) mapped to specific blocks.

 

 
