HDFS HA and YARN HA (High Availability)

HA (High Availability) is an effective way to keep a service running without interruption. An HA setup generally has two or more nodes, split into an active node (Active) and standby nodes (Standby). The node currently serving the workload is the active node, and a node that acts as its backup is a standby node. When the active node fails and the running workload (tasks) can no longer proceed normally, the standby node detects the failure and immediately takes over, so the service sees no interruption or only a brief one.

In Hadoop 1.x the NameNode (NN) was the single point of failure of an HDFS cluster: each cluster had exactly one NN, and if that machine or process became unavailable, the whole cluster was unusable. A number of HDFS HA solutions appeared to address this (for example Linux HA, VMware FT, shared NAS+NFS, BookKeeper, QJM/Quorum Journal Manager, and BackupNode).

Whatever the concrete implementation, the HA framework follows the same flow. The approaches differ in how they store, manage, and synchronize the edits log files.

The Active NN and the Standby NN need a shared place to store the log. The Active NN writes the edit log to this shared storage, and the Standby NN reads the log and replays it, which keeps the HDFS metadata held in memory by the two NameNodes in sync. When a failover happens, the Standby NN can take over the Active NN's work quickly.

9.1 NameNode HA Overview

Starting with Hadoop 2.x, Cloudera proposed QJM (Quorum Journal Manager), an HDFS HA solution based on the Paxos algorithm (a distributed consensus algorithm). Its main advantages are:

No additional highly available shared storage has to be set up, which lowers complexity and maintenance cost.

It eliminates the SPOF (single point of failure).

The degree of robustness is configurable and scalable.

The basic idea is to store the EditLog on 2N+1 JournalNodes (JN). A write is considered successful once at least N+1 JournalNodes acknowledge it, and from that point the data will not be lost. The scheme tolerates at most N failed machines; if more than N fail, it stops working.

In the HA architecture the SecondaryNameNode no longer exists. To keep the Standby NN's metadata continuously consistent with the Active NN's, the two synchronize their operations through the JournalNodes.

Whenever a modification is executed on the Active NN, the change is also logged to more than half of the JournalNodes. When the Standby NN sees that the log on the JNs has changed, it reads the new edits from the JNs and applies them to its own namespace image (directory tree). The figure accompanying this section shows the flow.

When a failure occurs and the Active NN goes down, the Standby NN reads all outstanding edits from the JNs before it becomes the Active NN. This reliably brings its namespace image in line with that of the failed NN, so it can take over seamlessly and serve client requests, which is what makes the service highly available.

In HA mode, a DataNode (DN) must ensure that at any moment exactly one NN can issue commands to it. To achieve this:

Each time an NN changes state, it sends the DN its state and a sequence number.

The DN keeps track of this sequence number while running. On failover, the new NN replies to the DN's heartbeat with its active state and a larger sequence number. When the DN receives this reply, it treats that NN as the new active.

If the original active NN recovers at this point, the heartbeat reply it sends to the DN carries the active state and the old sequence number, so the DN rejects that NN's commands.

Failover Controller:

In HA mode, a FailoverController is deployed on every NameNode host as a separate process that monitors the NN's health. The FailoverController consists of three main components:

HealthMonitor: checks whether the NameNode is unavailable or unhealthy. It currently does this by calling the corresponding NN methods over RPC.

ActiveStandbyElector: monitors the NN's state in ZooKeeper (ZK).

ZKFailoverController: subscribes to events from HealthMonitor and ActiveStandbyElector and manages the NN's state. The ZKFC is also responsible for fencing, that is, for preventing split-brain.

All three components run in one JVM. That JVM is on the same machine as the NN's JVM, but the two are separate processes. A typical HA cluster consists of two NNs, each with its own ZKFC process.
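The two numeric rules described above, the JournalNode write quorum (N+1 acknowledgements out of 2N+1 nodes) and the DataNode's sequence-number check, can be sketched in a few lines of Python. This is a simplified model for illustration only, not Hadoop's actual code: the names `write_quorum`, `edit_is_committed`, `DataNodeView`, and `accepts_commands_from` are hypothetical, and the sketch assumes that an NN repeating the current highest sequence number is still the accepted active.

```python
from dataclasses import dataclass


def write_quorum(total_journal_nodes: int) -> int:
    """Smallest number of acknowledgements that forms a majority.

    With 2N+1 JournalNodes this evaluates to N+1.
    """
    return total_journal_nodes // 2 + 1


def edit_is_committed(acks: int, total_journal_nodes: int) -> bool:
    """An edit counts as written once a majority of JournalNodes acked it."""
    return acks >= write_quorum(total_journal_nodes)


@dataclass
class DataNodeView:
    """Hypothetical model of the sequence number a DataNode remembers."""

    highest_seen: int = 0

    def accepts_commands_from(self, claimed_sequence: int) -> bool:
        """Decide whether an NN claiming to be active should be obeyed.

        A claim carrying a sequence number lower than the highest one
        seen so far comes from a stale (previously active) NN and is
        rejected. Otherwise the claim is accepted and remembered.
        """
        if claimed_sequence < self.highest_seen:
            return False
        self.highest_seen = claimed_sequence
        return True
```

For example, with 5 JournalNodes `write_quorum(5)` is 3, so a write acknowledged by only 2 nodes is not yet committed. On the DataNode side, after an NN with sequence number 2 has been accepted, a recovered NN still presenting sequence number 1 is refused.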