The Foundation of Distributed Architecture: Consensus Algorithms Explained Simply

{"type":"doc","content":[{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic","attrs":{}}],"text":"Introductions to distributed algorithms are plentiful, but most are either too heavy on academic proofs or too shallow. In this article I try to explain how consensus algorithms work in a way that readers with almost no background can follow.","attrs":{}}]}],"attrs":{}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"The Dilemma of Distributed Services","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Consider a common e-commerce scenario: closing timed-out orders automatically. If an order is not paid within X hours of being placed, it is closed and its inventory is released. This needs a timer that periodically triggers the relevant business operation. For high availability the timer must be deployed as multiple instances, yet each order may be triggered only once. There are several ways to meet this requirement; the most common is cluster leader election, where a leader is elected per instance or per order group and is responsible for triggering specific orders. Leader election applies to many scenarios, so we can try to abstract it into a standalone service.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/3f/3fd7108ca6b0fca74fe7ac83fb9ac647.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As the figure above shows, the implementation is very simple:","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Create a ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"leader election service","attrs":{}}],"attrs":{}},{"type":"text","text":" that uses CAS to atomically set a variable ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"leader","attrs":{}}],"attrs":{}},{"type":"text","text":" to the Id of the corresponding instance. ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"leader","attrs":{}}],"attrs":{}},{"type":"text","text":" 
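expires automatically after a fixed time-to-live, so that an unreachable service instance cannot leave the system without a usable leader","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Multiple order service instances periodically send requests to the leader election service, each hoping to make itself the leader","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the leader election service, if ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"leader","attrs":{}}],"attrs":{}},{"type":"text","text":" already has a value, that value is returned directly; otherwise, first come, first served, ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"leader","attrs":{}}],"attrs":{}},{"type":"text","text":" is set to the Id of the requesting instance and returned","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"But the problem with this design is equally obvious: the “leader election service” is a single point of failure, and if that one node goes down the service becomes unavailable. So can we run multiple instances?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/8c/8cff7aa4bfcae421b2e2cf215a438827.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The figure above shows two ways to scale out to multiple instances.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In option 1, different order service instances are routed at random to different leader election service instances, and the leader election service instances synchronize data among themselves. But how? We could of course use a database, Redis, or similar middleware, but then the service is no longer self-contained; we want it to have no dependency on third-party services or middleware.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In option 2, each order service instance sends its request to every leader election service instance, so the service as a whole stays available as long as one leader election service instance is alive. The problem is that sending and receiving requests involves network latency, so different leader election service instances may receive the requests in different orders and therefore cannot arrive at a single result. This is exactly the 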
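thorniest problem we face: consistency.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Consensus Algorithms","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Distributed architecture draws on many areas of knowledge, but if we dig down to its foundation, we arrive at ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"consistency (consensus) algorithms","attrs":{}}],"attrs":{}},{"type":"text","text":". They underpin distributed architecture and provide the core guarantee that makes scaling nodes in and out possible.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"How to elect a leader among multiple instances, how to store multiple replicas of data, how to design a distributed lock, how to generate global IDs: all of these rest on a consistency guarantee.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Consensus algorithms are notoriously hard to understand. A generation of engineers has tried all kinds of ways to explain them “in simple terms”, but very few explanations have really caught on. In what follows we will not begin with the families of consensus algorithms or with rigorous derivations; we start from the problem and go deeper step by step.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Recall the problem from above: network latency means that instances may receive requests in different orders. Could we tackle it by guaranteeing ordering? For example, we could send every request to a message queue (MQ) first and let the MQ dispatch them. That would indeed work, but the MQ itself needs consensus support, which makes this a chicken-and-egg problem. So we cannot solve the consistency problem by demanding strictly ordered messages.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Let us roughly summarize the problem that distributed data consistency has to solve:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Determine the value of a variable across multiple instances; once determined, the value can only be read, never modified.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For example, the leader election above exists to determine the value of the ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"leader","attrs":{}}],"attrs":{}},{"type":"text","text":" variable. “Determined” means that within a given period (for example, one election term) the leader election service must output a single leader; that is, its instances must reach consensus on the value of ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"leader","attrs":{}}],"attrs":{}},{"type":"text","text":" (within the same period it must not happen that order service inst1 gets 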
","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"leader = inst1","attrs":{}}],"attrs":{}},{"type":"text","text":" while order service inst2 gets ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"leader = inst2","attrs":{}}],"attrs":{}},{"type":"text","text":"), and the value determined within that period must not be changed (once ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"leader = inst1","attrs":{}}],"attrs":{}},{"type":"text","text":" has been determined for the period, an election started by order service inst3 must not change the value of ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"leader","attrs":{}}],"attrs":{}},{"type":"text","text":").","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We treat the process of determining the variable's value as “voting”. To keep the terminology consistent from here on, we start with two simple definitions:","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Proposer: the requester that initiates a proposal, for example each order service instance above","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Acceptor: 
the voter, which votes on proposals, for example each leader election service instance above","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Now let us think through the implementation difficulties that this voting process raises:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Difficulty 1: multiple Proposers send requests concurrently, so the order in which votes are received cannot be guaranteed","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The most direct way to address this is locking, as shown below:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/c3/c31a05c3a25599b3884f83f4142db96c.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We split what used to be a single step into two steps:","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Prepare phase: the Proposer sends a write-lock request to all Acceptors; each Acceptor replies with success or failure (already locked)","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Vote phase: once the Proposer has received lock success from all Acceptors, it initiates the vote; each Acceptor agrees to the vote result, which forms the determined value; each Acceptor then releases its lock","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Setting efficiency aside for now, this does solve the ordering problem, but requiring every Acceptor to respond is clearly unreasonable.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Difficulty 2: staying available when some Acceptors fail","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To solve this we only need to change the lock-success condition to “more than half”, as shown below:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/d4/d4fe584e3ce9c73e8f2b7ed7ce585143.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This gives the service the usual 2F+1 fault tolerance: a service with 2F+1 Acceptor instances tolerates at most F Acceptor instances failing at the same time.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If Acceptors are allowed to fail, Proposers should be too. But in the algorithm above, if a Proposer instance fails after acquiring the lock, the result is a deadlock and the service becomes unavailable.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Admittedly we could give the lock an expiry time (specified by the Proposer for each lock, so that all Acceptors release it at the same moment), but both the expiry and locking itself take a considerable toll on availability and performance.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Difficulty 3: staying available when any Proposer fails","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At this point the lock-based approach has run into serious trouble, but we can build on the two-phase voting scheme above and improve it.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Once we give up locking, we have to face the ordering problem caused by concurrent requests head-on. The first idea that comes to mind is to attach a timestamp to each proposal so that earlier and later proposals can be told apart.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Let us first analyze the case with a single Acceptor, as shown below:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/80/804e5b5fa42780d6dab19c3d40a18cb2.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For Proposer1 the flow is as follows:","attrs":{}}]},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Proposer1 sends a proposal request with proposal number ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"n1","attrs":{}}],"attrs":{}},{"type":"text","text":"; proposal numbers must be ordered and increasing, and are often built from a timestamp","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"The Acceptor receives the proposal request, sets its ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"maxN","attrs":{}}],"attrs":{}},{"type":"text","text":" (the largest proposal number it has received) to ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"n1","attrs":{}}],"attrs":{}},{"type":"text","text":", and promises not to accept a number less than or equal to ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"maxN","attrs":{}}],"attrs":{}},{"type":"text","text":" 
in any later proposal request","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"The Acceptor replies that the proposal request is granted","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"Proposer1 formally submits the proposal, consisting of the earlier proposal number and the proposed value","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"The Acceptor receives the proposal, updates its ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"acceptN","attrs":{}}],"attrs":{}},{"type":"text","text":" (accepted proposal number) to ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"n1","attrs":{}}],"attrs":{}},{"type":"text","text":" and sets its ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"acceptV","attrs":{}}],"attrs":{}},{"type":"text","text":" (accepted proposal value, i.e. the determined value) to ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"v1","attrs":{}}],"attrs":{}},{"type":"text","text":", and promises not to process any proposal numbered below ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"maxN","attrs":{}}],"attrs":{}},{"type":"text","text":" from then on","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":6,"align":null,"origin":null},"content":[{"type":"text","text":"The Acceptor replies that the proposal succeeded","attrs":{}}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For Proposer2 the flow is as follows:","attrs":{}}]},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Proposer2 now issues number 
","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"n2","attrs":{}}],"attrs":{}},{"type":"text","text":" in its own proposal request","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"The Acceptor receives the proposal request and sets its ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"maxN","attrs":{}}],"attrs":{}},{"type":"text","text":" to ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"n2","attrs":{}}],"attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"Because a determined value already exists, the Acceptor simply returns it","attrs":{}}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As the flow above shows, the determined value is protected by the principle that ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"the later accepts the earlier","attrs":{}}],"attrs":{}},{"type":"text","text":": once a value has been determined, every subsequent proposal adopts it.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Now consider a more complicated case:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/16/16a20b1abe83071b1617f4b8c5e1a6df.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As shown above, Proposer1 and Proposer2 interleave. Their flows are as follows:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"p1-1, p1-2, p1-3. Same as the flow above","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"p2-1. Proposer2 now sends a proposal request with proposal number ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"n2","attrs":{}}],"attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"p2-2. 
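The Acceptor receives the proposal request and sets its ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"maxN","attrs":{}}],"attrs":{}},{"type":"text","text":" to ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"n2","attrs":{}}],"attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"p2-3. The Acceptor replies that the proposal request is granted","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"p1-4. Proposer1 now formally submits its proposal, with proposal number ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"n1","attrs":{}}],"attrs":{}},{"type":"text","text":" and value ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"v1","attrs":{}}],"attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"p1-5. Because a larger proposal number is already recorded, ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"maxN = n2","attrs":{}}],"attrs":{}},{"type":"text","text":", the Acceptor returns an error","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"p2-4. Proposer2 now formally submits its proposal, with proposal number ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"n2","attrs":{}}],"attrs":{}},{"type":"text","text":" and value ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"v2","attrs":{}}],"attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"p2-5. The Acceptor receives the proposal, updates its ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"acceptN","attrs":{}}],"attrs":{}},{"type":"text","text":" to ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"n2","attrs":{}}],"attrs":{}},{"type":"text","text":" and sets its ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"acceptV","attrs":{}}],"attrs":{}},{"type":"text","text":" to ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"v2","attrs":{}}],"attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"p2-6. The Acceptor replies that the proposal succeeded","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"p1-6. Because Proposer1's first proposal did not pass, it increases its proposal number and sends a new proposal request, with proposal number ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"n3","attrs":{}}],"attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"p1-7. The Acceptor receives the proposal request and sets its ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"maxN","attrs":{}}],"attrs":{}},{"type":"text","text":" to ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"n3","attrs":{}}],"attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"p1-8. 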
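Because a determined value already exists, the Acceptor simply returns the previously accepted value","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As the flow above shows, Proposer2 can preempt Proposer1's right to propose: a later proposal can take over the right to propose as long as no value has been determined yet.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At this point we can tolerate the failure of any Proposer. But what happens when there are multiple Acceptors?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In fact, each Acceptor does exactly what it did in the single-Acceptor scenario. The key is that the Proposer sends its request to all Acceptors and treats the request as successful only when more than half of them return success; otherwise it retries.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/e0/e0502ec07f50e461946f8a27dcd4b89e.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The figure above is somewhat complicated, so let us go through it step by step:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"p1-1 to p1-6. Same as the flow above","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"p2-1 to p2-9. Proposer2 preempts with its own proposal request, which changes the ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"maxN","attrs":{}}],"attrs":{}},{"type":"text","text":" of each Acceptor to ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"n2","attrs":{}}],"attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"p1-7, p1-8. Proposer1 sends its proposal request to Acceptor3 (delayed by network latency); in the proposal-request phase an Acceptor does not accept proposal numbers ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"<= maxN","attrs":{}}],"attrs":{}},{"type":"text","text":", so it returns an error","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"p1-9 to p1-12. Because more than half of the Acceptors returned success (see the previous figure), Proposer1 may submit its proposal; but in the proposal-submit phase an Acceptor does not accept proposal numbers ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"< maxN","attrs":{}}],"attrs":{}},{"type":"text","text":", so an error is returned","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"p2-10 to p2-14. 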
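Proposer2 has now received success from more than half of the Acceptors and may submit its proposal. None of the proposal-request replies carried a determined value, so Proposer2 submits its own preset value v2. More than half of the submissions succeed, so a determined value has now been formed","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"p1-13 to p1-21. Proposer1 picks a new proposal number and sends its proposal request again; each Acceptor updates its ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"maxN","attrs":{}}],"attrs":{}},{"type":"text","text":" to the latest proposal number ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"n3","attrs":{}}],"attrs":{}},{"type":"text","text":" and returns the value it has already accepted","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"p1-22 to p1-30. The proposal-request replies contain determined values: ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"(n2, v2) from Acceptor1 and (n2, v2) from Acceptor2","attrs":{}}],"attrs":{}},{"type":"text","text":". Proposer1 uses the determined value with the largest proposal number as the value of its new proposal; in this example the largest proposal number is ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"n2","attrs":{}}],"attrs":{}},{"type":"text","text":" and the corresponding value is ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"v2","attrs":{}}],"attrs":{}},{"type":"text","text":". In the end both Proposer1 and Proposer2 obtain the determined value ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"v2","attrs":{}}],"attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If you can follow the flow above, congratulations: you have grasped the core of the best-known consensus algorithm, ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"Paxos","attrs":{}}],"attrs":{}},{"type":"text","text":".","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Paxos: The Founding Algorithm","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Paxos is widely regarded as the greatest distributed consensus algorithm, arguably bar none. Google's Chubby and Spanner both use Paxos to keep the sequence of updates to data replicas consistent.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Paxos protocol was described by Leslie Lamport in “The Part-Time Parliament”, published in 1998. In this paper he imagines an island called Paxos where every decision must be approved by a parliament whose members all serve part-time. The core roles among legislators are Proposers and Acceptors (voters). A Proposer puts forward a Proposal, which consists of a proposal number and a proposed Value. An Acceptor that receives a proposal may Accept it; if a proposal is accepted by a majority of Acceptors, 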
it is said to be Chosen.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The paper is so hard to follow that even many specialists were left baffled, so in 2001 Lamport published “Paxos Made Simple” as a simplified explanation, which many readers still find hard going.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Paxos has many variants, including but not limited to Basic Paxos, Multi Paxos, Fast Paxos, and Byzantine Paxos.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"It is worth noting that Paxos tolerates lost messages (unreachable nodes) and out-of-order messages, but requires reliable storage (no lost or corrupted data); in other words it is a “non-Byzantine” algorithm, whereas Byzantine Paxos handles the Byzantine case. We will come back to the Byzantine problem later.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If the diagrams above did not make sense, the following uses the classic Basic Paxos algorithm as an example and elaborates with pseudocode.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Basic Paxos","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"There are more than enough articles that describe the Paxos flow, but prose alone falls flat. Above we described the core flow with examples and diagrams; below we use pseudocode to describe the core flow of Basic 
Paxos more rigorously:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/84/841f289969734bd5b5945c47e323d10b.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/ba/ba64a363a36f2c81ba8fb19afb25e7a8.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/b7/b7445e72739d5e3319f13bc8c90fcc9d.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Multi Paxos","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As we have seen, Basic Paxos is a fairly involved algorithm: determining one value takes at least two RPC round trips, and livelock is possible (multiple Proposers keep issuing new proposal requests in turn; see Wikipedia Paxos (","attrs":{}},{"type":"link","attrs":{"href":"https://link.zhihu.com/?target=https%3A//en.wikipedia.org/wiki/Paxos_%28computer_science%29","title":null,"type":null},"content":[{"type":"text","text":"https://en.wikipedia.org/wiki/Paxos_(computer_science)","attrs":{}}]},{"type":"text","text":"), section “Basic Paxos when multiple Proposers conflict”; we do not repeat it here). For this reason most Paxos implementations are based on Multi 
Paxos, which needs only about one RPC round trip and is somewhat less complex.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Basic Paxos needs at least two RPC round trips because the prepare phase by itself cannot determine a value, and that in turn is because multiple Proposers may propose at the same time. The core idea of Multi Paxos is therefore to first elect a Leader among the Proposers and have that Leader initiate all subsequent proposals, so the prepare phase can be skipped and accept issued directly.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Electing the Leader works much like Basic Paxos and takes at least two RPC round trips; once the Leader is established, each value takes only one.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Readers may wonder why this is called Multi Paxos. Good question. Multi Paxos is about more than reducing RPC calls: Basic Paxos, however many rounds of Prepare/Accept it runs, determines only one value, whereas Multi Paxos determines multiple values, keeps them in order, and lowers latency at the same time. This is the main reason Multi Paxos is so widely used in engineering practice.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For more on Multi Paxos, see the phxpaxos wiki: ","attrs":{}},{"type":"link","attrs":{"href":"https://link.zhihu.com/?target=https%3A//github.com/Tencent/phxpaxos/wiki","title":null,"type":null},"content":[{"type":"text","text":"https://github.com/Tencent/phxpaxos/wiki","attrs":{}}]},{"type":"text","text":".","attrs":{}}]},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"
Tip: Further reading
Paxos classic paper: https://www.microsoft.com/en-us/research/publication/paxos-made-simple/
Wikipedia Paxos: https://en.wikipedia.org/wiki/Paxos_(computer_science)
知行學社, “Paxos and Distributed Systems”: https://www.bilibili.com/video/av36134550/
"}}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Summary","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In this article we used diagrams and pseudocode to explain how the classic Paxos algorithm works. There are many other consensus algorithms, such as ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"Raft","attrs":{}}],"attrs":{}},{"type":"text","text":" and ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"Zab","attrs":{}}],"attrs":{}},{"type":"text","text":". The different algorithms share much of their logic, so what you learn from one carries over to the others. If there is a need, I will keep publishing related material.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Follow my WeChat official account:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://link.zhihu.com/?target=https%3A//mp.weixin.qq.com/s/eiiirZaJ6tl5YsyngfvuuA","title":null,"type":null},"content":[{"type":"text","text":"The Foundation of Distributed Architecture: Consensus Algorithms Explained Simply (mp.weixin.qq.com)","attrs":{}}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}
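The article's own Basic Paxos pseudocode lives in the figures, so the following is only a minimal in-memory sketch of the two-phase flow described in the text (proposal request, then proposal submit, each needing more than half of the Acceptors). The names `Acceptor`, `propose`, `prepare`, and `accept` are assumptions made for illustration; `max_n`, `accept_n`, and `accept_v` mirror the article's `maxN`, `acceptN`, and `acceptV`. It assumes unique proposal numbers and leaves out networking, persistence, and retries.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Acceptor:
    """State of one Acceptor (hypothetical class; fields follow the article)."""
    max_n: int = 0                   # largest proposal number seen (maxN)
    accept_n: int = 0                # number of the accepted proposal (acceptN)
    accept_v: Optional[str] = None   # accepted value (acceptV)

    def prepare(self, n: int) -> Tuple[bool, int, Optional[str]]:
        # Proposal-request phase: numbers <= maxN are rejected.
        if n <= self.max_n:
            return False, self.accept_n, self.accept_v
        self.max_n = n
        return True, self.accept_n, self.accept_v

    def accept(self, n: int, v: str) -> bool:
        # Proposal-submit phase: numbers < maxN are rejected.
        if n < self.max_n:
            return False
        self.max_n = n
        self.accept_n, self.accept_v = n, v
        return True


def propose(acceptors: List[Acceptor], n: int, value: str) -> Optional[str]:
    """Run one round; return the determined value, or None if the round failed."""
    majority = len(acceptors) // 2 + 1
    replies = [a.prepare(n) for a in acceptors]
    granted = [(an, av) for ok, an, av in replies if ok]
    if len(granted) < majority:
        return None
    # "The later accepts the earlier": adopt the already accepted value
    # that carries the largest proposal number, if any reply has one.
    prior = max((r for r in granted if r[1] is not None),
                key=lambda r: r[0], default=None)
    if prior is not None:
        value = prior[1]
    accepted = sum(1 for a in acceptors if a.accept(n, value))
    return value if accepted >= majority else None


# Replay of the interleaved scenario from the text, with three Acceptors.
acceptors = [Acceptor() for _ in range(3)]
p1_granted = [a.prepare(1) for a in acceptors]    # Proposer1's request, n1 = 1
first = propose(acceptors, 2, "inst2")            # Proposer2 preempts with n2 = 2
late = [a.accept(1, "inst1") for a in acceptors]  # Proposer1 submits too late
second = propose(acceptors, 3, "inst1")           # Proposer1 retries with n3 = 3
print(first, late, second)
```

In this replay Proposer1's late submission is rejected by every Acceptor, and its retry ends up adopting the value Proposer2 already got accepted, which is the behaviour the step-by-step walkthrough describes.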