Is Your Data Consistent Under a Microservice Architecture?

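{"type":"doc","content":[
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Microservice architecture became popular because it lets teams respond to change faster. Services can be deployed independently: each service has its own responsibilities and can be released on demand. Each service can also be owned by a different development team and use a different technology stack, so every service can be implemented in whatever way is quickest and most appropriate."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"However, a microservice architecture is a distributed architecture, so it cannot avoid the problem of data consistency. When services use heterogeneous technologies and different kinds of data sources, traditional distributed transactions (2PC or 3PC) also struggle to solve the consistency problem."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"How does data become inconsistent?"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In a microservice architecture, services usually have a clearly defined upstream/downstream relationship. A downstream system may depend on an upstream system and may query or modify upstream data through APIs. The reverse does not hold: an upstream system should not know that downstream systems exist, which means it cannot depend on them. Changes in an upstream system can only be published as asynchronous events, and downstream systems listen for those events and update their own data state accordingly."}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/0c/0cc130811f6fb93da7b53d234dcc4d1a.png","alt":null,"title":"","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In a microservice architecture built on these principles (see the figure above; this article does not consider circular dependencies between services), each arrow in the figure represents one kind of data communication between upstream and downstream services. A failure in any of them can leave data inconsistent. The scenarios are described one by one below:"}]},
{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Scenario 1: a downstream service changes its data state and its synchronous call to the upstream API fails"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For example, the order service is the downstream service and the inventory service is the upstream service. Inventory must be locked when an order is confirmed. In the implementation, the order service changes the order status and, at the same time, modifies the inventory status through a synchronous API call. To keep the data consistent, the order service rolls back its own state change if the call to the inventory service API fails with an error."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In this scenario a single business process has to modify data in two services at once, and data becomes inconsistent in the following two cases:"}]},
{"type":"bulletedlist","content":[
{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The inventory service API call succeeds and the inventory status changes, but committing the order status change to the database fails. The result is that inventory is locked but no order has been confirmed."}]}]},
{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The inventory service API call is reported as failed although the data change in the inventory service actually succeeded, because a network error occurred while the response was being returned to the order service. The order service rolls back its change, and the result is again that inventory is locked but the order is not confirmed."}]}]}
]},
{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Scenario 2: an upstream service does not publish an event when its state changes"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Every key state change in an upstream service may trigger a chain of logic in downstream services, so the events an upstream service publishes are very important to them. These events, however, affect neither the upstream service's own logic nor its own data state, so publishing is usually not designed to block the business process. As a result, a communication failure between the upstream service and the event service or event carrier (usually a message queue) causes event publication to fail."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In this case the upstream business process has already succeeded and nothing will trigger the event again, so the event is lost. The downstream service never receives it, does not make the corresponding change to its data, and the data becomes inconsistent."}]},
{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Scenario 3: a downstream service cannot consume an upstream event normally"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Likewise, a downstream service may fail to consume an event for a number of reasons, including:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[
{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The content format of the events published by the upstream service has changed"}]}]},
{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The event format is unchanged, but the set of allowed values for some fields has changed, for example an enum has been extended"}]}]},
{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"An internal error occurs in the downstream service (database, cross-service calls, and so on)"}]}]}
]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The upstream service does not care about its downstream consumers, so it does not care whether a published event was consumed successfully, and it certainly has no logic to resend an event because one downstream service failed to consume it. This leads to the same kind of inconsistency as in Scenario 2."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"How can data inconsistency be eliminated?"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"According to the CAP theorem, a distributed system cannot fully guarantee consistency, availability, and partition tolerance at the same time. In practice, partition tolerance and availability are hard to give up, so strong consistency is usually what gets relaxed, and eventual consistency is used instead so that data converges within a tolerable time window."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Microservice architecture is no exception. Inside a service, local transactions can guarantee strong consistency; when a business process spans several services, we aim for eventual consistency. So what measures help guarantee eventual consistency across services?"}]},
{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Avoid writing to multiple services at the same time"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This is a business question. In a microservice architecture every service is independent. If a business function needs to modify data in two services at once, it can often be split into two steps. Take the order and inventory example from Scenario 1: if we first lock the inventory and then confirm the order, the problem appears to be readily solved."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"So when a function turns out to need simultaneous changes to data in two services, we should first discuss whether the business design is reasonable. If many business scenarios require the data in two services to be strongly consistent, we may need to check whether the service boundaries are drawn sensibly."}]},
{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Best-effort notification + best-effort processing"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Solving the inconsistencies in Scenario 2 and Scenario 3 takes effort from both the upstream and the downstream service. The upstream service should do its best to get events out: for example, send synchronously first; if that fails, switch to asynchronous retries; if the retries still fail after several attempts, persist the event and resend it through a scheduled job or manual intervention. The downstream service should likewise do its best to process events: on receiving an event it can persist the event first, mark it once it has been consumed successfully, and retry consumption through a scheduled job if consumption fails."}]},
{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Guarantee idempotency"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Once retries are involved, idempotency has to be considered. Idempotency here covers the following two cases:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[
{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Upstream service APIs are idempotent, so that the downstream system's retry logic gets a correct response"}]}]},
{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Downstream services consume events idempotently, to avoid problems when the upstream sends an event more than once or an already consumed event is retried"}]}]}
]},
{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Compensation mechanisms for core business data"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Even with preparations we consider thorough, every node on the execution path of a distributed system can fail, and with the added complexity of the business, data inconsistency is hard to eliminate completely. For problems that occur with low probability but are expensive to solve technically, we can draw on a deep understanding of the business to design back-office data maintenance functions, so that when core business data goes wrong it can be repaired within defined rules and the business can carry on smoothly."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Final thoughts"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Data consistency is first a business problem and only second a technical one. In a microservice architecture we expect each service to have a single responsibility, and that single responsibility reflects business value. If services are split so finely that the business becomes hard to implement, the split is unreasonable. Business experts really do need to understand the system and offer advice on service boundaries from the business side."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"On data consistency, we should first consider whether the business design is reasonable, then whether the current architecture is reasonable, and then, within given constraints, use eventual consistency to protect business value. Unless there is no other choice, introducing a distributed transaction framework is not recommended: it is costly, and it brings new problems such as performance overhead."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}
]}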
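
The downstream half of the best-effort approach (persist the event first, mark it once consumed, retry on failure, consume idempotently) might look roughly like the sketch below. It is only an illustration under stated assumptions: `Event`, `event_id`, `Inbox`, and `demo` are hypothetical names that do not come from the article, and an in-memory dictionary stands in for the table a real service would persist events in.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class Event:
    event_id: str   # hypothetical unique ID assigned by the upstream service
    payload: dict


@dataclass
class Inbox:
    """In-memory stand-in for a downstream service's persisted event table."""
    pending: Dict[str, Event] = field(default_factory=dict)
    done: Set[str] = field(default_factory=set)

    def receive(self, event: Event) -> None:
        # Persist first; a duplicate of an already consumed event is ignored.
        if event.event_id not in self.done:
            self.pending.setdefault(event.event_id, event)

    def process_pending(self, handler: Callable[[Event], None]) -> int:
        """Try each pending event once; failures stay pending for the next run."""
        succeeded = 0
        for event_id, event in list(self.pending.items()):
            try:
                handler(event)
            except Exception:
                continue  # left pending; a scheduled job would call this again
            self.done.add(event_id)      # mark the event as consumed
            del self.pending[event_id]
            succeeded += 1
        return succeeded


def demo() -> dict:
    locked_stock = {"sku-1": 0}
    calls = {"n": 0}

    def handler(event: Event) -> None:
        calls["n"] += 1
        if calls["n"] == 1:
            raise RuntimeError("simulated transient failure")
        locked_stock[event.payload["sku"]] += event.payload["qty"]

    inbox = Inbox()
    event = Event("evt-1", {"sku": "sku-1", "qty": 2})
    inbox.receive(event)
    first = inbox.process_pending(handler)    # handler fails, event stays pending
    second = inbox.process_pending(handler)   # retry succeeds
    inbox.receive(event)                      # duplicate delivery after success
    third = inbox.process_pending(handler)    # nothing left to process
    return {"first": first, "second": second, "third": third,
            "locked": locked_stock["sku-1"]}


if __name__ == "__main__":
    print(demo())
```

In a real service, the handler's data change and the "consumed" mark would need to be committed in the same local transaction; otherwise a crash between the two steps could cause the event to be applied twice on retry.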