Migrating from MySQL to AWS DynamoDB in Practice

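{"type":"doc","content":[{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"FreeWheel's core business system stores its data in MySQL. As the data volume kept growing, the existing database could no longer meet our business needs. After extensive research, we decided to migrate some of the MySQL tables to AWS DynamoDB. This article shares our experience of migrating smoothly from a relational database to a non-relational one."}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Business Challenges"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Initially we used the asset table to store customers' video inventory. Over time the table grew steadily, and today the asset table and its auxiliary tables occupy more than 50% of the database's total storage. The joins and complex SQL statements our services run against these tables cause database performance to drop sharply, which in turn degrades application performance. We therefore had to consider either splitting the tables or migrating to another database, and splitting would not solve the problem in the long run. To improve performance and scalability and to reduce cost, we ultimately chose to move asset and its related tables out of MySQL."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Comparing and Selecting a Non-Relational Database"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Our workload needs fast reads and writes under high concurrency and good scalability, and it does not need strong consistency, so we decided to store asset and its related data in a non-relational database."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"We compared several mainstream non-relational databases. The table below covers the two most widely used candidates, MongoDB and DynamoDB."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"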

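| Criterion | MongoDB | DynamoDB |
| --- | --- | --- |
| Overview | MongoDB is one of the best-known document stores. | DynamoDB is a scalable, managed NoSQL database service from Amazon that stores data in the Amazon cloud. |
| Data structure | MongoDB stores schemaless data as JSON-like documents; a collection of documents needs no predefined structure. | A DynamoDB table is a collection of items, and each item is a collection of attributes. The primary key uniquely identifies each item in a table and is also used in secondary indexes to provide more query flexibility. |
| High availability | Cluster fault tolerance and automated disaster recovery. | Comprehensive disaster recovery, fault tolerance, and monitoring built on the cloud service. |
| Security | By default, MongoDB is installed with authentication disabled; user authentication has to be enabled with a username and a strong password. | Security in DynamoDB is stronger and is generally provided by the available AWS security measures. |"}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Based on this comparison, DynamoDB offers more complete security services and stronger disaster recovery and fault tolerance, and it fits the AWS cloud services FreeWheel already uses, so we chose DynamoDB as the migration target. The next section introduces its main technical characteristics."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"DynamoDB Technical Characteristics"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"AWS DynamoDB is a fully managed, serverless NoSQL database that is accessed through an HTTP API. It also offers a managed in-memory cache, which makes it a good fit for applications that store large amounts of data and require low latency."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"DynamoDB is built on a few key concepts: tables, items, and the attributes of each item. 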
A table is a collection of items, and items of different types can live in the same table. The figure below shows how these concepts fit together."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/b8\/b8409afedb0c9dd1d49244c06064af36.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Each item is one record in a table and consists of multiple attributes. Within a table, every item is uniquely identified by its primary key. An item is comparable to a row, or a group of rows, in a relational table, and an attribute is comparable to a column. DynamoDB requires every item to contain at least the attributes that make up its primary key."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The attributes that make up the primary key must be defined when the table is created. Beyond the mandatory primary key, DynamoDB provides secondary indexes to support other query patterns. One we use often is the GSI (global secondary index)"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":","},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" which builds an index on a different set of attributes so that those queries run more efficiently."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Migration Design"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Moving from a relational to a non-relational database requires a new data model. When designing it, the main considerations are the attributes of each item in the new table and whether the migrated model can still support the existing business requirements."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Unlike a relational table, a DynamoDB table behaves more like a collection of tables and is often used to store different types of data. Taking into account DynamoDB's characteristics, the nature of our existing data, and our business requirements, we consolidated dozens of MySQL tables into a single table and reorganized the columns of the old tables into attributes of the new one, as shown in the figure below."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/ec\/ec67a7dae6864c2432a57532eb4186d3.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"For each table we migrated, we first collected all the SQL statements that touched it in MySQL and then, using the primary key and secondary indexes designed earlier, mapped each statement to the corresponding DynamoDB API. Some fields of the asset table serve as an example below."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/dd\/dd09cca04e875bb96d1fe65074b45c1f.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"As the figure shows, in MySQL the asset table has columns such as name and description, and the asset_group_assignment table has columns such as assetId and groupId. After the migration these columns become attributes of each item. The figure also shows how the storage types change: the name column of the asset table was a varchar, and groupId and assetId were both bigInt; in DynamoDB they map to the String type and the Number Set type respectively."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Once the new table structure and data model are defined, we still have to choose which attributes form the primary key and, based on our business requirements, define the secondary indexes. For example, suppose MySQL serves a query such as select * from asset where xx_id = '123'. If xx_id is not the primary key, we need to define a secondary index on the xx_id attribute to support that query."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"A Smooth Migration Transparent to Users"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"To maintain service quality during the rollout and keep the migration invisible to users, we put considerable effort into dividing the process into roughly three steps (see the figure below):"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/53\/53bb0c4ec342da10e1003a89a4177f45.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Data migration:"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"First, the data in MySQL is copied to DynamoDB. At this stage all traffic still reads from and writes to the original MySQL database;"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Data synchronization:"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Next, we deployed a background job dedicated to propagating MySQL updates to DynamoDB, which keeps the two data stores consistent;"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Traffic switching:"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"After that, read-only services can switch their traffic between DynamoDB and MySQL for testing, which verifies that the data was migrated correctly. Finally, the read-write services switch over. We added a runtime switch to the code so that traffic can be shifted gradually and in real time."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"To migrate without downtime, we kept all of the legacy MySQL business logic, and the code uses the runtime switch to decide whether the system currently reads and writes MySQL or DynamoDB. Every upstream service supports this logic: it checks the state of the switch to determine whether its data source is MySQL or 
DynamoDB. Developers can update the switch in real time, so when a problem arises they can promptly switch between the two data sources and keep it from affecting users."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Traffic switching passes through three states:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/84\/8409d2bde7eb3bad19d86fdd4f5db603.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The first state is before any traffic is switched. All services still read from and write to MySQL, and DynamoDB acts as a backup database. During this stage, every write to MySQL is synchronized to DynamoDB."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Next, traffic is gradually shifted from MySQL to DynamoDB. For traffic with the switch off, services still read and write MySQL, and the MySQL data is synchronized to DynamoDB. For traffic with the switch on, services read and write DynamoDB, and the DynamoDB data is synchronized back to MySQL. The two stores therefore stay consistent, which makes it possible to roll the migration back if something goes wrong."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Finally, once the migration has been tested and verified, all service traffic runs on DynamoDB. DynamoDB data is still synchronized to MySQL, so MySQL now serves as the backup database in case it is needed. This completes the data migration."}]}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Problems Encountered and Solutions"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Relational and non-relational databases differ substantially in both their storage types and the operations they support, so the implementation of our data-access interfaces changed noticeably."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The table below lists the changes we ran into that stem from the differences between the two databases."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"

| Difference | MySQL | DynamoDB |
| --- | --- | --- |
| Data types | BigInt, TinyInt, Int, varchar, enum... | number, string, set, map, list... |
| SQL | Supported | Not supported |
| Default values | Supported | Not supported |
| Case sensitivity | Insensitive | Sensitive |
| Auto-increment ID | Supported | Not supported |
| Unique key | Supported | Not supported |"}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Changes in storage types"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Our core business system is written in Golang, so because the storage types changed, the microservices had to redefine their data structures to match the DynamoDB data types."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"NoSQL 
transition"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"In the implementation, we first collected the SQL statements that touched the tables being migrated and, using the primary key and secondary indexes designed earlier, mapped them to the corresponding DynamoDB APIs. Along the way we saw that NoSQL brings a substantial performance gain. For example, an update that spanned several tables in MySQL could require several database connections or more, whereas in DynamoDB a single operation updates the whole item."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Changes in default values"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"MySQL supports column default values, but DynamoDB does not: if a write omits an attribute, the item simply does not have that attribute. To preserve the default-value behavior our business logic inherited from MySQL, we handle it at write time, as shown in the figure below. If a string attribute is not supplied, a Null value is written by default; if an int attribute is not supplied, 0 is written by default."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/e8\/e8c5c005f092b34b2bd5481a41d80cd7.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Changes in case sensitivity"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Before the migration, queries in our system were case-insensitive (on Linux, MySQL string comparisons are case-insensitive by default). DynamoDB is case-sensitive by default, so to keep case-insensitive matching we store, for each attribute that needs it, an additional attribute holding the all-lowercase form of the value and query against that"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Changes in auto-increment IDs"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"DynamoDB does not support auto-increment IDs, but our legacy business logic depends on them, so we added a table at the business layer to generate the auto-increment 
ID."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Besides the implementation changes caused by the differences between the two databases, we also hit several problems during the migration that were caused by DynamoDB's own limits."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Data consistency"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"During concurrency testing we observed the following. Taking the figure below as an example, when two requests modify the same item, asset1, at the same time, we expect asset1's groups to end up with the values added by both requests on top of the original ones, but in fact only one value was added."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/8c\/8c95f889486a01ef5d83530519696ba8.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"This happens because request 2 should have read the item as updated by request 1, but both concurrent requests read the item as it was before either update, so the value finally written is not the one we expected. In other words, we wanted the effect of strongly consistent reads but got eventual consistency. DynamoDB reads are eventually consistent by default. It does provide a ConsistentRead parameter for strongly consistent reads, but only primary-key reads support it; global secondary indexes do not. We therefore added a version attribute to the table to control the ordering of concurrent writes."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/74\/7419e1adadd433d6993e79b5f560e8cc.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Problems caused by GSI delay"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"In stress tests after development, calls to the create-record endpoint kept failing intermittently. When a client asks to create a record, the server first writes the item to the base table and then reads it back through a GSI. About 5 times in 10,000, that read does not find the new item, because propagation from the base table to its GSI is delayed (see the figure below); AWS states that the delay is on the order of milliseconds. To handle this, we added retry logic on the server: if the new item is not found, the read is retried up to three times."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/71\/719d56cf6f04e2e23eacbe5e7fe71d60.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text"
,"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"DynamoDB size limits"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"In boundary testing we found that updating an asset's alias attribute, which is an array, fails once it holds more than 1,000 entries. The official DynamoDB documentation explains why: DynamoDB limits item size to 400 KB, so an attribute value cannot grow beyond that. This only surfaced in boundary testing and does not occur in real business usage, but as a precaution we added validation to the business logic and notified customers of the change in advance."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"DynamoDB transactions"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"At first we used DynamoDB's TransactWriteItems API to update multiple tables transactionally; sample code is shown in the figure below. During concurrency testing, however, the service returned errors when a single transaction touched a very large number of items. The reason is that, at the time of writing, a DynamoDB transaction cannot write more than 25 items. So when more than 25 items have to be written together, we gave up the native transaction and met the requirement with pessimistic locking plus compensation."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/d6\/d6fbed5a601feb8e8bd285a7c510b649.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"DynamoDB cost"}]}]}]},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"

| Type | Price | Special case |
| --- | --- | --- |
| WCU (write capacity unit) | $1.25 per million WCUs | Doubled for transactions |
| RCU (read capacity unit) | $0.25 per million RCUs | Doubled for strongly consistent reads |"}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Keep a close eye on cost when using DynamoDB. As the table above shows, one million write "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#333333","name":"user"}}],"text":"capacity units"},{"type":"text","text":" (WCUs) cost $1.25, and each 1 KB written consumes 1 WCU, doubled for transactional writes. One million read capacity units (RCUs) cost $0.25, and each 4 KB read consumes 0.5 RCU, doubled for strongly consistent reads. So, unless an operation truly requires it, avoid strongly consistent reads, and merge multiple writes into one operation wherever possible to reduce write cost."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Conclusion"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Through the team's joint effort, we completed the storage migration from MySQL to DynamoDB within a few months and saw a large improvement in application and database performance. The figure below compares request times for the same endpoint before and after the migration: the average Duration fell from 90 ms to 50 ms, a reduction of roughly 44%."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/8c\/8c90cefdaaf9dddfc8ff4f95deb15a8d.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Since completing the migration we have continued to find issues, such as handling transactions that span databases and running complex queries over DynamoDB data. We will keep exploring solutions to these problems and keep improving."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"About the author:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Yue Jingdian (嶽京典) graduated from Beijing University of Posts and Telecommunications and currently works on FreeWheel's core business team, focusing on Golang system development and microservice architecture, and is keen on exploring and sharing new technology."}]}]}
