Getting Database and Table Sharding Right Is Actually Hard (Part 2)

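{"type":"doc","content":[{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Why Split"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Before we formally begin, Caicai wants to stress one point again: whether your table should be split depends on many factors taken together. Has the business data volume actually reached the order of magnitude where splitting is unavoidable? Is there another way to solve the current problem? More than once I have seen a leader push ahead with table splitting blindly, without weighing the overall situation, and the result is that everyone works overtime nonstop, on a 996 schedule for weeks on end. Leader, aren't you losing hair too? Some architects split tables at the very start of a small project; everyone rushes through overtime to keep up, and after launch it turns out the data volume is tiny while the code is heavily constrained by the splitting strategy. In certain scenarios the problems caused by splitting tables can be really costly. Splitting database tables mainly addresses storage and performance problems. Once a single MySQL table reaches a certain size, performance can drop sharply; compared with commercial databases such as SQL Server and Oracle, MySQL is still weaker in some respects. The table-splitting strategy itself, however, applies to almost every relational database."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Don't split database tables blindly."}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"link","attrs":{"href":"#分表策略","title":null}},{"type":"text","text":"Table Splitting Strategies"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Splitting a table and splitting a database have similarities, but the rules differ. The rules below apply to splitting a single table."}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"link","attrs":{"href":"#橫向切分","title":null}},{"type":"text","text":"Horizontal Splitting"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Horizontal splitting is the most common approach across many businesses. In essence it distributes the rows of one table across multiple tables according to a rule, most commonly by ID range or by the hash of a business key. The row count at which to split depends on what the table stores: a table with a few int columns can hold far more rows than a table with a few text columns before it hits its limit. Let's take 10 million rows as a rough limit. As a system owner or architect, you should pay attention once a table reaches the ten-million-row level, because that is a potential performance bottleneck for the system."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"ind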
ent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Compared with horizontally splitting a table, where the business scenario allows it I prefer table partitioning: assign different partitions to different physical disks according to a rule, so that the SQL statements in the application hardly need to change. At my company, one SQL Server table that was partitioned this way has reached several billion rows, yet query and insert speed still meets the needs of the business (optimizing a system still means putting effort into the business layer)."}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/87/8774dfa017c82b408e7607618d97b796.png","alt":"image","title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"link","attrs":{"href":"#垂直切分","title":null}},{"type":"text","text":"Vertical Splitting"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As for vertical splitting, a table can also be split along business lines. For example, if a database holds user information that the business divides into basic information and extended information, it can be split into a basic-information table and an extended-information table when that benefits the business. Other rules work too, such as putting frequently accessed fields in one table and rarely accessed fields in another; the right rule depends on the problem you are trying to solve. Vertical splitting can add some complexity: a user's basic and extended information used to come back from a single query, but after the split you need a JOIN or two separate queries to get the same result."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/e7/e7d5992499c5b55ddd9f441dc90c5728.png","alt":"image","title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"link","attrs":{"href":"#分表代價","title":null}},{"type":"text","text":"The Cost of Splitting Tables"}]},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"After a table is split vertically, what used to be a single query may become a JOIN across tables, which costs some performance."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":
"text","text":"Horizontal splitting needs a rule, and two are common: range splitting and hash splitting. Range splitting partitions on the range of a column. For example, a user table split by user ID puts IDs 1 to 100,000 in table User1 and IDs 100,001 to 200,000 in table User2. The advantage is that capacity can keep growing without any data migration. The disadvantages are that data is distributed unevenly between new and old tables, and that the range size is hard to choose: too small and you end up with too many tables, too large and the original problem is not really solved. The other strategy routes rows to tables by the hash of a column. Taking user ID again, suppose we planned 10 tables from the start; the routing rule can simply be user_id % 10, which gives the number of the table a row belongs to. The user with ID 985 goes to sub-table 5, and the user with ID 10086 goes to sub-table 6. The advantage of this rule is that data is spread fairly evenly across the tables, but expanding later involves migrating part of the data."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"After a table is split, the database alone can no longer handle an ORDER BY over the whole data set; it has to be done in application code or by database middleware."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"When the business needs search, the SQL has to join multiple tables in one query; the same goes for statistics, such as COUNT."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Have you split tables in your own projects?"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"More Articles"}]},{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/mp/appmsgalbum?action=getalbum&album_id=1342955119549267969&__biz=MzIwNTc3OTAxOA==#wechat_redirect","title":null},"content":[{"type":"text","text":"Distributed High-Concurrency Series"}]}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/mp/appmsgalbum?action=getalbum&album_id=1342959003139227648&__biz=MzIwNTc3OTAxOA==#wechat_redirect","title":n
ull},"content":[{"type":"text","text":"Architecture Design Series"}]}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/mp/appmsgalbum?action=getalbum&album_id=1342962375443529728&__biz=MzIwNTc3OTAxOA==#wechat_redirect","title":null},"content":[{"type":"text","text":"Fun with Algorithms and Data Structures Series"}]}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/mp/appmsgalbum?action=getalbum&album_id=1342964237798391808&__biz=MzIwNTc3OTAxOA==#wechat_redirect","title":null},"content":[{"type":"text","text":"Design Patterns Series"}]}]}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/f8/f8af5984765a267892bf1a1272272625.png","alt":"image","title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}
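The two routing rules and the cross-table ORDER BY problem described in the article can be sketched roughly as follows. This is only a minimal illustration in Python, not the author's implementation: the function names (`route_by_range`, `route_by_hash`, `moved_fraction`, `merged_order_by`) are hypothetical, sub-tables are modeled as in-memory lists that are each already sorted, and the range size of 100,000 and the table count of 10 are taken from the article's own examples.

```python
import heapq
from itertools import islice

RANGE_SIZE = 100_000  # rows per range table (the article's example)
TABLE_COUNT = 10      # planned number of hash tables (the article's example)


def route_by_range(user_id: int) -> int:
    """Range rule: IDs 1..100000 go to table 1, 100001..200000 to table 2."""
    return (user_id - 1) // RANGE_SIZE + 1


def route_by_hash(user_id: int, table_count: int = TABLE_COUNT) -> int:
    """Modulo rule: the table number is user_id % table_count."""
    return user_id % table_count


def moved_fraction(ids, old_count: int, new_count: int) -> float:
    """Share of IDs whose table number changes when the table count changes."""
    ids = list(ids)
    moved = sum(1 for i in ids if i % old_count != i % new_count)
    return moved / len(ids)


def merged_order_by(sorted_shards, limit: int) -> list:
    """Application-side ORDER BY ... LIMIT: merge per-table results,
    each of which was already sorted ascending by its own table."""
    return list(islice(heapq.merge(*sorted_shards), limit))
```

For instance, `route_by_hash(985)` gives 5 and `route_by_hash(10086)` gives 6, matching the article's example. Under plain modulo routing, `moved_fraction(range(1, 1001), 10, 20)` shows that going from 10 to 20 tables relocates half of those IDs, which is the migration cost the hash rule trades for its even distribution.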