DDIA Reading Notes (6): Database Transactions

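{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Transactions are the mechanism of choice for simplifying many of the hard problems inside a database: a transaction either succeeds or fails as a whole, so the application layer never has to handle the awkward case of a partial success. These notes look at where transactions are used and how they solve those problems."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"The Imprecise Meaning of ACID"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The term ACID was coined in 1983, but it has remained a loosely defined concept ever since. Databases differ in how strictly they implement it, so today ACID reads more like a marketing term."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Of the four ACID properties, atomicity and isolation deserve the most attention. Consistency is really the application layer's responsibility, and durability is the most basic requirement of any database."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Atomicity describes the all-or-nothing nature of a transaction: it either succeeds or fails as a whole. When an error occurs, the transaction must abort and every write it has already made must be rolled back."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Isolation is the core property for dealing with concurrency. It requires that when multiple transactions run concurrently, the final result is exactly the same as if they had run serially. In practice databases rarely execute transactions truly serially and instead offer weaker guarantees."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Weak Isolation Levels"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Serializable isolation is the strongest isolation level, but it carries a significant performance cost, so mainstream databases all implement weaker isolation levels."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Read Committed"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Read committed is the most basic isolation level. It provides only two guarantees:"}]},{"type":"bulletedlist","content":[{"type":"listitem","con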
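tent":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Reads see only data that has been committed (no dirty reads)"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Writes overwrite only data that has been committed (no dirty writes)"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Dirty writes are prevented with row-level locks: only one transaction can hold the lock on a given row at a time. Dirty reads, however, should not be prevented with locks, because that causes serious performance problems. The mainstream approach is instead to keep two versions of the value: the old committed value, and the new value set by the transaction that currently holds the lock."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Snapshot Isolation and Repeatable Read"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Read committed guarantees the most basic form of transaction isolation, but some problems remain unsolved."}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/b2/b26729a80256d7ea66750834f07c628b.png","alt":"Read skew","title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Alice has 500 dollars in each of two accounts, and a transaction transfers 100 dollars between them. If she checks both balances while the transfer is in flight, one read may happen after the transaction has completed and show 400 dollars, while the other happens before it completes and still shows 500 dollars, so the total appears to be 900 dollars instead of 1,000."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This anomaly is called a nonrepeatable read (or read skew). For most business scenarios it is acceptable, because the user sees the correct data after a refresh. Some scenarios cannot tolerate it, though, such as database backups, analytic queries, and integrity checks. These queries read large amounts of data and compute over it, and reading different parts at different points in time produces meaningless results."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Solving this class of problem requires a stronger isolation level, known as 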
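“snapshot isolation”. To implement it, the database must allow for multiple transactions reading data at different points in time and must keep several versions of each piece of data, which is why the technique is also called multi-version concurrency control (MVCC)."}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/a7/a7968c4cc1b12588e2a92d7fa45a96f6.png","alt":"Multi-version concurrency control","title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With MVCC, every transaction has an ID. The transaction that issues the query started before the transfer transaction, so none of the changes made by transaction 13 are visible to the query."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Concurrent Write Conflicts"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Concurrent write transactions raise a range of problems worth attention, and dirty writes are only one special case. The best known is the lost update problem."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"An application reads a value from the database, modifies it, and writes the new value back. Because the transactions are isolated from one another, two concurrent transactions may both read the old value and write back their own result, so one of the updates is lost. Typical examples are incrementing a counter and updating an account balance."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The best approach is for the database to provide atomic operations and to push this kind of logic down into the database. Where that is not possible, the application layer has to take locks explicitly."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Write Skew and Phantoms"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"There is an even subtler write conflict. Consider a hospital on-call system that requires at least one doctor to be on call at any time. Two doctors are on the list for the current shift, and both ask to give up the shift at the same moment. The two transactions start together, each sees two doctors on call, each updates its own record, and both commit successfully. The result is that no doctor is on call, which violates the original requirement. A problem like this, caused by two transactions updating different objects, is called “write skew”. The effect where a write in one transaction changes the result of a query in another transaction is called a “phantom”."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"tex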
t","text":"There is no really good general solution to this class of problem. It mostly comes down to the application layer taking locks itself, or to locking on an index."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Serializability"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Serializable isolation is the strongest isolation level and prevents every possible concurrency anomaly. There are three mainstream ways to implement it:"}]},{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Actual serial execution"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Two-phase locking"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Optimistic concurrency control (serializable snapshot isolation)"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Actual serial execution works well in in-memory databases, where a single thread can often sustain decent throughput for online OLTP workloads. For databases that emphasize persistence, serial execution comes with many restrictions."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Two-Phase Locking"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Two-phase locking (2PL) is the most widely used serializability algorithm. Its locking requirements are much stricter than those of the locks used to prevent dirty writes: several transactions may read the same object concurrently, but as soon as a write is involved, the writer must take a lock for exclusive access."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"That is, 
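reads do not block reads, reads and writes block each other, and writes block writes. Because the locking scheme is fairly complex, deadlocks are common in practice and cause transactions to be aborted, so the application layer needs a solid retry mechanism."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The performance of two-phase locking is still not great: one large transaction can block a long queue of transactions behind it. Under high concurrency, deadlocks can become very frequent and cause performance problems."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Serializable Snapshot Isolation"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Serializable snapshot isolation (SSI) is a relatively new algorithm that provides full serializability. In contrast to the pessimistic control of two-phase locking, SSI is built on optimistic concurrency control. Transactions still execute against a consistent snapshot of the database, and on top of snapshot isolation SSI adds an algorithm that detects serialization conflicts among writes and decides which transactions to abort."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To solve write skew, the database has to assume that any change to a query result may invalidate the writes of the transaction that relied on it. It therefore checks for two cases:"}]},{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Whether the object that was read is a stale, or soon to be stale, MVCC version"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Whether the current write affects an earlier read by a transaction that has not yet committed (a new write arrives after the read)"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the first case, the database needs to track the writes that were ignored under MVCC 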
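visibility rules. When the transaction is about to commit, if any of those ignored writes has since been committed, the current transaction must be aborted."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the second case, index-range locks can be used. When a write transaction tries to modify an object, the database first looks in the index for other transactions that have read it, and it notifies those reading transactions of the conflict when they commit."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Because it is based on optimistic control, SSI does not block transactions, so it performs much better, but it has some limitations too. If concurrency is so high that a large share of transactions is aborted, SSI's performance suffers noticeably. SSI therefore wants read-write transactions to be as short as possible in order to avoid conflicts."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}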