Three Meta-Questions About Database Transactions

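{"type":"doc","content":[
{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"What exactly is a database transaction? Why do databases need to support transactions? And how are databases designed to implement them? This article covers the concepts only; discussion is welcome."}]}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"1. What Is a Database Transaction"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A transaction is a set of features that guarantees data operations satisfy the requirements of the business logic. Put another way, if a database did not support transactions, application programmers would have to write their own code to keep the related data processing logically correct. Databases first became widespread in finance, and bank deposits and withdrawals are the classic OLTP scenario; transactions exist to keep the business logic correct in scenarios like this."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/d0\/19\/d024695860e1be52f9e47c86b9ed4f19.png","alt":null,"title":"(Image from the web; will be removed on request)","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Atomicity:"},{"type":"text","text":" Suppose you transfer money to a family member: 100 yuan must be deducted from your account and 100 yuan added to theirs, and the business logic is only correct if both operations complete together. A program necessarily applies the changes in some order, though. Imagine it deducts your money and then crashes before crediting the other account. Has that 100 yuan simply vanished? You would be furious. So both operations are placed in one transaction, and atomicity guarantees that they either both succeed or both fail. Only then is the business logic correct."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Consistency:"},{"type":"text","text":" Many articles discuss consistency, but a lot of them conflate it with atomicity. Transactional consistency means that after every transaction executes, all rules defined in the database must still hold, such as primary and foreign keys, constraints, and triggers. For example, say you have 100 yuan on your debit card, 100 yuan in a wealth-management account, and 100 yuan in a fund account. Your total assets then show 300 yuan, and that 300 must equal the sum of the three balances. When you transfer 100 yuan from the debit card to a family member, a trigger in the database can subtract 100 yuan from the total at the same moment the card balance drops by 100; otherwise the data becomes logically wrong. Having moved 100 yuan off the card, your real total is 200 yuan, and if the database still says 300, its state is inconsistent. A transaction therefore counts as successful only when the associated triggers and other internal rules have all executed successfully. If subtracting from the total fails, committing would leave the database in an inconsistent state, so the transfer must not succeed either."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"So where exactly does the difference between consistency and atomicity lie? "},{"type":"text","marks":[{"type":"strong"}],"text":"Atomicity means that multiple user statements must complete or fail as a single unit, whereas consistency is about the related data rules inside the database completing or failing together."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Durability:"},{"type":"text","text":" Once a transaction is committed, its changes to the database are saved and will not be lost. Put simply, once you have committed, the 100 yuan you just deposited is still in your account after a restart, even if the database crashed."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Isolation:"},{"type":"text","text":" Each transaction is independent of other transactions to some degree, and they must not interfere with one another, because the database has to support concurrent operations for efficiency. Under concurrency, operations must be isolated from each other to keep the business logic correct. For example, when you transfer 100 yuan to a family member, the last step may be entering a verification code. At that point the transfer is not complete, yet inside the database your account record has already had 100 yuan deducted and the family member's account has had 100 yuan added; the transaction is only waiting for the code to be entered before it commits. Can your family member see that 100 yuan in mobile banking at this moment? You would probably say no: the transfer has not been committed and the transaction is not finished, so the database should keep these two concurrent operations isolated to some degree. Only then does the behavior match the business logic."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"How isolated should transactions be? Isolation comes in four levels, from lowest to highest: Read uncommitted, Read committed, Repeatable read, and Serializable. Each step up rules out one more anomaly: Read committed prevents dirty reads, Repeatable read additionally prevents non-repeatable reads, and Serializable additionally prevents phantom reads."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"How should you understand the different isolation levels? Start with concurrency: concurrent operations are different users reading and writing the same data at the same time. During that process, what data should each user see for the business logic to stay correct?"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the deposit and withdrawal scenario above, what I see must be money that has actually been deposited, that is, data from committed transactions. When you search for train tickets on 12306, by contrast, you might see 10 tickets left and then be told at checkout that they are sold out, because 10 other users bought them in the meantime, and you have to refresh the availability. That is acceptable: reading a remaining-ticket count that is no longer accurate causes no real business problem. So when designing these two systems, you can choose different transaction isolation levels to get different concurrency behavior. The isolation levels are a trade-off between the system's concurrency and the strictness of its data logic."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"2. How Databases Implement Transactions"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Databases implement transactions in many different ways, but the basic principles are similar. They all assign transaction IDs in a unified way, they all record each transaction's status (whether it succeeded or failed), and they all need support at the storage layer to identify which data was inserted, modified, or deleted by which transaction. They also keep transaction logs and manage transactions systematically in order to provide atomicity, consistency, and durability."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The most basic way to implement isolation is locking, which serializes concurrent operations where necessary so that data operations stay logically correct. But to keep concurrent performance good, an implementation of isolation has to find a reasonable balance."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Most databases, including Oracle, MySQL, and Postgres, use MVCC (multi-version concurrency control) for concurrency control so that the system can sustain high concurrency. The details of MVCC differ from database to database, but the basic principle is similar."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"3. How MVCC Works"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With MVCC, the same query can return different versions of the result depending on the order in which the relevant transactions ran and on the isolation level. This keeps most queries from being blocked by modifications while still keeping the data logically correct. In short, it trades storage space for concurrency."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The following uses Postgres as an example of one way to implement MVCC. The figure below illustrates how the most basic row visibility rules in Postgres achieve multi-version control."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/aa\/7f\/aab6fa135d994fa59206cc3aca71967f.png","alt":null,"title":"(Image from the web; will be removed on request)","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First, every transaction in Postgres has an ID, which you can think of simply as assigned in time order: the larger the ID, the later the transaction happened. Second, every row in the database stores the ID of the transaction that created it (Cre) and, once the row is deleted, the ID of the transaction that deleted it (Exp); Postgres keeps these in the xmin and xmax system columns. In other words, a transaction ID in the Exp column means the row has been deleted. So which rows should a given transaction be able to see? Every transaction in Postgres holds a snapshot of the system's transaction state. The snapshot records the highest (latest) transaction ID in the system at the time the transaction was created, together with the IDs of the transactions still in progress. In the snapshot shown in the figure above, the highest transaction ID is 100 and the in-progress transactions are 25, 50, and 75. For the six rows on the left, visibility works out as annotated:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Row 1: Cre 30, not deleted. Transaction 30 had committed by point 100, so the row is visible."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Row 2: Cre 50, not deleted, but transaction 50 is still in progress and has not committed, so the row is not visible."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Row 3: Cre 110, not deleted, but at point 100 transaction 110 has not happened yet, so the row is not visible."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Row 4: Cre 30, Exp 80. Transaction 80 deleted the row and has already committed, so the row is not visible."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Row 5: Cre 30, Exp 75. The row was created by 30 and deleted by 75, but transaction 75 has not committed at point 100, so the deletion has not taken effect and the row is visible."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Row 6: Cre 30, Exp 110. The row was created by 30 and deleted by 110, but at point 100 transaction 110 has not happened yet, so the row is visible."}]}]}
]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"That is this transaction's visibility for the six rows, in other words one version of the data. As an exercise: if another transaction's snapshot records a highest transaction ID of 110 with transaction 50 in progress, which rows should it see?"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"You can also see that deleting a row in Postgres just records the deleting transaction's ID in that row's Exp column. This acts as a deletion marker and the data is not physically removed, which is why a Postgres database needs periodic cleanup (VACUUM). We have assumed here that every transaction eventually commits successfully. Visibility in real-world Postgres is more complicated than described, because some transactions never commit; that is not covered here."}]}
]}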
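The row-by-row visibility walkthrough above can be sketched in a few lines of Python. This is only a minimal sketch of the article's simplified model, not how Postgres actually implements visibility: the names `Snapshot`, `Row`, and `is_visible` are hypothetical, and the code assumes, as the article does, that any transaction that is no newer than the snapshot's highest ID and is not in progress has committed.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Snapshot:
    """Simplified snapshot: highest transaction ID plus in-progress IDs."""
    highest_xid: int
    in_progress: FrozenSet[int]

    def committed(self, xid: int) -> bool:
        # Assumption from the article: every transaction eventually commits,
        # so "not newer than the snapshot and not in progress" means committed.
        return xid <= self.highest_xid and xid not in self.in_progress


@dataclass(frozen=True)
class Row:
    """A row version with its creating (Cre) and deleting (Exp) transaction."""
    cre: int
    exp: Optional[int] = None  # None means the row was never deleted


def is_visible(row: Row, snap: Snapshot) -> bool:
    # The creating transaction must be committed as of the snapshot.
    if not snap.committed(row.cre):
        return False
    # A row without a deleter is visible once its creator is committed.
    if row.exp is None:
        return True
    # A deletion only hides the row if the deleter is committed too.
    return not snap.committed(row.exp)


# The six rows and the snapshot from the figure.
ROWS: List[Row] = [
    Row(30), Row(50), Row(110),
    Row(30, 80), Row(30, 75), Row(30, 110),
]
snapshot = Snapshot(highest_xid=100, in_progress=frozenset({25, 50, 75}))

visibility = [is_visible(row, snapshot) for row in ROWS]
print(visibility)  # [True, False, False, False, True, True]
```

Swapping in a different snapshot, for example `Snapshot(110, frozenset({50}))`, lets you check your own answer to the exercise above under the same assumptions.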