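Key Background
A correct B+ tree concurrency control mechanism must satisfy the following requirements:
Correct reads:
- R.1 A read never observes a key-value pair in an intermediate state, that is, one that another write operation is in the middle of modifying.
- R.2 A read never misses a key-value pair that exists. The hazard: while a read is visiting a tree node, a concurrent write (a split or merge) moves key-value pairs from that node to another node, so the read fails to find its target.
Correct writes:
- W.1 Two writes never modify the same key-value pair at the same time.
Deadlock freedom:
- D.1 No deadlock occurs, i.e. two or more threads never block (wait) permanently, each waiting for a resource held by another blocked thread.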
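Terminology
- SL (Shared Lock): acquire a shared lock
- SU (Shared Unlock): release a shared lock
- XL (Exclusive Lock): acquire an exclusive lock
- XU (Exclusive Unlock): release an exclusive lock
- SXL (Shared Exclusive Lock): acquire a shared-exclusive lock
- SXU (Shared Exclusive Unlock): release a shared-exclusive lock
- R.1/R.2/W.1/D.1: the correctness requirements that the concurrency mechanism must satisfy
- safe nodes: a node is safe if the current operation on it cannot affect its ancestors. Taking a traditional B+ tree as an example: (1) for an insert, a node holding fewer than M key-value pairs will not split, so it is a safe node, whereas a node holding exactly M is an unsafe node; (2) for a delete, a node holding more than M/2 key-value pairs will not trigger a merge, so it is a safe node, whereas a node holding exactly M/2 is an unsafe node. In MySQL, however, whether a node is safe depends on several factors such as the size of the key-value pair and the free space left in the page; for the details see btr_cur_will_modify_tree() in MySQL 5.7.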
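The textbook rule above can be written down as a small predicate. This is only a sketch of the traditional B+ tree criterion, not of what btr_cur_will_modify_tree() does in MySQL; the names Op, is_safe_node and max_keys (standing in for M) are hypothetical.

```cpp
#include <cstddef>

// Kind of operation about to be applied to a node (hypothetical name).
enum class Op { Insert, Delete };

// Returns true when applying `op` to a node that currently holds `num_keys`
// key-value pairs cannot affect the node's ancestors, following the
// traditional B+ tree rule. `max_keys` plays the role of M.
bool is_safe_node(Op op, std::size_t num_keys, std::size_t max_keys) {
    if (op == Op::Insert) {
        // Fewer than M entries: one more entry fits without a split.
        return num_keys < max_keys;
    }
    // More than M/2 entries: one entry can go without triggering a merge.
    return num_keys > max_keys / 2;
}
```

With M = 4, a node holding 4 entries is unsafe for an insert and a node holding 2 entries is unsafe for a delete.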
   | S  | SX | X  |
---+----+----+----+
 S | o  | o  | x  |
---+----+----+----+
SX | o  | x  | x  |
---+----+----+----+
 X | x  | x  | x  |
---+----+----+----+
(o: compatible, x: conflicting)
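The S and X locks keep their previous semantics; nothing has changed for them.
SX conflicts with SX and X and is compatible with S. Internally it is defined as RW_SX_LATCH. According to its description, holding an SX lock does not affect reads but blocks writes.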
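The compatibility rules can be encoded as a lookup table. The sketch below simply transcribes the S/SX/X matrix; Latch, kCompatible and compatible are hypothetical names, not InnoDB identifiers.

```cpp
#include <array>
#include <cstddef>

// Latch modes, in the same order as the rows and columns of the matrix.
enum class Latch : std::size_t { S = 0, SX = 1, X = 2 };

// kCompatible[held][requested] is true when a latch in mode `requested` can
// be granted to another thread while a latch in mode `held` is outstanding.
constexpr std::array<std::array<bool, 3>, 3> kCompatible = {{
    /* held S  */ {{true,  true,  false}},
    /* held SX */ {{true,  false, false}},
    /* held X  */ {{false, false, false}},
}};

constexpr bool compatible(Latch held, Latch requested) {
    return kCompatible[static_cast<std::size_t>(held)]
                      [static_cast<std::size_t>(requested)];
}
```

For example, compatible(Latch::SX, Latch::S) is true (readers proceed under an SX holder) while compatible(Latch::SX, Latch::SX) is false (a second SX holder has to wait).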
SMO (Structure Modification Operation) Process
MySQL 5.6 SMO Analysis
MySQL 5.6 SMO Code Analysis
row_ins_clust_index_entry
| | ==> //optimistic insert
| | ==> err = row_ins_clust_index_entry_low(0, BTR_MODIFY_LEAF, index, n_uniq, entry, n_ext, thr);
| | ==> mtr_start(&mtr);
| |
| |
| | ==> //mode is BTR_MODIFY_LEAF or BTR_MODIFY_TREE
| | ==> btr_cur_search_to_nth_level(index, 0, entry, PAGE_CUR_LE, mode, &cursor, 0, __FILE__, __LINE__, &mtr);
| | ==> //acquire the index lock: X for BTR_MODIFY_TREE, S for BTR_MODIFY_LEAF
| | ==> mtr_x_lock(dict_index_get_lock(index), mtr);
| | ==> mtr_s_lock(dict_index_get_lock(index), mtr);
| | ==> //get the logical address of the root node
| | ==> space = dict_index_get_space(index);
| | ==> page_no = dict_index_get_page(index);
| | ==> //non-leaf nodes are searched without a latch; the leaf node is latched in S or X mode
| | ==> if (height != 0) {rw_latch = RW_NO_LATCH;}
| | ==> else{rw_latch = latch_mode;}
| | ==>
| | ==> block = buf_page_get_gen(space, zip_size, page_no, rw_latch, guess, buf_mode, file, line, mtr);
| | ==> if(height == ULINT_UNDEFINED) determine the tree height
| | ==> // if(height == 0 && optimistic insert), release the index lock
| | ==> mtr_release_s_latch_at_savepoint(dict_index_get_lock(index))
| | ==> //search within the page, http://blog.itpub.net/7728585/viewspace-2144744/
| | ==> page_cur_search_with_match
| | ==> /* Go to the child node */
| | ==> page_no = btr_node_ptr_get_child_page_no(node_ptr, offsets);
| | ==> //optimistic insert
| | ==> err = btr_cur_optimistic_insert(flags, &cursor, &offsets, &offsets_heap, entry, &insert_rec, &big_rec, n_ext, thr, &mtr)
| | ==> //check whether there is enough space
| | ==> dict_index_get_space_reserve
| | ==> mtr_commit(&mtr);
| | ==> //pessimistic insert
| | ==> if (err == DB_FAIL)row_ins_clust_index_entry_low(0, BTR_MODIFY_TREE, index, n_uniq, entry, n_ext, thr)
| | ==> mtr_start(&mtr);
| |
| |
| | ==> //mode is BTR_MODIFY_LEAF or BTR_MODIFY_TREE
| | ==> btr_cur_search_to_nth_level(index, 0, entry, PAGE_CUR_LE, mode, &cursor, 0, __FILE__, __LINE__, &mtr);
| | ==> //try an optimistic insert first
| | ==> err = btr_cur_optimistic_insert(flags, &cursor, &offsets, &offsets_heap,
entry, &insert_rec, &big_rec, n_ext, thr, &mtr);
| | ==> //pessimistic insert
| | ==> err = btr_cur_pessimistic_insert(flags, &cursor, &offsets, &offsets_heap,
entry, &insert_rec, &big_rec, n_ext, thr, &mtr);
| | ==> btr_page_split_and_insert(flags, cursor, offsets, heap, entry, n_ext, mtr)
| | ==> //modify the tree structure
| | ==> btr_attach_half_pages
| | ==> btr_insert_on_non_leaf_level(flags, index, level + 1, node_ptr_upper, mtr)
| | ==> //this modifies the parent node and may trigger a further SMO
| | ==> btr_insert_on_non_leaf_level_func
| | ==> //release the index lock
| | ==> mtr_memo_release(mtr, dict_index_get_lock(cursor->index), MTR_MEMO_X_LOCK);
| | ==> //move records to the new page
| | ==> page_zip_copy_recs
| | ==> page_delete_rec_list_start
| | ==> //delete the records from the old page
| | ==> page_delete_rec_list_end
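MySQL 5.6 SMO Process Analysis
Optimistic insert:
- Acquire the index S lock (nodes along the path are not latched)
- X-latch the leaf node
- Release the index lock
- Modify the leaf
- Release the X latch on the leaf node
Pessimistic insert:
- Acquire the index X lock (nodes along the path are not latched)
- X-latch the leaf node
- Modify the tree
- Release the index lock
- Modify the leaf
- Release the X latch on the leaf node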
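The two step lists above can be mimicked with a toy index whose "tree" is just a sorted list of leaves plus separator keys. This is a minimal sketch under strong simplifying assumptions (one level of leaves, no deletes, leaf latches taken in X mode only); ToyIndex56, Leaf and every member name are hypothetical and none of this is InnoDB code.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <utility>
#include <vector>

struct Leaf {
    std::mutex latch;       // page latch, X mode only in this toy
    std::vector<int> keys;  // sorted keys stored in the leaf
};

class ToyIndex56 {
public:
    explicit ToyIndex56(std::size_t capacity)
        : capacity_(std::max<std::size_t>(capacity, 2)) {
        leaves_.push_back(std::make_unique<Leaf>());
    }

    // Optimistic attempt first, pessimistic retry when the leaf is full.
    void insert(int key) {
        if (!optimistic_insert(key)) pessimistic_insert(key);
    }

    std::size_t leaf_count() {
        std::shared_lock<std::shared_mutex> index_s(index_lock_);
        return leaves_.size();
    }

    // Collects every key in order; used only to inspect the result.
    std::vector<int> scan() {
        std::shared_lock<std::shared_mutex> index_s(index_lock_);
        std::vector<int> out;
        for (auto& leaf : leaves_) {
            std::lock_guard<std::mutex> leaf_x(leaf->latch);
            out.insert(out.end(), leaf->keys.begin(), leaf->keys.end());
        }
        return out;
    }

private:
    // Caller must hold index_lock_ (S or X): separators_ is tree structure.
    std::size_t find_leaf(int key) const {
        return std::upper_bound(separators_.begin(), separators_.end(), key) -
               separators_.begin();
    }

    static void put(Leaf& leaf, int key) {
        leaf.keys.insert(
            std::upper_bound(leaf.keys.begin(), leaf.keys.end(), key), key);
    }

    bool optimistic_insert(int key) {
        std::shared_lock<std::shared_mutex> index_s(index_lock_);  // index S lock
        Leaf& leaf = *leaves_[find_leaf(key)];            // path is not latched
        std::unique_lock<std::mutex> leaf_x(leaf.latch);  // leaf X latch
        index_s.unlock();                                 // release index lock
        if (leaf.keys.size() >= capacity_) return false;  // would need a split
        put(leaf, key);                                   // mod leaf
        return true;                    // leaf latch is released on return
    }

    void pessimistic_insert(int key) {
        std::unique_lock<std::shared_mutex> index_x(index_lock_);  // index X lock
        std::size_t i = find_leaf(key);
        Leaf* leaf = leaves_[i].get();
        std::unique_lock<std::mutex> leaf_x(leaf->latch);  // leaf X latch
        if (leaf->keys.size() >= capacity_) {              // mod tree: split
            auto right = std::make_unique<Leaf>();
            auto mid = leaf->keys.begin() + leaf->keys.size() / 2;
            right->keys.assign(mid, leaf->keys.end());
            leaf->keys.erase(mid, leaf->keys.end());
            separators_.insert(separators_.begin() + i, right->keys.front());
            if (key >= right->keys.front()) {
                leaf = right.get();  // the key belongs to the new page
                leaf_x = std::unique_lock<std::mutex>(leaf->latch);
            }
            leaves_.insert(leaves_.begin() + i + 1, std::move(right));
        }
        index_x.unlock();  // release index lock
        put(*leaf, key);   // mod leaf; leaf latch is released on return
    }

    std::size_t capacity_;
    std::shared_mutex index_lock_;
    std::vector<int> separators_;  // separators_[i] = smallest key of leaf i+1
    std::vector<std::unique_ptr<Leaf>> leaves_;
};
```

The point of the sketch is the ordering: the index lock is always released before the leaf is modified, and a split happens only while the index lock is held in X mode, so at most one SMO runs at a time and it also blocks every search that needs the index S lock.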
MySQL 5.7 SMO Analysis
MySQL 5.7 SMO Code Analysis
row_ins_clust_index_entry
| | ==> //optimistic insert
| | ==> err = row_ins_clust_index_entry_low(0, BTR_MODIFY_LEAF, index, n_uniq, entry, n_ext, thr);
| | ==> mtr_start(&mtr);
| |
| |
| | ==> //mode is BTR_MODIFY_LEAF or BTR_MODIFY_TREE
| | ==> btr_pcur_open(index, entry, PAGE_CUR_LE, mode, &pcur, &mtr);
| | ==> btr_pcur_open_low(dict_index_t* index, /*!< in: index */
ulint level, /*!< in: level in the btree */ (here is 0)
const dtuple_t* tuple, /*!< in: tuple on which search done */
page_cur_mode_t mode, /*!< in: PAGE_CUR_L, ...; */
ulint latch_mode,/*!< in: BTR_SEARCH_LEAF, ... */
btr_pcur_t* cursor, /*!< in: memory buffer for persistent cursor */
const char* file, /*!< in: file name */
ulint line, /*!< in: line where called */
mtr_t* mtr);
| | ==> //a huge function of about 1300 lines; mode is BTR_MODIFY_LEAF or BTR_MODIFY_TREE
| | ==> btr_cur_search_to_nth_level(index, level, tuple, mode, latch_mode, btr_cursor, 0, file, line, mtr);
| | ==> //Store the position of the tree latch we push to mtr in memo stack for locks,
| | ==> //so that we know how to release it when we have latched leaf node(s)
| | ==> savepoint = mtr_set_savepoint(mtr);
| | ==> //for BTR_MODIFY_TREE the index lock is most likely taken in SX mode; in other cases it is taken in S mode: https://yq.aliyun.com/articles/41087
| | ==> mtr_sx_lock(dict_index_get_lock(index), mtr);
| | ==> upper_rw_latch = RW_X_LATCH;
| | ==> mtr_s_lock(dict_index_get_lock(index), mtr);
| | ==> upper_rw_latch = RW_S_LATCH;
| | ==> //determine how the root node should be latched
| | ==> root_leaf_rw_latch = btr_cur_latch_for_root_leaf(latch_mode);
| | ==> //get the logical address of the root node
| | ==> page_id_t page_id(space, dict_index_get_page(index));
| | ==> loop:
| | ==> rw_latch = RW_NO_LATCH;
| | ==> //for a path node or the root node, in the case of (optimistic insert) or (the target is a non-leaf level and that level has been reached):
| | ==> //for the root node: rw_latch = RW_SX_LATCH; for other nodes: rw_latch = upper_rw_latch;
| | ==> //for a leaf node in the optimistic insert case: rw_latch = latch_mode
| | ==> tree_savepoints[n_blocks] = mtr_set_savepoint(mtr);
| | ==> //for a pessimistic insert, a page that is still above the target level is read into the buffer pool without a latch
| | ==> block = buf_page_get_gen(page_id, page_size, rw_latch, guess, buf_mode, file, line, mtr);
| | ==> //the blocks read in are kept in the tree_blocks array
| | ==> tree_blocks[n_blocks] = block;
| | ==> //when the B-tree consists of the root node only, it is handled separately
| | ==> upper_rw_latch = root_leaf_rw_latch;
| | ==> //some deadlock-style checks performed before latching
| | ==> buf_block_dbg_add_level
| | ==> //read the tree height at the root level
| | ==> height = btr_page_get_level(page, mtr);
| | ==> //at the leaf level of a pessimistic insert, X-latch the current page and its left and right siblings
| | ==> btr_cur_latch_leaves
| | ==> //at the leaf level of an optimistic insert, release the index lock and the S latches along the path
| | ==> mtr_release_s_latch_at_savepoint(dict_index_get_lock(index))
| | ==> mtr_release_block_at_savepoint
| | ==> //search within the page
| | ==> page_cur_search_with_match_bytes
| | ==> //if the target level has not been reached yet
| | ==> if (level != height) {
| | ==> height--;
| | ==> //check whether the current page may undergo a structural change
| | ==> btr_cur_will_modify_tree
| | ==> //if the current page is safe, release the latches acquired earlier
| | ==> mtr_release_block_at_savepoint
| | ==> //if the next level is the target level, X-latch the unsafe nodes on the path
| | ==> mtr_block_x_latch_at_savepoint
| | ==> /* Go to the child node */
| | ==> page_id.reset(space, btr_node_ptr_get_child_page_no(node_ptr, offsets));
| | ==> n_blocks++;
| | ==> goto loop
| | ==> }
| | ==> //optimistic insert
| | ==> err = btr_cur_optimistic_insert(flags, &cursor, &offsets, &offsets_heap, entry, &insert_rec, &big_rec, n_ext, thr, &mtr)
| | ==> //check whether there is enough space
| | ==> dict_index_get_space_reserve
| | ==> mtr_commit(&mtr);
| | ==> //pessimistic insert
| | ==> if (err == DB_FAIL)row_ins_clust_index_entry_low(0, BTR_MODIFY_TREE, index, n_uniq, entry, n_ext, thr)
| | ==> mtr_start(&mtr);
| |
| |
| | ==> //mode is BTR_MODIFY_LEAF or BTR_MODIFY_TREE
| | ==> btr_pcur_open(index, entry, PAGE_CUR_LE, mode, &pcur, &mtr);
| | ==> //try an optimistic insert first
| | ==> err = btr_cur_optimistic_insert(flags, &cursor, &offsets, &offsets_heap,
entry, &insert_rec, &big_rec, n_ext, thr, &mtr);
| | ==> //pessimistic insert
| | ==> err = btr_cur_pessimistic_insert(flags, &cursor, &offsets, &offsets_heap,
entry, &insert_rec, &big_rec, n_ext, thr, &mtr);
| | ==> btr_page_split_and_insert(flags, cursor, offsets, heap, entry, n_ext, mtr)
| | ==> //modify the tree structure
| | ==> btr_attach_half_pages
| | ==> btr_insert_on_non_leaf_level(flags, index, level + 1, node_ptr_upper, mtr)
| | ==> //this modifies the parent node and may trigger a further SMO
| | ==> btr_insert_on_non_leaf_level_func
| | ==> //move records to the new page
| | ==> page_zip_copy_recs
| | ==> page_delete_rec_list_start
| | ==> //delete the records from the old page
| | ==> page_delete_rec_list_end
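MySQL 5.7 SMO Process Analysis
Pessimistic insert:
- Acquire the index SX lock, which serializes SMOs
- SX-latch the root (apparently)
- X-latch the nodes on the path that may be unsafe
- X-latch the leaf node
- Modify the tree
- Release the index lock
- Modify the leaf
- Release the node latches and the path latches
Optimistic insert:
- Acquire the index S lock
- S-latch the nodes along the path
- Latch the leaf node in S or X mode
- Release the index lock and the path latches
- Modify the leaf
- Release the node latch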
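To make the "X-latch the nodes on the path that may be unsafe" step concrete, the sketch below computes which nodes of a root-to-leaf path are still held once the descent of a pessimistic insert reaches the leaf. It is an illustration under assumptions: it uses the textbook safe-node rule (fewer than M entries) in place of btr_cur_will_modify_tree(), it ignores the sibling leaves and the separate SX latch on the root, and the function and parameter names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// path_num_keys[i] is the entry count of the i-th node on the root-to-leaf
// path (index 0 is the root, the last element is the leaf); max_keys is M.
// Returns the path indexes that remain X-latched when the leaf is reached.
std::vector<std::size_t> latched_after_descent(
        const std::vector<std::size_t>& path_num_keys, std::size_t max_keys) {
    std::vector<std::size_t> held;
    if (path_num_keys.empty()) return held;

    const std::size_t leaf = path_num_keys.size() - 1;
    std::size_t first_kept = 0;
    for (std::size_t i = 0; i < leaf; ++i) {
        // A safe internal node absorbs the new node pointer without
        // splitting, so everything above it can be released. The safe node
        // itself stays latched because it may receive that pointer.
        if (path_num_keys[i] < max_keys) first_kept = i;
    }
    for (std::size_t i = first_kept; i <= leaf; ++i) held.push_back(i);
    return held;
}
```

With M = 4 and a path whose nodes hold {4, 2, 4, 4} entries, the root is released as soon as the second node turns out to be safe, and indexes 1, 2 and 3 stay latched. Only that subtree is blocked by the SMO, which is why other operations can keep running elsewhere in the tree under the index SX lock.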
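Parallel SMO
Two cases:
The compute node is fast and the storage node is slow: Force Apply
The compute node is slow and the storage node is fast: version numbers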