MySQL InnoDB Locking Summary (Part 2): The RC Locking Process

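The previous article in this InnoDB locking summary discussed a large number of cases showing how MySQL takes locks under RR (REPEATABLE READ). This one is more practical: it focuses on how to analyse and resolve problems when you run into lock waits or deadlocks. Every case here is based on the RC (READ COMMITTED) isolation level, on MySQL 5.7.x.

As we know, compared with RR, RC improves concurrency considerably and lowers the probability of deadlocks, which is why it is the first choice for most high-concurrency scenarios.

But lowering is not eliminating: if the indexes are poorly designed or the statements poorly written, deadlocks and similar problems can still occur. This article is built around a real-world case.

Suppose the database contains the table below, where id is the primary key and none of the other columns are indexed. The table holds 6 rows; keep their id values in mind, as they will be used in the analysis later.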

mysql> desc device_management_service_mapping;
+---------------+--------------+------+-----+---------+----------------+
| Field         | Type         | Null | Key | Default | Extra          |
+---------------+--------------+------+-----+---------+----------------+
| id            | int(11)      | NO   | PRI | NULL    | auto_increment |
| dst_device_id | int(11)      | YES  |     | NULL    |                |
| dst_ip        | varchar(255) | YES  |     | NULL    |                |
| ipp_type      | varchar(255) | YES  |     | NULL    |                |
| operation_id  | int(11)      | NO   |     | NULL    |                |
| packets       | int(11)      | YES  |     | NULL    |                |
| src_device_id | int(11)      | NO   |     | NULL    |                |
| src_ip        | varchar(255) | YES  |     | NULL    |                |
| type          | varchar(255) | YES  |     | NULL    |                |
| created_at    | datetime(6)  | YES  |     | NULL    |                |
| updated_at    | datetime(6)  | YES  |     | NULL    |                |
| description   | varchar(256) | YES  |     | NULL    |                |
+---------------+--------------+------+-----+---------+----------------+

mysql> SELECT id, src_device_id, operation_id FROM device_management_service_mapping;
+----+---------------+--------------+
| id | src_device_id | operation_id |
+----+---------------+--------------+
| 85 |            13 |        10001 |
| 86 |            13 |        10002 |
| 87 |             1 |        10001 |
| 88 |             1 |        10002 |
| 89 |             3 |        10001 |
| 90 |             3 |        10002 |
+----+---------------+--------------+

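Only three columns matter here: id, src_device_id and operation_id. The cases below revolve around them and cover:

  • The RC locking process when there is no index.
  • The RC locking process when there is a secondary index.
  • How RC improves concurrency through semi-consistent read.

Preparation:

Before analysing the cases, we need to collect some log information to help with troubleshooting:

  1. Enable the InnoDB lock log: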
show variables like 'innodb_status_output';
SET GLOBAL innodb_status_output=ON;

show variables like 'innodb_status_output_locks';
SET GLOBAL innodb_status_output_locks=ON;

show variables like '%tx_isolation%';
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
show engine innodb status\G;
  2. Enable the general query log, which is used to analyse the transactions.
SHOW VARIABLES LIKE "general_log%";
SET GLOBAL general_log = 'ON';

Case 1: Locking without an index

Statements in chronological order:

[Session A] begin;
[Session A] SELECT * FROM device_management_service_mapping where src_device_id=1 AND operation_id=10001 FOR UPDATE;
            Query ok.
[Session B] begin;
[Session B] SELECT * FROM device_management_service_mapping where src_device_id=13 AND operation_id=10001 FOR UPDATE;
            block.
            ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

Since neither src_device_id nor operation_id is indexed, we would expect both Session A and Session B to take their locks through a full table scan:

mysql> explain SELECT * FROM device_management_service_mapping where src_device_id=13 AND operation_id=10001 FOR UPDATE;
+----+-------------+-----------------------------------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table                             | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-----------------------------------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | device_management_service_mapping | ALL  | NULL          | NULL | NULL    | NULL |    6 | Using where |
+----+-------------+-----------------------------------+------+---------------+------+---------+------+------+-------------+

Next, let's analyse the locking process:

After Session A's SELECT statement succeeds, it holds the following locks:

---TRANSACTION 873007, ACTIVE 3 sec
2 lock struct(s), heap size 360, 1 row lock(s)
MySQL thread id 27209, OS thread handle 0x7fecd45e8700, query id 5258428 10.124.206.88 root
TABLE LOCK table `ipsla`.`device_management_service_mapping` trx id 873007 lock mode IX
RECORD LOCKS space id 164 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id 873007 lock_mode X locks rec but not gap
Record lock, heap no 4 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
 0: len 4; hex 80000057; asc    W;;
 1: len 6; hex 0000000d503d; asc     P=;;
 2: len 7; hex 3d000001ae0694; asc =      ;;
 3: len 4; hex 80000003; asc     ;;
 4: len 12; hex 31302e3132342e302e313538; asc 10.124.0.158;;
 5: len 4; hex 49505030; asc IPP0;;
 6: len 4; hex 80002711; asc   ' ;;
 7: len 4; hex 800000c8; asc     ;;
 8: len 4; hex 80000001; asc     ;;
 9: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
 10: len 8; hex 696e7465726e6574; asc internet;;
 11: len 8; hex 99a812da1a000000; asc         ;;
 12: SQL NULL;
 13: len 1; hex 31; asc 1;;

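As you can see, even though this is a full table scan, not every row is locked once the statement has run. Under RC, InnoDB locks each row it examines during the scan and then releases the lock on any row that does not satisfy the condition. So only the row with id=87 (field 0 above, hex 80000057) remains locked.

Session B then runs its SELECT statement and is blocked. Its locks look like this: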
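The row ids quoted in this article are read off field 0 of each record dump. As a small aid for reading such logs, here is a sketch of how the hex values can be decoded. It assumes the column is a signed INT (which InnoDB stores big-endian with the sign bit flipped) or an ASCII/UTF-8 string; the function names are illustrative only and not part of any MySQL tooling.

```python
def decode_innodb_int(hex_str: str) -> int:
    """Decode a signed integer as printed in an InnoDB record dump.

    InnoDB stores signed integers big-endian with the sign bit flipped,
    so that byte-wise comparison matches numeric order.
    """
    raw = int(hex_str, 16)
    sign_bit = 1 << (len(hex_str) * 4 - 1)
    return raw - sign_bit  # undo the flipped sign bit


def decode_innodb_text(hex_str: str) -> str:
    """Decode a character field printed as hex (assumes ASCII/UTF-8 data)."""
    return bytes.fromhex(hex_str).decode("utf-8")


print(decode_innodb_int("80000057"))                   # field 0 (id): 87
print(decode_innodb_int("80002711"))                   # field 6 (operation_id): 10001
print(decode_innodb_text("31302e3132342e302e313538"))  # field 4 (dst_ip): 10.124.0.158
```

Fields 1 and 2 of a clustered index record are the hidden transaction id and roll pointer, which is why the user columns start at field 3.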

---TRANSACTION 873008, ACTIVE 14 sec fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 360, 2 row lock(s)
MySQL thread id 27208, OS thread handle 0x7fecd4522700, query id 5258431 10.124.206.88 root Sending data
SELECT * FROM device_management_service_mapping where src_device_id=13 AND operation_id=10001 FOR UPDATE
------- TRX HAS BEEN WAITING 14 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 164 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id 873008 lock_mode X locks rec but not gap waiting
Record lock, heap no 4 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
 0: len 4; hex 80000057; asc    W;;
 1: len 6; hex 0000000d503d; asc     P=;;
 2: len 7; hex 3d000001ae0694; asc =      ;;
 3: len 4; hex 80000003; asc     ;;
 4: len 12; hex 31302e3132342e302e313538; asc 10.124.0.158;;
 5: len 4; hex 49505030; asc IPP0;;
 6: len 4; hex 80002711; asc   ' ;;
 7: len 4; hex 800000c8; asc     ;;
 8: len 4; hex 80000001; asc     ;;
 9: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
 10: len 8; hex 696e7465726e6574; asc internet;;
 11: len 8; hex 99a812da1a000000; asc         ;;
 12: SQL NULL;
 13: len 1; hex 31; asc 1;;

------------------
TABLE LOCK table `ipsla`.`device_management_service_mapping` trx id 873008 lock mode IX
RECORD LOCKS space id 164 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id 873008 lock_mode X locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
 0: len 4; hex 80000055; asc    U;;
 1: len 6; hex 0000000d5032; asc     P2;;
 2: len 7; hex 37000003351908; asc 7   5  ;;
 3: len 4; hex 80000001; asc     ;;
 4: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
 5: len 4; hex 49505030; asc IPP0;;
 6: len 4; hex 80002711; asc   ' ;;
 7: len 4; hex 800000c8; asc     ;;
 8: len 4; hex 8000000d; asc     ;;
 9: len 12; hex 31302e3132342e302e313539; asc 10.124.0.159;;
 10: len 8; hex 696e7465726e6574; asc internet;;
 11: len 8; hex 99a812da14000000; asc         ;;
 12: SQL NULL;
 13: len 2; hex 3131; asc 11;;

RECORD LOCKS space id 164 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id 873008 lock_mode X locks rec but not gap waiting
Record lock, heap no 4 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
 0: len 4; hex 80000057; asc    W;;
 1: len 6; hex 0000000d503d; asc     P=;;
 2: len 7; hex 3d000001ae0694; asc =      ;;
 3: len 4; hex 80000003; asc     ;;
 4: len 12; hex 31302e3132342e302e313538; asc 10.124.0.158;;
 5: len 4; hex 49505030; asc IPP0;;
 6: len 4; hex 80002711; asc   ' ;;
 7: len 4; hex 800000c8; asc     ;;
 8: len 4; hex 80000001; asc     ;;
 9: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
 10: len 8; hex 696e7465726e6574; asc internet;;
 11: len 8; hex 99a812da1a000000; asc         ;;
 12: SQL NULL;
 13: len 1; hex 31; asc 1;;

---TRANSACTION 873007, ACTIVE 41 sec
2 lock struct(s), heap size 360, 1 row lock(s)
MySQL thread id 27209, OS thread handle 0x7fecd45e8700, query id 5258428 10.124.206.88 root
TABLE LOCK table `ipsla`.`device_management_service_mapping` trx id 873007 lock mode IX
RECORD LOCKS space id 164 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id 873007 lock_mode X locks rec but not gap
Record lock, heap no 4 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
 0: len 4; hex 80000057; asc    W;;
 1: len 6; hex 0000000d503d; asc     P=;;
 2: len 7; hex 3d000001ae0694; asc =      ;;
 3: len 4; hex 80000003; asc     ;;
 4: len 12; hex 31302e3132342e302e313538; asc 10.124.0.158;;
 5: len 4; hex 49505030; asc IPP0;;
 6: len 4; hex 80002711; asc   ' ;;
 7: len 4; hex 800000c8; asc     ;;
 8: len 4; hex 80000001; asc     ;;
 9: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
 10: len 8; hex 696e7465726e6574; asc internet;;
 11: len 8; hex 99a812da1a000000; asc         ;;
 12: SQL NULL;
 13: len 1; hex 31; asc 1;;

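Session A corresponds to thread id 27209, and its locks have not changed at all.

Let's focus on Session B, MySQL thread id 27208.

With no usable index, Session B performs a full table scan starting from id=85. Session A holds a write lock only on the row with id=87, so Session B can take the X lock on id=85, which matches its filter and is retained (heap no 2 in the log). id=86 is locked in the same way, but because it does not match the filter its lock is released again. The scan then reaches id=87, hits a lock wait, and blocks.

The following two excerpts from the log confirm the lock wait: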

------- **TRX HAS BEEN WAITING 14 SEC FOR THIS LOCK TO BE GRANTED:**
RECORD LOCKS space id 164 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id 873008 lock_mode X locks rec but not gap waiting
Record lock, heap no 4 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
 0: len 4; hex 80000057; asc    W;;
 
 RECORD LOCKS space id 164 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id 873008 **lock_mode X locks rec but not gap waiting**
Record lock, heap no 4 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
 0: len 4; hex 80000057; asc    W;;

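Note that the locking scan always runs from the first row to the last, whether or not a matching row has already been found, because without an index a full table scan is required. Rows that do not satisfy the condition are unlocked again as the scan proceeds, which is why Session B holds only id=85 while it waits on id=87.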
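The behaviour observed in Case 1 can be sketched as a toy model. This is a deliberately simplified simulation, not InnoDB's actual algorithm: it assumes rows are visited in primary key order, that a non-matching row is unlocked as soon as it has been examined, and that the scan simply stops at the first row locked by another transaction. The function and variable names are made up for illustration.

```python
# Rows from the article's table: id -> (src_device_id, operation_id)
ROWS = {85: (13, 10001), 86: (13, 10002), 87: (1, 10001),
        88: (1, 10002), 89: (3, 10001), 90: (3, 10002)}


def scan_for_update(rows, wanted, locked_by_others):
    """Simplified model of a SELECT ... FOR UPDATE full scan under RC.

    Returns (retained_ids, blocked_on_id). blocked_on_id is None when the
    scan reaches the end of the table without waiting.
    """
    retained = []
    for row_id in sorted(rows):          # clustered index order
        if row_id in locked_by_others:
            return retained, row_id      # must wait for the other transaction
        # The row is X-locked here, then examined.
        if rows[row_id] == wanted:
            retained.append(row_id)      # matching row: keep the lock
        # Non-matching row: the lock is released straight away.
    return retained, None


a_locks, a_wait = scan_for_update(ROWS, (1, 10001), set())
b_locks, b_wait = scan_for_update(ROWS, (13, 10001), set(a_locks))
print(a_locks, a_wait)  # [87] None
print(b_locks, b_wait)  # [85] 87
```

The result for Session B lines up with the monitor output above: one retained row lock (id=85) and one lock wait (id=87).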

Case 2: Locking without an index, leading to a deadlock

Statements in chronological order:

[Session A] begin;
[Session A] SELECT * FROM device_management_service_mapping where src_device_id=3 AND operation_id=10001 FOR UPDATE;
            Query ok.
[Session B] begin;
[Session A] UPDATE device_management_service_mapping SET description='test' WHERE id=89;
[Session B] SELECT * FROM device_management_service_mapping where src_device_id=13 AND operation_id=10001 FOR UPDATE;
            block;
[Session A] SELECT * FROM device_management_service_mapping where src_device_id=3 AND operation_id=10002 FOR UPDATE;
[Session B] ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction.
[Session A] Query ok.

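Let's walk through what happens:

  1. First, Session A runs its SELECT ... FOR UPDATE. Because src_device_id and operation_id are not indexed, this is a full table scan: every row in the primary key index is X-locked as it is examined, and once the query finishes only the row lock on id=89 is still held.
  2. Session A then updates the row with id=89, that is, the row with src_device_id=3 AND operation_id=10001.
  3. Session B then searches for src_device_id=13 AND operation_id=10001. It also has to do a full table scan, attempting to X-lock every row in the primary key index. id=89 is already held by Session A, so Session B blocks. You may ask why the search does not stop once the matching data has been found in the very first row: without an index, the whole table still has to be scanned. At this point Session B holds the write lock on id=85 and is waiting for the write lock on id=89.
  4. Next, Session A runs another FOR UPDATE statement, which needs a fresh full table scan. It requests X locks starting from id=85, but that row is already held by Session B, so Session A blocks and waits for Session B to release it.
  5. This forms a deadlock: Session B waits for Session A to release the X lock on id=89, while Session A waits for Session B to release the X lock on id=85. A deadlock error is raised.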
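The circular wait in steps 3 to 5 can be pictured as a cycle in a "waits-for" graph. The sketch below is only an illustration of that idea, under the simplifying assumption that each blocked transaction waits for exactly one other transaction; it is not a description of InnoDB's internal deadlock detector, and `find_cycle` is a made-up helper.

```python
def find_cycle(waits_for):
    """Return a list of transactions forming a wait cycle, or None.

    waits_for maps each blocked transaction to the transaction that
    holds the lock it is waiting for.
    """
    for start in waits_for:
        path = [start]
        current = waits_for.get(start)
        while current is not None and current not in path:
            path.append(current)
            current = waits_for.get(current)
        if current is not None:
            # current is already on the path: keep only the cyclic part
            return path[path.index(current):]
    return None


# Step 3: Session B holds id=85 and waits for id=89 (held by Session A).
waits_for = {"Session B": "Session A"}
print(find_cycle(waits_for))  # None

# Step 4: Session A restarts its scan at id=85 (held by Session B).
waits_for["Session A"] = "Session B"
print(find_cycle(waits_for))  # ['Session B', 'Session A']
```

As soon as the second edge is added the graph contains a cycle, and one of the two transactions has to be rolled back to break it.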

Next, let's go through the deadlock log in detail:

------------------------
LATEST DETECTED DEADLOCK
------------------------
2020-12-09 13:30:34 7fecdd3eb700
*** (1) TRANSACTION:
TRANSACTION 872502, ACTIVE 0 sec fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 360, 2 row lock(s)
MySQL thread id 27234, OS thread handle 0x7fec0feb5700, query id 5255123 10.124.207.150 root Sending data
SELECT `device_management_service_mapping`.`id`, `device_management_service_mapping`.`dst_device_id`, `device_management_se_management_service_mapping`.`operation_id`, `device_management_service_mapping`.`packets`, `device_management_service_management_service_mapping`.`type`, `device_management_service_mapping`.`created_at`, `device_management_service_mapping`.`uement_service_mapping` WHERE (`device_management_service_mapping`.`operation_id` = 10001 AND `device_management_service_ma
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 164 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id
Record lock, heap no 6 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
 0: len 4; hex 80000059; asc    Y;;
 1: len 6; hex 0000000d5035; asc     P5;;
 2: len 7; hex 38000002731b5e; asc 8   s ^;;
 3: len 4; hex 80000001; asc     ;;
 4: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
 5: len 4; hex 49505030; asc IPP0;;
 6: len 4; hex 80002711; asc   ' ;;
 7: len 4; hex 800000c8; asc     ;;
 8: len 4; hex 80000003; asc     ;;
 9: len 12; hex 31302e3132342e302e313538; asc 10.124.0.158;;
 10: len 8; hex 696e7465726e6574; asc internet;;
 11: len 8; hex 99a812da19000000; asc         ;;
 12: SQL NULL;
 13: SQL NULL;

*** (2) TRANSACTION:
TRANSACTION 872501, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
3 lock struct(s), heap size 360, 2 row lock(s), undo log entries 1
MySQL thread id 27233, OS thread handle 0x7fecdd3eb700, query id 5255127 10.124.207.150 root Sending data
SELECT `device_management_service_mapping`.`id`, `device_management_service_mapping`.`dst_device_id`, `device_management_se_management_service_mapping`.`operation_id`, `device_management_service_mapping`.`packets`, `device_management_service_management_service_mapping`.`type`, `device_management_service_mapping`.`created_at`, `device_management_service_mapping`.`uement_service_mapping` WHERE (`device_management_service_mapping`.`operation_id` = 10002 AND `device_management_service_ma
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 164 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id
Record lock, heap no 6 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
 0: len 4; hex 80000059; asc    Y;;
 1: len 6; hex 0000000d5035; asc     P5;;
 2: len 7; hex 38000002731b5e; asc 8   s ^;;
 3: len 4; hex 80000001; asc     ;;
 4: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
 5: len 4; hex 49505030; asc IPP0;;
 6: len 4; hex 80002711; asc   ' ;;
 7: len 4; hex 800000c8; asc     ;;
 8: len 4; hex 80000003; asc     ;;
 9: len 12; hex 31302e3132342e302e313538; asc 10.124.0.158;;
 10: len 8; hex 696e7465726e6574; asc internet;;
 11: len 8; hex 99a812da19000000; asc         ;;
 12: SQL NULL;
 13: SQL NULL;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 164 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id
Record lock, heap no 2 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
 0: len 4; hex 80000055; asc    U;;
 1: len 6; hex 0000000d5032; asc     P2;;
 2: len 7; hex 37000003351908; asc 7   5  ;;
 3: len 4; hex 80000001; asc     ;;
 4: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
 5: len 4; hex 49505030; asc IPP0;;
 6: len 4; hex 80002711; asc   ' ;;
 7: len 4; hex 800000c8; asc     ;;
 8: len 4; hex 8000000d; asc     ;;
 9: len 12; hex 31302e3132342e302e313539; asc 10.124.0.159;;
 10: len 8; hex 696e7465726e6574; asc internet;;
 11: len 8; hex 99a812da14000000; asc         ;;
 12: SQL NULL;
 13: len 2; hex 3131; asc 11;;

*** WE ROLL BACK TRANSACTION (1)

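The log shows two transactions, (1) TRANSACTION and (2) TRANSACTION, running on thread ids 27234 and 27233 respectively.

First, (1) TRANSACTION:

(1) WAITING FOR THIS LOCK TO BE GRANTED: means the transaction is blocked, waiting to acquire a lock.

The lock it wants is: Record lock, heap no 6 PHYSICAL RECORD: n_fields 14; compact format; info bits 0, which corresponds to an X lock.

The record it wants to lock is 0: len 4; hex 80000059; asc Y;; that is, the row with id=89.

Now (2) TRANSACTION:

*** (2) HOLDS THE LOCK(S): shows the locks it currently holds.

Record lock, heap no 6 PHYSICAL RECORD: n_fields 14; compact format; info bits 0: the lock held is an X lock.

The locked record is 0: len 4; hex 80000059; asc Y;; that is, the row with id=89.

*** (2) WAITING FOR THIS LOCK TO BE GRANTED: shows the lock it wants but is blocked on. Again it is an X lock.

The record it wants to lock is 0: len 4; hex 80000055; asc U;; that is, the row with id=85.

That is the deadlock. MySQL resolves it by rolling back TRANSACTION (1) and letting transaction 2 continue.

Mapped back to the example above, transaction 2 is Session A and transaction 1 is Session B. The end result is that Session A succeeds and Session B is rolled back.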
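Reading the record ids out of a deadlock report by hand is error-prone, so a small parser can help. The following is a rough sketch that only handles the layout seen in the log above: it assumes every lock section refers to the PRIMARY index and that field 0 is a 4-byte signed INT key. `summarize_deadlock` is a hypothetical helper, and `sample` is an abbreviated excerpt of the log shown earlier.

```python
import re

SECTION_RE = re.compile(
    r"^\*\*\* \((\d+)\) (WAITING FOR THIS LOCK TO BE GRANTED|HOLDS THE LOCK\(S\)):")
FIELD0_RE = re.compile(r"^\s*0: len 4; hex ([0-9a-f]{8});")


def summarize_deadlock(text):
    """Return [(transaction_no, 'WAITING' | 'HOLDS', primary_key_id), ...].

    Only the first record after each section header is reported.
    """
    result, pending = [], None
    for line in text.splitlines():
        header = SECTION_RE.match(line)
        if header:
            kind = "WAITING" if header.group(2).startswith("WAITING") else "HOLDS"
            pending = (int(header.group(1)), kind)
            continue
        field0 = FIELD0_RE.match(line)
        if field0 and pending:
            key = int(field0.group(1), 16) - 0x80000000  # undo flipped sign bit
            result.append(pending + (key,))
            pending = None
    return result


sample = """\
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
Record lock, heap no 6 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
 0: len 4; hex 80000059; asc    Y;;
*** (2) HOLDS THE LOCK(S):
Record lock, heap no 6 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
 0: len 4; hex 80000059; asc    Y;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
Record lock, heap no 2 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
 0: len 4; hex 80000055; asc    U;;
"""
for entry in summarize_deadlock(sample):
    print(entry)
```

For the excerpt, this reports that transaction 1 waits on id=89, while transaction 2 holds id=89 and waits on id=85.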

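This raises a question: why does TRANSACTION (1), which has already found its matching row at id=85, go on to request id=89? And why does the second statement of TRANSACTION (2) want to lock id=85, a row that does not match its condition?

The reason is that neither operation_id nor src_device_id is indexed, so any locking read has to start at the first row, id=85, and scan the entire table.

So although RC already reduces the types and scope of locks, deadlocks can still occur easily if the right columns are not indexed.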

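Case 3: Semi-consistent read, improving RC concurrency

Let's start with the definition from the official documentation. Under RC:

  • For UPDATE or DELETE statements, InnoDB holds locks only for the rows that it updates or deletes. After MySQL has evaluated the WHERE condition, the record locks on non-matching rows are released. This greatly reduces the probability of deadlocks, but they can still happen (Case 2 is an example).

  • For UPDATE statements under RC, if a row is already locked, InnoDB performs a semi-consistent read: it returns the latest committed version of the row so that MySQL can determine whether the row matches the WHERE condition. If it does not match, no lock is taken on the record; if it does match, MySQL reads the row again, and this time InnoDB either locks it or waits for a lock on it.

Here is a concrete example: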

# Initialise a table t
CREATE TABLE t (a INT NOT NULL, b INT) ENGINE = InnoDB;
INSERT INTO t VALUES (1,2),(2,3),(3,2),(4,3),(5,2);
COMMIT;

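Note that neither a nor b is indexed, so searches use the hidden clustered index (primary key index).

Suppose there are two sessions, with statements in chronological order:

[Session A] START TRANSACTION;
[Session A] UPDATE t SET b = 5 WHERE b = 3;
[Session B] START TRANSACTION;
[Session B] UPDATE t SET b = 4 WHERE b = 2;

For Session A: it locks each row of the table as it scans, and releases the locks on the rows that do not match.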

x-lock(1,2); unlock(1,2) # lock released
x-lock(2,3); update(2,3) to (2,5); retain x-lock # lock kept
x-lock(3,2); unlock(3,2) # lock released
x-lock(4,3); update(4,3) to (4,5); retain x-lock # lock kept
x-lock(5,2); unlock(5,2) # lock released

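For Session B: InnoDB performs a semi-consistent read. It first returns the latest committed version of each row, and the WHERE condition is then used to decide whether the row needs to be locked for update. Rows (1,2), (3,2) and (5,2) match, and their locks can be acquired, so they are updated.

Rows (2,3) and (4,3) do not satisfy the WHERE condition, so they are not locked. This means there is no conflict with the locks held by Session A, and the update proceeds normally.

They are still seen as (2,3) and (4,3), rather than (2,5) and (4,5), because Session A has not committed yet.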

x-lock(1,2); update(1,2) to (1,4); retain x-lock
x-lock(2,3); unlock(2,3) # lock released
x-lock(3,2); update(3,2) to (3,4); retain x-lock
x-lock(4,3); unlock(4,3) # lock released
x-lock(5,2); update(5,2) to (5,4); retain x-lock

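Remember Case 1, where the second of two FOR UPDATE statements was blocked? Here nothing is blocked, because semi-consistent read is an optimisation applied to UPDATE statements, and it improves concurrency.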
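The decision Session B makes for each row can be modelled in a few lines. This is a toy model of the semi-consistent read rule as described above, not InnoDB code: it assumes the table is scanned in order of column a, represents each row as a mapping from a to the latest committed b, and merely records the rows on which a real server would have to wait. All names are illustrative.

```python
def semi_consistent_update(committed, locks_held_by_others, where, new_b):
    """Simplified model of an UPDATE scan with semi-consistent read.

    committed: dict a -> b holding the latest committed version of each row.
    Returns (updated_rows, skipped_rows, would_wait_on).
    """
    updated, skipped, would_wait = [], [], []
    for a in sorted(committed):
        b = committed[a]
        if a in locks_held_by_others:
            # Row is locked: test WHERE against the latest committed version.
            if where(b):
                would_wait.append(a)    # a real server would now wait for the lock
            else:
                skipped.append(a)       # no match: move on without waiting
        elif where(b):
            updated.append((a, new_b))  # lock retained, row updated
        # else: lock taken and released straight away
    return updated, skipped, would_wait


committed = {1: 2, 2: 3, 3: 2, 4: 3, 5: 2}  # table t as last committed
session_a_locks = {2, 4}                    # rows Session A updated, not yet committed
print(semi_consistent_update(committed, session_a_locks, lambda b: b == 2, 4))
# ([(1, 4), (3, 4), (5, 4)], [2, 4], [])
```

The empty third list is the point of the example: Session B never has to wait, because the only rows locked by Session A fail its WHERE condition in their committed form.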

Now consider a different situation:

CREATE TABLE t (a INT NOT NULL, b INT, c INT, INDEX (b)) ENGINE = InnoDB;
INSERT INTO t VALUES (1,2,3),(2,2,4);
COMMIT;

# Session A
START TRANSACTION;
UPDATE t SET b = 3 WHERE b = 2 AND c = 3;

# Session B
UPDATE t SET b = 4 WHERE b = 2 AND c = 4;

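With a secondary index on b, the outcome is different: semi-consistent read no longer helps, and Session B's statement is blocked.

First, for Session A, InnoDB follows the WHERE condition into the index on b and locks the records with b=2.

Session B then also goes through the index on b and tries to lock every record with b=2, but Session A already holds record locks on b=2, so Session B blocks.

In other words, semi-consistent read does not take effect here. When InnoDB uses an index, only the indexed column is considered when taking and retaining record locks; in these examples the optimisation only helped for the scan over the clustered (primary key) index.

Semi-consistent read only happens under RC, or when innodb_locks_unsafe_for_binlog is enabled.

Case 4: Locking with a non-unique index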

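Statements in chronological order:

[Session A] begin;
[Session A] SELECT * FROM device_management_service_mapping where src_device_id=1 FOR UPDATE;
            Query ok.
[Session B] begin;
[Session B] SELECT * FROM device_management_service_mapping where src_device_id=13 FOR UPDATE;
            Query ok.

A secondary index has now been added on src_device_id, so let's look at the locking again.

After Session A runs, as shown below, X locks are held on the primary key index records id=87 and id=88 (hex 80000057 and 80000058), and write locks on the two secondary index entries with src_device_id=1.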

---TRANSACTION 912995, ACTIVE 8 sec
3 lock struct(s), heap size 360, 4 row lock(s)
MySQL thread id 33924, OS thread handle 0x7fec0fe31700, query id 5483145 10.124.206.88 root
TABLE LOCK table `ipsla`.`device_management_service_mapping` trx id 912995 lock mode IX
RECORD LOCKS space id 166 page no 6 n bits 80 index `device_management_service_mapping_src_device_id_84c09d1d` of table `ipsla`.`device_management_service_mapping` trx id 912995 lock_mode X locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 4; hex 80000001; asc     ;;
 1: len 4; hex 80000058; asc    X;;

Record lock, heap no 8 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 4; hex 80000001; asc     ;;
 1: len 4; hex 80000057; asc    W;;

RECORD LOCKS space id 166 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id 912995 lock_mode X locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 15; compact format; info bits 0
 0: len 4; hex 80000058; asc    X;;
 1: len 6; hex 0000000decd0; asc       ;;
 2: len 7; hex ab0000026d0110; asc     m  ;;
 3: len 4; hex 8000000d; asc     ;;
 4: len 12; hex 31302e3132342e302e313539; asc 10.124.0.159;;
 5: len 4; hex 49505030; asc IPP0;;
 6: len 4; hex 80002712; asc   ' ;;
 7: len 4; hex 800000c8; asc     ;;
 8: len 4; hex 80000001; asc     ;;
 9: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
 10: len 8; hex 696e7465726e6574; asc internet;;
 11: len 8; hex 99a814b22d000000; asc     -   ;;
 12: SQL NULL;
 13: SQL NULL;
 14: len 7; hex 312c3130303032; asc 1,10002;;

Record lock, heap no 8 PHYSICAL RECORD: n_fields 15; compact format; info bits 0
 0: len 4; hex 80000057; asc    W;;
 1: len 6; hex 0000000deccd; asc       ;;
 2: len 7; hex a90000015f0110; asc     _  ;;
 3: len 4; hex 80000003; asc     ;;
 4: len 12; hex 31302e3132342e302e313538; asc 10.124.0.158;;
 5: len 4; hex 49505030; asc IPP0;;
 6: len 4; hex 80002711; asc   ' ;;
 7: len 4; hex 800000c8; asc     ;;
 8: len 4; hex 80000001; asc     ;;
 9: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
 10: len 8; hex 696e7465726e6574; asc internet;;
 11: len 8; hex 99a814b22d000000; asc     -   ;;
 12: SQL NULL;
 13: SQL NULL;
 14: len 7; hex 312c3130303031; asc 1,10001;;

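After Session B runs, look at the transaction listed first: it holds write locks on the primary key index records id=85 and id=86, and X locks on the two secondary index entries with src_device_id=13.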

---TRANSACTION 913004, ACTIVE 3 sec
3 lock struct(s), heap size 360, 4 row lock(s)
MySQL thread id 33925, OS thread handle 0x7fec0feb5700, query id 5483176 10.124.206.88 root
TABLE LOCK table `ipsla`.`device_management_service_mapping` trx id 913004 lock mode IX
RECORD LOCKS space id 166 page no 6 n bits 80 index `device_management_service_mapping_src_device_id_84c09d1d` of table `ipsla`.`device_management_service_mapping` trx id 913004 lock_mode X locks rec but not gap
Record lock, heap no 3 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 4; hex 8000000d; asc     ;;
 1: len 4; hex 80000055; asc    U;;

Record lock, heap no 6 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 4; hex 8000000d; asc     ;;
 1: len 4; hex 80000056; asc    V;;

RECORD LOCKS space id 166 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id 913004 lock_mode X locks rec but not gap
Record lock, heap no 6 PHYSICAL RECORD: n_fields 15; compact format; info bits 0
 0: len 4; hex 80000056; asc    V;;
 1: len 6; hex 0000000decdb; asc       ;;
 2: len 7; hex b1000001930110; asc        ;;
 3: len 4; hex 80000003; asc     ;;
 4: len 12; hex 31302e3132342e302e313538; asc 10.124.0.158;;
 5: len 4; hex 49505030; asc IPP0;;
 6: len 4; hex 80002712; asc   ' ;;
 7: len 4; hex 800000c8; asc     ;;
 8: len 4; hex 8000000d; asc     ;;
 9: len 12; hex 31302e3132342e302e313539; asc 10.124.0.159;;
 10: len 8; hex 696e7465726e6574; asc internet;;
 11: len 8; hex 99a814b227000000; asc     '   ;;
 12: SQL NULL;
 13: SQL NULL;
 14: len 8; hex 31332c3130303032; asc 13,10002;;

Record lock, heap no 10 PHYSICAL RECORD: n_fields 15; compact format; info bits 0
 0: len 4; hex 80000055; asc    U;;
 1: len 6; hex 0000000decd8; asc       ;;
 2: len 7; hex af000001650110; asc     e  ;;
 3: len 4; hex 80000001; asc     ;;
 4: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
 5: len 4; hex 49505030; asc IPP0;;
 6: len 4; hex 80002711; asc   ' ;;
 7: len 4; hex 800000c8; asc     ;;
 8: len 4; hex 8000000d; asc     ;;
 9: len 12; hex 31302e3132342e302e313539; asc 10.124.0.159;;
 10: len 8; hex 696e7465726e6574; asc internet;;
 11: len 8; hex 99a814b227000000; asc     '   ;;
 12: SQL NULL;
 13: SQL NULL;
 14: len 8; hex 31332c3130303031; asc 13,10001;;

---TRANSACTION 912995, ACTIVE 99 sec
3 lock struct(s), heap size 360, 4 row lock(s)
MySQL thread id 33924, OS thread handle 0x7fec0fe31700, query id 5483145 10.124.206.88 root
TABLE LOCK table `ipsla`.`device_management_service_mapping` trx id 912995 lock mode IX
RECORD LOCKS space id 166 page no 6 n bits 80 index `device_management_service_mapping_src_device_id_84c09d1d` of table `ipsla`.`device_management_service_mapping` trx id 912995 lock_mode X locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 4; hex 80000001; asc     ;;
 1: len 4; hex 80000058; asc    X;;

Record lock, heap no 8 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 4; hex 80000001; asc     ;;
 1: len 4; hex 80000057; asc    W;;

RECORD LOCKS space id 166 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id 912995 lock_mode X locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 15; compact format; info bits 0
 0: len 4; hex 80000058; asc    X;;
 1: len 6; hex 0000000decd0; asc       ;;
 2: len 7; hex ab0000026d0110; asc     m  ;;
 3: len 4; hex 8000000d; asc     ;;
 4: len 12; hex 31302e3132342e302e313539; asc 10.124.0.159;;
 5: len 4; hex 49505030; asc IPP0;;
 6: len 4; hex 80002712; asc   ' ;;
 7: len 4; hex 800000c8; asc     ;;
 8: len 4; hex 80000001; asc     ;;
 9: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
 10: len 8; hex 696e7465726e6574; asc internet;;
 11: len 8; hex 99a814b22d000000; asc     -   ;;
 12: SQL NULL;
 13: SQL NULL;
 14: len 7; hex 312c3130303032; asc 1,10002;;

Record lock, heap no 8 PHYSICAL RECORD: n_fields 15; compact format; info bits 0
 0: len 4; hex 80000057; asc    W;;
 1: len 6; hex 0000000deccd; asc       ;;
 2: len 7; hex a90000015f0110; asc     _  ;;
 3: len 4; hex 80000003; asc     ;;
 4: len 12; hex 31302e3132342e302e313538; asc 10.124.0.158;;
 5: len 4; hex 49505030; asc IPP0;;
 6: len 4; hex 80002711; asc   ' ;;
 7: len 4; hex 800000c8; asc     ;;
 8: len 4; hex 80000001; asc     ;;
 9: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
 10: len 8; hex 696e7465726e6574; asc internet;;
 11: len 8; hex 99a814b22d000000; asc     -   ;;
 12: SQL NULL;
 13: SQL NULL;
 14: len 7; hex 312c3130303031; asc 1,10001;;

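What stands out here is that Session B was not blocked when taking its locks. The session first does a tree search on the secondary index to find the entries with src_device_id=13, then iterates from that position, locking as it goes.

The first entry locked is src_device_id=13, id=85. Because the index is not unique, the scan continues and locks the entry src_device_id=13, id=86. It then moves on, finds that nothing further satisfies the condition, and stops. Note that the secondary index is ordered by src_device_id (1, 3, 13), so the entries with src_device_id=13 sit after those with src_device_id=1, and the scan never passes over the entries Session A has locked.

When I first wrote this example, I expected Session B to block: I assumed the scan would continue on to the entry id=87, src_device_id=1, on which Session A already holds an X lock. The logs show otherwise: the two sessions lock disjoint sets of secondary index entries and primary key rows, so neither has to wait for the other.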
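To see which entries each session touches, the secondary index can be pictured as a sorted list of (src_device_id, id) pairs. The sketch below uses Python's bisect module as a stand-in for the B-tree lookup; it is only a mental model, built from the six rows shown at the start of the article, and `entries_for` is a hypothetical helper.

```python
from bisect import bisect_left, bisect_right

# (src_device_id, id) pairs, kept sorted like entries in the secondary index.
index_entries = sorted([(13, 85), (13, 86), (1, 87), (1, 88), (3, 89), (3, 90)])


def entries_for(src_device_id):
    """Return the index entries an equality lookup on src_device_id matches."""
    lo = bisect_left(index_entries, (src_device_id,))
    hi = bisect_right(index_entries, (src_device_id, float("inf")))
    return index_entries[lo:hi]


session_a = entries_for(1)
session_b = entries_for(13)
print(session_a)                        # [(1, 87), (1, 88)]
print(session_b)                        # [(13, 85), (13, 86)]
print(set(session_a) & set(session_b))  # set()
```

The two lookups return disjoint ranges, which is consistent with the monitor output: Session A locks the entries for ids 87 and 88, Session B those for ids 85 and 86.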

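Deadlock analysis workflow

To wrap up, here is a brief summary of how to troubleshoot when a deadlock or similar problem occurs:

  1. Pull out the SQL log and analyse the execution by thread id. I wrote a simple script for this; it prints the statements of each thread grouped together.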
import re
from collections import defaultdict

raw_str = """
201209 13:30:22 27225 Connect   [email protected] on ipsla
                27225 Query     SET autocommit=0
                27225 Query     SET autocommit=1
                27225 Query     SET SESSION TRANSACTION ISOLATI
...............................
"""

# A thread id is assumed to be a 5-digit number surrounded by whitespace.
THREAD_ID = re.compile(r'\s(\d{5})\s')

# Group the log lines by thread id, keeping the original order within each thread.
lines_by_thread = defaultdict(list)
for line in raw_str.split('\n'):
    match = THREAD_ID.search(line)
    if match:
        lines_by_thread[match.group(1)].append(line)

# Print each thread's statements together, followed by a separator.
for thread_id, thread_lines in lines_by_thread.items():
    for line in thread_lines:
        print(line)
    print('---------------- new Thread -------------------------------')
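  2. Take the deadlocked thread ids and lock information reported by show engine innodb status\G; and match them against the grouped output to reconstruct the order of execution.
  3. Reproduce the problem, draw a conclusion, and optimise.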

