Does MGR single-primary mode actually perform conflict detection?

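A colleague and I were discussing whether MGR in single-primary mode performs conflict detection.

My understanding was that it does not need to: only the primary is allowed to write, so data conflicts cannot occur, and running conflict detection would just waste performance.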

In single-primary mode, Group Replication enforces that only a single server writes to the group, so compared to multi-primary mode, consistency checking can be less strict and DDL statements do not need to be handled with any extra care

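The phrase "less strict" is confusing. It implies that conflict detection still exists, but the text does not say how it differs from multi-primary mode.

When I studied MGR earlier I had read articles by Wen Zhenghu of NetEase. I searched on the way home and found two of them:

MySQL MGR事務認證機制優化 (Optimizing the MySQL MGR transaction certification mechanism)

MySQL事務在MGR中的漫遊記 - 事務認證 (The journey of a MySQL transaction through MGR: transaction certification)

Honestly, I cannot follow the source code in his articles. But I do not want to believe something just because someone said it, so I went looking for the primary source.

When I got home I searched for conflict_detection_enable.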

That led me to this site: https://s0dev0mysql0com.icopy.site/doc/dev/mysql-server/latest/classCertifier.html

So what is the upstream source of that site? Another search turned up:

https://dev.mysql.com/doc/dev/mysql-server/latest/classCertifier.html
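This page explains it clearly.

In single-primary mode, conflict detection is performed only after the primary fails, while the new primary is applying the old primary's transactions.

This reminded me of an experiment I did earlier, "Stress test 2: does 5.7 MGR have to apply all binlogs before a new primary is elected?"

In fact the official 5.7 documentation describes this.

In single-primary mode: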

When a new primary is elected, it is only writable once it has processed all of the transactions that came from the old primary. This avoids possible concurrency issues between old transactions from the old primary and the new ones being executed on this member. It is a good practice to wait for the new primary to apply its replication related relay-log before re-routing client applications to it.

https://dev.mysql.com/doc/refman/5.7/en/group-replication-single-primary-mode.html

The 8.0 documentation puts it this way:

When a new primary is elected or appointed, it might have a backlog of changes that had been applied on the old primary but have not yet been applied on this server. In this situation, until the new primary catches up with the old primary, read-write transactions might result in conflicts and be rolled back, and read-only transactions might result in stale reads.

https://dev.mysql.com/doc/refman/8.0/en/group-replication-single-primary-mode.html

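This is quite reasonable. Suppose a single-primary MGR cluster has three nodes: A, B, and C.

A is the primary. The app inserts three rows into table T1, with primary key values 1, 2, and 3.

B and C have received the binlog events but have not applied them yet. A then crashes and B is elected the new primary, so B must apply the three inserts (1, 2, 3) that originated on A. Without conflict detection, if the application inserted new rows with keys 1, 2, 3 before B applied the originals, there would clearly be a problem. So conflict detection is required during this phase.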
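To make the A/B/C scenario concrete, here is a toy Python sketch. It is not how MGR works internally: the table is just a dict, and `apply_backlog` is a hypothetical helper I made up for illustration. It only shows why a write on the new primary that races ahead of the old primary's backlog ends in a primary-key clash.

```python
# Toy model of the A/B/C failover scenario. This is NOT MGR's implementation;
# the table is a plain dict and apply_backlog is a hypothetical helper.

def apply_backlog(table, backlog):
    """Apply queued inserts from the old primary; return the keys that clash."""
    clashes = []
    for pk, value in backlog:
        if pk in table:
            clashes.append(pk)  # a row with this key was already written locally
        else:
            table[pk] = value
    return clashes

# B has received, but not yet applied, A's three inserts into T1.
backlog_from_a = [(1, "from A"), (2, "from A"), (3, "from A")]

# Case 1: B accepts a client write before draining the backlog, unchecked.
t1_on_b = {}
t1_on_b[1] = "from client"
clashes_unchecked = apply_backlog(t1_on_b, backlog_from_a)

# Case 2: B drains the backlog first; the client write is then an ordinary
# duplicate-key error that can be rejected up front.
t1_on_b_safe = {}
clashes_safe = apply_backlog(t1_on_b_safe, backlog_from_a)
client_write_rejected = 1 in t1_on_b_safe

print(clashes_unchecked)      # [1]
print(clashes_safe)           # []
print(client_write_rejected)  # True
```

In case 1 the old primary's row with key 1 can no longer be applied on B, which is exactly the situation that either blocking writes or running conflict detection is meant to prevent.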

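Reading closely, the 5.7 and 8.0 descriptions differ quite a lot. 5.7 says:

When a new primary is elected, it is only writable once it has processed all of the transactions that came from the old primary. In other words, the new primary becomes writable only after it has applied all of the old primary's transactions.

8.0 says: In this situation, until the new primary catches up with the old primary, read-write transactions might result in conflicts and be rolled back, and read-only transactions might result in stale reads.

In other words, until the new primary has applied all of the old primary's transactions, writes may conflict and be rolled back, and reads may return stale data.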

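My guess is that 5.7 single-primary mode turns conflict detection off entirely: the new primary is not writable until it has applied the old primary's transactions, and being read-only is what avoids conflicts. This still needs an experiment to check whether non-conflicting transactions can run before the new primary finishes applying the old primary's transactions. For example, write a large volume of data into T1 to build up transactions_behind, then, once the new primary is elected, write into T2, which clearly does not conflict.

If that guess is right, 8.0 improves on 5.7: conflict detection is switched on while the new primary applies the old primary's transactions, so in the experiment above the application could already run "non-conflicting" transactions.

But is that acceptable to the application? It may be perfectly reasonable for the application to want the new primary to become writable only after it has applied all of the old primary's transactions, which is why the parameter below exists.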

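8.0.14 added the group_replication_consistency parameter, which addresses the stale-read problem at its root. (My earlier impression was that writes wait for the whole backlog to be applied even without setting this parameter, but that does not match the 8.0 text quoted above, which says read-write transactions may conflict and be rolled back until the new primary catches up.)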

BEFORE_ON_PRIMARY_FAILOVER

New RO or RW transactions with a newly elected primary that is applying backlog from the old primary are held (not applied) until any backlog has been applied. This ensures that when a primary failover happens, intentionally or not, clients always see the latest value on the primary. This guarantees consistency, but means that clients must be able to handle the delay in the event that a backlog is being applied. Usually this delay should be minimal, but does depend on the size of the backlog.

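During a failover, transactions that connect to the new primary are blocked until the previously committed transactions have been replayed. This ensures that clients read the latest data on the primary after a failover, which guarantees consistency.

The description above is not rigorous. It comes from the article MySQL MGR"一致性讀寫"特性解讀 by Tian Peng of Zhishutang. In fact read-only transactions also wait, except for the following (see this translation: https://cloud.tencent.com/developer/article/1478455):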

SHOW commands
SET option
DO
EMPTY
USE
SELECTing from performance_schema database
SELECTing from table PROCESSLIST on database infoschema
SELECTing from sys database
SELECT commands that do not use tables or user-defined functions
STOP GROUP_REPLICATION command
SHUTDOWN command
RESET PERSIST

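My own view is that BEFORE_ON_PRIMARY_FAILOVER does guarantee consistency, but it has a cost. If the new primary lags far behind the old one and takes too long to apply the missing logs, a large number of incoming connections will pile up in a waiting state, Threads_running will spike, and the new primary may even run out of connections and crash.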

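As of 8.0.18, the MGR primary election algorithm is:

1. Pick the member with the lowest version

2. Pick the member with the highest weight

3. Pick the member with the lowest UUID in sort order

So it never considers which node has the least lag.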
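The three rules can be sketched as a single sort key. This is only my reading of the rules listed above, not the server's code: the `Member` type, the `backlog` field, and comparing full version tuples are assumptions made for illustration. The weight stands for the value of group_replication_member_weight.

```python
from typing import NamedTuple

class Member(NamedTuple):
    """Hypothetical view of a group member, for illustration only."""
    name: str
    version: tuple      # e.g. (8, 0, 18); comparing the full tuple is an assumption
    weight: int         # value of group_replication_member_weight
    server_uuid: str
    backlog: int        # transactions still to apply; NOT used by the election

def elect_primary(members):
    """Lowest version first, then highest weight, then lowest UUID."""
    return min(members, key=lambda m: (m.version, -m.weight, m.server_uuid))

members = [
    Member("B", (8, 0, 18), 50, "bbbbbbbb-0000-0000-0000-000000000000", backlog=0),
    Member("C", (8, 0, 18), 80, "cccccccc-0000-0000-0000-000000000000", backlog=100000),
]

print(elect_primary(members).name)  # C
```

C wins on weight even though it has by far the larger backlog, which is the point made above: lag plays no part in the choice.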

https://dev.mysql.com/doc/refman/8.0/en/group-replication-single-primary-mode.html

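So how is this handled in multi-primary mode?

Judging from the parameter description, BEFORE_ON_PRIMARY_FAILOVER should apply only to single-primary mode, so unless the application explicitly sets group_replication_consistency, multi-primary mode can still read stale data.

As for writes, multi-primary mode does perform conflict detection, so consider the following scenario.

A multi-primary MGR group with N1, N2, N3, where only N1 is written to. These rows are inserted into T1(id int primary key, sname varchar(10)):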

1,fan GTID: GROUP_UUID:1
2,bo GTID: GROUP_UUID:2
3,shi GTID: GROUP_UUID:3

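The GTID set is now GROUP_UUID:1-3.

N1 crashes. N2 has applied transaction 1 but has not executed 2 and 3. A client now inserts (2,hehe) on N2, so the version of this insert on N2 is GROUP_UUID:1.

In the conflict-detection database, however, the version of the row id=2 is GROUP_UUID:1-2, so this insert fails conflict detection and is rolled back.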
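The check in this scenario can be sketched in a few lines of Python. This is a simplified model of my understanding, not the real Certifier: versions are plain sets of GTID sequence numbers under one GROUP_UUID, and `parse_gtid_intervals` and `certify` are hypothetical helpers. The rule assumed here is that a write passes only if the stored version of every row it touches is contained in the transaction's own version.

```python
# Simplified model of the version check described above. Not the real
# Certifier: both helpers are hypothetical and exist only for illustration.

def parse_gtid_intervals(spec):
    """'1-3,5' -> {1, 2, 3, 5}: sequence numbers under a single GROUP_UUID."""
    numbers = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        lo, _, hi = part.partition("-")
        numbers.update(range(int(lo), int(hi or lo) + 1))
    return numbers

def certify(snapshot_version, write_set, certification_info):
    """Pass only if every touched row's stored version is within the snapshot."""
    for key in write_set:
        row_version = certification_info.get(key)
        if row_version is not None and not row_version <= snapshot_version:
            return False  # the row changed in a transaction this node has not seen
    return True

# Row versions after N1's three inserts, as in the scenario.
certification_info = {
    ("T1", 1): parse_gtid_intervals("1"),
    ("T1", 2): parse_gtid_intervals("1-2"),
    ("T1", 3): parse_gtid_intervals("1-3"),
}

# N2 has applied only transaction 1 when the client inserts (2,hehe).
n2_snapshot = parse_gtid_intervals("1")
insert_passes = certify(n2_snapshot, [("T1", 2)], certification_info)

print(insert_passes)  # False
```

Once N2 has applied 2 and 3, its version becomes 1-3 and the same check passes, at which point the insert would instead fail as an ordinary duplicate key on id=2.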

This is my understanding; I have not tested it yet.
