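The Problem
We currently replicate data from RDS MySQL to TiDB through five DM tasks. None of them is a shard-merge task, the schemas and tables they replicate do not overlap, safe-mode is not explicitly enabled for any of them, and the Syncer worker count is 16. Apart from the DM tasks there is almost no other write traffic.
After replication started, the TiDB / KV Errors panel in Grafana showed sustained write conflicts, as shown in the figure below.
At the same time, AlertManager fired a large number of alerts about transaction retries.
We tried stopping the DM tasks one by one and found that the write conflicts disappeared as soon as the task with the most traffic (40+ tables, several thousand QPS) was stopped.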
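Analyzing the Problem
The TiDB Server log frequently prints prewrite encounters lock, which indicates lock conflicts in the prewrite phase. Note also that the DM Syncer writes with optimistic transactions.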
[2021/02/08 15:47:40.112 +08:00] [INFO] [2pc.go:822] ["prewrite encounters lock"] [conn=0] [lock="key: {metaKey=true, key=DB:820, field=TID:864}, primary: {metaKey=true, key=DB:820, field=TID:864}, txnStartTS: 422778099492716553, lockForUpdateTS:0, ttl: 3001, type: Put"]
[2021/02/08 15:47:40.114 +08:00] [WARN] [txn.go:66] [RunInNewTxn] ["retry txn"=422778099492716555] ["original txn"=422778099492716555] [error="[kv:9007]Write conflict, txnStartTS=422778099492716555, conflictStartTS=422778099492716553, conflictCommitTS=422778099492716557, key={metaKey=true, key=DB:820, field=TID:864} primary={metaKey=true, key=DB:820, field=TID:864} [try again later]"]
[2021/02/08 15:47:40.116 +08:00] [INFO] [2pc.go:1336] ["2PC clean up done"] [txnStartTS=422778099492716555]
[2021/02/08 15:47:40.126 +08:00] [WARN] [txn.go:66] [RunInNewTxn] ["retry txn"=422778099492716568] ["original txn"=422778099492716568] [error="[kv:9007]Write conflict, txnStartTS=422778099492716568, conflictStartTS=422778099492716567, conflictCommitTS=422778099492716570, key={metaKey=true, key=DB:820, field=TID:862} primary={metaKey=true, key=DB:820, field=TID:862} [try again later]"]
[2021/02/08 15:47:40.127 +08:00] [INFO] [2pc.go:1336] ["2PC clean up done"] [txnStartTS=422778099492716568]
[2021/02/08 15:47:40.305 +08:00] [INFO] [2pc.go:822] ["prewrite encounters lock"] [conn=0] [lock="key: {metaKey=true, key=DB:820, field=TID:862}, primary: {metaKey=true, key=DB:820, field=TID:862}, txnStartTS: 422778099532038211, lockForUpdateTS:0, ttl: 3001, type: Put"]
[2021/02/08 15:47:40.309 +08:00] [WARN] [txn.go:66] [RunInNewTxn] ["retry txn"=422778099532038213] ["original txn"=422778099532038213] [error="[kv:9007]Write conflict, txnStartTS=422778099532038213, conflictStartTS=422778099532038211, conflictCommitTS=422778099545145351, key={metaKey=true, key=DB:820, field=TID:862} primary={metaKey=true, key=DB:820, field=TID:862} [try again later]"]
[2021/02/08 15:47:40.311 +08:00] [INFO] [2pc.go:1336] ["2PC clean up done"] [txnStartTS=422778099532038213]
[2021/02/08 15:47:40.365 +08:00] [INFO] [2pc.go:822] ["prewrite encounters lock"] [conn=0] [lock="key: {metaKey=true, key=DB:820, field=TID:862}, primary: {metaKey=true, key=DB:820, field=TID:862}, txnStartTS: 422778099558252548, lockForUpdateTS:0, ttl: 3001, type: Put"]
[2021/02/08 15:47:40.367 +08:00] [WARN] [txn.go:66] [RunInNewTxn] ["retry txn"=422778099558252549] ["original txn"=422778099558252549] [error="[kv:9007]Write conflict, txnStartTS=422778099558252549, conflictStartTS=422778099558252548, conflictCommitTS=422778099558252551, key={metaKey=true, key=DB:820, field=TID:862} primary={metaKey=true, key=DB:820, field=TID:862} [try again later]"]
[2021/02/08 15:47:40.368 +08:00] [INFO] [2pc.go:1336] ["2PC clean up done"] [txnStartTS=422778099558252549]
[2021/02/08 15:47:41.514 +08:00] [INFO] [2pc.go:822] ["prewrite encounters lock"] [conn=0] [lock="key: {metaKey=true, key=DB:820, field=TID:862}, primary: {metaKey=true, key=DB:820, field=TID:862}, txnStartTS: 422778099859718155, lockForUpdateTS:0, ttl: 3001, type: Put"]
[2021/02/08 15:47:41.516 +08:00] [WARN] [txn.go:66] [RunInNewTxn] ["retry txn"=422778099859718158] ["original txn"=422778099859718158] [error="[kv:9007]Write conflict, txnStartTS=422778099859718158, conflictStartTS=422778099859718155, conflictCommitTS=422778099859718160, key={metaKey=true, key=DB:820, field=TID:862} primary={metaKey=true, key=DB:820, field=TID:862} [try again later]"]
[2021/02/08 15:47:41.517 +08:00] [INFO] [2pc.go:1336] ["2PC clean up done"] [txnStartTS=422778099859718158]
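However, the TiKV logs contained nothing related to write conflicts (almost everything there was about not leader). The section "Troubleshoot Write Conflicts in Optimistic Transactions" in the official documentation did not help either: the log entries above do not identify the conflicting data or primary key, because they carry none of the useful fields such as tableID, indexID, or handle.
So where are log entries of the form key={metaKey=true, key=DB:820, field=TID:862} produced? Since the documentation does not settle the question, we go straight to the source. Part of store/tikv/snapshot.go reads as follows.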
func newWriteConflictError(conflict *pb.WriteConflict) error {
	var buf bytes.Buffer
	prettyWriteKey(&buf, conflict.Key)
	buf.WriteString(" primary=")
	prettyWriteKey(&buf, conflict.Primary)
	return kv.ErrWriteConflict.FastGenByArgs(conflict.StartTs, conflict.ConflictTs, conflict.ConflictCommitTs, buf.String())
}

func prettyWriteKey(buf *bytes.Buffer, key []byte) {
	tableID, indexID, indexValues, err := tablecodec.DecodeIndexKey(key)
	if err == nil {
		_, err1 := fmt.Fprintf(buf, "{tableID=%d, indexID=%d, indexValues={", tableID, indexID)
		// ...
		return
	}
	tableID, handle, err := tablecodec.DecodeRecordKey(key)
	if err == nil {
		_, err3 := fmt.Fprintf(buf, "{tableID=%d, handle=%d}", tableID, handle)
		// ...
		return
	}
	mKey, mField, err := tablecodec.DecodeMetaKey(key)
	if err == nil {
		_, err3 := fmt.Fprintf(buf, "{metaKey=true, key=%s, field=%s}", string(mKey), string(mField))
		// ...
		return
	}
	// ...
}
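As the code shows, prettyWriteKey() prints the conflicting key when a write conflict occurs, and a key printed with metaKey=true is a metadata key. tablecodec.DecodeMetaKey() itself reveals little about the metadata, so we move on to meta/meta.go, whose header comment happens to describe the metadata layout.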
Meta structure:
	NextGlobalID -> int64
	SchemaVersion -> int64
	DBs -> {
		DB:1 -> db meta data []byte
		DB:2 -> db meta data []byte
	}
	DB:1 -> {
		Table:1 -> table meta data []byte
		Table:2 -> table meta data []byte
		TID:1 -> int64
		TID:2 -> int64
	}
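Running curl [tidb_addr]:10080/db-table/[TID] returns the table name and database name for a given TID (which is the same as the tableID). The table with TID 862 is a business table with a heavy write load, but that alone should not normally produce write conflicts this frequent, so the problem has to lie inside the metadata kept for that table.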
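Before looking tables up one by one, it can help to tally the [kv:9007] lines by TID and see which table dominates. The following is a minimal Go sketch, not part of TiDB or DM: CountConflictsByTID and LookupURL are hypothetical helper names, and the regular expression assumes exactly the log layout quoted earlier in this post.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// metaKeyRe matches the conflicting meta key inside a "Write conflict" error,
// e.g. key={metaKey=true, key=DB:820, field=TID:862}.
// Assumption: the log layout is the one shown in this post.
var metaKeyRe = regexp.MustCompile(`Write conflict.*? key=\{metaKey=true, key=DB:(\d+), field=TID:(\d+)\}`)

// CountConflictsByTID returns, for each TID, the number of write-conflict
// log lines whose conflicting key is that table's TID meta field.
// Lines that are not write-conflict errors are ignored.
func CountConflictsByTID(lines []string) map[int64]int {
	counts := make(map[int64]int)
	for _, line := range lines {
		m := metaKeyRe.FindStringSubmatch(line)
		if m == nil {
			continue
		}
		tid, err := strconv.ParseInt(m[2], 10, 64)
		if err != nil {
			continue
		}
		counts[tid]++
	}
	return counts
}

// LookupURL builds the status-port URL used with curl above.
// addr is host:port of a TiDB status endpoint, e.g. "127.0.0.1:10080".
func LookupURL(addr string, tid int64) string {
	return fmt.Sprintf("http://%s/db-table/%d", addr, tid)
}
```

Feeding the TiDB Server log lines into CountConflictsByTID and passing the most frequent TID to LookupURL gives the address to query for the table and database name.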
Reading further down the file, we reach the metadata-related fields.
var (
	mMetaPrefix       = []byte("m")
	mNextGlobalIDKey  = []byte("NextGlobalID")
	mSchemaVersionKey = []byte("SchemaVersionKey")
	mDBs              = []byte("DBs")
	mDBPrefix         = "DB"
	mTablePrefix      = "Table"
	mSequencePrefix   = "SID"
	mSeqCyclePrefix   = "SequenceCycle"
	mTableIDPrefix    = "TID"
	mRandomIDPrefix   = "TARID"
	mBootstrapKey     = []byte("BootstrapKey")
	mSchemaDiffPrefix = "Diff"
)
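Names such as mTableIDPrefix and mRandomIDPrefix suggest that the table metadata holds the current auto-generated IDs. Looking at meta/autoid/autoid.go, the auto-ID allocators (implementations of the Allocator interface) come in the following four kinds, which line up with the metadata prefixes above.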
const (
	// RowIDAllocType indicates the allocator is used to allocate row id.
	RowIDAllocType AllocatorType = iota
	// AutoIncrementType indicates the allocator is used to allocate auto increment value.
	AutoIncrementType
	// AutoRandomType indicates the allocator is used to allocate auto-shard id.
	AutoRandomType
	// SequenceType indicates the allocator is used to allocate sequence value.
	SequenceType
)
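Tracing down from generateAutoIDByAllocType(), the function that generates auto IDs, shows that TiDB handles RowIDAllocType and AutoIncrementType identically. In other words, both row IDs and auto-increment IDs are kept in the value of the metadata key prefixed with TID.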
func generateAutoIDByAllocType(m *meta.Meta, dbID, tableID, step int64, allocType AllocatorType) (int64, error) {
	switch allocType {
	case RowIDAllocType, AutoIncrementType:
		return m.GenAutoTableID(dbID, tableID, step)
	case AutoRandomType:
		return m.GenAutoRandomID(dbID, tableID, step)
	case SequenceType:
		return m.GenSequenceValue(dbID, tableID, step)
	default:
		return 0, ErrInvalidAllocatorType.GenWithStackByArgs()
	}
}

// GenAutoTableID adds step to the auto ID of the table and returns the sum.
func (m *Meta) GenAutoTableID(dbID, tableID, step int64) (int64, error) {
	// Check if DB exists.
	dbKey := m.dbKey(dbID)
	if err := m.checkDBExists(dbKey); err != nil {
		return 0, errors.Trace(err)
	}
	// Check if table exists.
	tableKey := m.tableKey(tableID)
	if err := m.checkTableExists(dbKey, tableKey); err != nil {
		return 0, errors.Trace(err)
	}
	return m.txn.HInc(dbKey, m.autoTableIDKey(tableID), step)
}

func (m *Meta) autoTableIDKey(tableID int64) []byte {
	return []byte(fmt.Sprintf("%s:%d", mTableIDPrefix, tableID))
}
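Every allocation for the table therefore ends in an increment of the same TID field, and under optimistic transactions two overlapping increments of one key cannot both commit. The toy model below illustrates that rule and nothing more. It is a sketch under stated assumptions, not TiKV's Percolator implementation: Store, Txn, and the single in-memory timestamp counter are hypothetical, and the key name is only borrowed from the log above.

```go
package main

import "errors"

// ErrWriteConflict is returned when another transaction committed to a key
// after this transaction took its start timestamp.
var ErrWriteConflict = errors.New("write conflict")

// Store is a toy in-memory store with one timestamp counter (hypothetical).
type Store struct {
	ts       uint64            // last timestamp handed out
	values   map[string]int64  // latest committed value per key
	commitTS map[string]uint64 // commit timestamp of that value
}

func NewStore() *Store {
	return &Store{values: map[string]int64{}, commitTS: map[string]uint64{}}
}

// Txn is an optimistic transaction: it reads from a snapshot taken at Begin
// and buffers its writes until Commit.
type Txn struct {
	store    *Store
	startTS  uint64
	snapshot map[string]int64
	writes   map[string]int64
}

// Begin takes a start timestamp and a snapshot of the committed values.
func (s *Store) Begin() *Txn {
	s.ts++
	snap := make(map[string]int64, len(s.values))
	for k, v := range s.values {
		snap[k] = v
	}
	return &Txn{store: s, startTS: s.ts, snapshot: snap, writes: map[string]int64{}}
}

// Inc adds step to key as seen by this transaction and returns the sum.
func (t *Txn) Inc(key string, step int64) int64 {
	v, ok := t.writes[key]
	if !ok {
		v = t.snapshot[key]
	}
	v += step
	t.writes[key] = v
	return v
}

// Commit fails if any written key was committed by someone else after startTS.
func (t *Txn) Commit() error {
	for k := range t.writes {
		if t.store.commitTS[k] > t.startTS {
			return ErrWriteConflict
		}
	}
	t.store.ts++
	for k, v := range t.writes {
		t.store.values[k] = v
		t.store.commitTS[k] = t.store.ts
	}
	return nil
}

// Get returns the latest committed value of key.
func (s *Store) Get(key string) int64 { return s.values[key] }
```

If two transactions both call Begin and then Inc on the same key, the first Commit succeeds and the second returns ErrWriteConflict, so the loser has to start over with a fresh snapshot. That is the same ordering the log reports: conflictStartTS is smaller than txnStartTS, which is smaller than conflictCommitTS.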
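The schema of the table with TID 862 defines its primary key as bigint(20) NOT NULL AUTO_INCREMENT, so we strongly suspected that this table's auto-increment ID was causing the write conflicts.
DM replication tasks insert data with INSERT INTO VALUES(...) statements, so the next stop is the insertRows() function in executor/insert_common.go, which handles this kind of SQL statement.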
// insertRows processes `insert|replace into values ()` or `insert|replace into set x=y`
func insertRows(ctx context.Context, base insertCommon) (err error) {
	e := base.insertCommon()
	// ...
	e.lazyFillAutoID = true
	// ...
	for i, list := range e.Lists {
		e.rowCount++
		var row []types.Datum
		row, err = evalRowFunc(ctx, list, i)
		if err != nil {
			return err
		}
		rows = append(rows, row)
		if batchInsert && e.rowCount%uint64(batchSize) == 0 {
			// ...
			// Before batch insert, fill the batch allocated autoIDs.
			rows, err = e.lazyAdjustAutoIncrementDatum(ctx, rows)
			if err != nil {
				return err
			}
			// ...
		}
	}
	// ...
}
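According to the comment, lazyAdjustAutoIncrementDatum() fills in the auto IDs for the batch. It first reads the value supplied for the auto-ID column in the inserted row. If that value is neither NULL nor 0, it is used as is, but the function also calls Table.RebaseAutoID() to rebase the auto-ID starting point on that value. RebaseAutoID() in turn calls the Rebase() method of the corresponding Allocator.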
func (e *InsertValues) lazyAdjustAutoIncrementDatum(ctx context.Context, rows [][]types.Datum) ([][]types.Datum, error) {
	// ...
	for processedIdx := 0; processedIdx < rowCount; processedIdx++ {
		autoDatum := rows[processedIdx][idx]
		var err error
		var recordID int64
		if !autoDatum.IsNull() {
			recordID, err = getAutoRecordID(autoDatum, &col.FieldType, true)
			if err != nil {
				return nil, err
			}
		}
		// Use the value if it's not null and not 0.
		if recordID != 0 {
			err = e.Table.RebaseAutoID(e.ctx, recordID, true, autoid.RowIDAllocType)
			if err != nil {
				return nil, err
			}
			e.ctx.GetSessionVars().StmtCtx.InsertID = uint64(recordID)
			retryInfo.AddAutoIncrementID(recordID)
			continue
		}
		// ...
	}
	// ...
}

func (alloc *allocator) Rebase(tableID, requiredBase int64, allocIDs bool) error {
	if tableID == 0 {
		return errInvalidTableID.GenWithStack("Invalid tableID")
	}
	alloc.mu.Lock()
	defer alloc.mu.Unlock()
	if alloc.isUnsigned {
		return alloc.rebase4Unsigned(tableID, uint64(requiredBase), allocIDs)
	}
	return alloc.rebase4Signed(tableID, requiredBase, allocIDs)
}
Both the signed and the unsigned rebase call kv.RunInNewTxn() (note that it appeared in the TiDB log above) to start a new transaction that tries to adjust the auto-ID range.
func (alloc *allocator) rebase4Unsigned(tableID int64, requiredBase uint64, allocIDs bool) error {
	// ...
	err := kv.RunInNewTxn(context.Background(), alloc.store, true, func(ctx context.Context, txn kv.Transaction) error {
		m := meta.NewMeta(txn)
		currentEnd, err1 := getAutoIDByAllocType(m, alloc.dbID, tableID, alloc.allocType)
		if err1 != nil {
			return err1
		}
		uCurrentEnd := uint64(currentEnd)
		if allocIDs {
			newBase = mathutil.MaxUint64(uCurrentEnd, requiredBase)
			newEnd = mathutil.MinUint64(math.MaxUint64-uint64(alloc.step), newBase) + uint64(alloc.step)
		} else {
			if uCurrentEnd >= requiredBase {
				newBase = uCurrentEnd
				newEnd = uCurrentEnd
				return nil
			}
			newBase = requiredBase
			newEnd = requiredBase
		}
		_, err1 = generateAutoIDByAllocType(m, alloc.dbID, tableID, int64(newEnd-uCurrentEnd), alloc.allocType)
		return err1
	})
	// ...
}
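The arithmetic inside that closure is easier to follow when pulled out into a pure function. The sketch below restates only the branch logic visible in the excerpt. RebaseUnsigned is a hypothetical name, the elided parts of rebase4Unsigned are not modeled, and the step value in the usage note is purely illustrative.

```go
package main

import "math"

// maxU64 and minU64 stand in for mathutil.MaxUint64 / mathutil.MinUint64.
func maxU64(a, b uint64) uint64 {
	if a > b {
		return a
	}
	return b
}

func minU64(a, b uint64) uint64 {
	if a < b {
		return a
	}
	return b
}

// RebaseUnsigned restates the branch logic of the closure in rebase4Unsigned.
// currentEnd is the value stored under the TID meta key, requiredBase is the
// explicitly inserted ID, and step is the size of one cached ID range.
// metaWrite reports whether the closure would go on to call
// generateAutoIDByAllocType, i.e. write to the TID meta key.
func RebaseUnsigned(currentEnd, requiredBase, step uint64, allocIDs bool) (newBase, newEnd uint64, metaWrite bool) {
	if allocIDs {
		// Move past both the stored end and the inserted ID, then reserve
		// one more range of size step (guarding against overflow).
		newBase = maxU64(currentEnd, requiredBase)
		newEnd = minU64(math.MaxUint64-step, newBase) + step
		return newBase, newEnd, true
	}
	if currentEnd >= requiredBase {
		// The stored end already covers the inserted ID: nothing to write.
		return currentEnd, currentEnd, false
	}
	return requiredBase, requiredBase, true
}
```

For example, RebaseUnsigned(1000, 5000, 30000, true) yields a new range from 5000 to 35000 and a meta write. With allocIDs set to true, which is what lazyAdjustAutoIncrementDatum() passes, every run of the closure ends in a write to the same TID key.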
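At this point the answer is all but obvious. We asked the business team how they write to this table, and they replied that their inserts explicitly specify the value of the auto-increment column. TiDB maintains auto-increment IDs by caching ID ranges on each instance (see the explanation of AUTO_INCREMENT in the official documentation), so explicitly inserted values are very likely to force frequent rebases of the allocated ID range. On top of that, our DM target is three TiDB Servers behind a load balancer, so the TiDB instances also compete with each other for ID ranges, which makes the write conflicts worse.
Solving the Problem
The blunt fix is to require the business side to stop specifying IDs, but that is costly, so we tried removing the auto-increment attribute from this table's primary key column instead. First set the system variable: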
SET SESSION tidb_allow_remove_auto_inc = 1;
Then run the ALTER TABLE statement:
ALTER TABLE warehouse_db_new.warehouse_sku
MODIFY sku_id bigint(20) NOT NULL COMMENT 'SKU ID';
Once the statement had run, write conflicts dropped markedly. Problem solved.
The End
The Lunar New Year is only a few days away. Happy Spring Festival in advance, everyone~