Canal Source Code - EventParser

Git repository

https://github.com/alibaba/canal

Architecture

(Image source: https://www.jianshu.com/p/0ccbd1a1a5ec)

Code

  • Demo

com/alibaba/otter/canal/parse/MysqlBinlogDumpPerformanceTest.java

  • start

// com/alibaba/otter/canal/parse/inbound/AbstractEventParser.java
public void start() {
        super.start();
        MDC.put("destination", destination);
        // Configure the transaction buffer
        // Initialize the buffer queue
        transactionBuffer.setBufferSize(transactionSize);// set the buffer size
        transactionBuffer.start();
        // Build the binlog parser
        binlogParser = buildParser();// initialize the BinlogParser
        binlogParser.start();
        // Start the worker thread
        parseThread = new Thread(new Runnable() {

            public void run() {
                MDC.put("destination", String.valueOf(destination));
                ErosaConnection erosaConnection = null;
                while (running) {
                    try {
                        // Start replication
                        // 1. Build the Erosa connection; for MySQL this is the connection to MySQL
                        erosaConnection = buildErosaConnection();

                        // 2. Start a heartbeat thread that periodically sends data to the EventSink,
                        // so that downstream still receives data when the upstream MySQL produces no binlog and can tell whether the Canal server itself is working.
                        startHeartBeat(erosaConnection);

                        // 3. Preparation before the dump, including:
                        // checking binlog_format and binlog_row_image
                        // show variables like 'binlog_format', see: https://www.cnblogs.com/xingyunfashi/p/8431780.html
                        // show variables like 'binlog_row_image'  ->  see: https://blog.csdn.net/actiontech/article/details/81701362
                        preDump(erosaConnection);

                        // Connect to the source MySQL
                        erosaConnection.connect();

                        // show variables like 'server_id';
                        long queryServerId = erosaConnection.queryServerId();
                        if (queryServerId != 0) {
                            serverId = queryServerId;
                        }

                        // 4. Find the start position
                        long start = System.currentTimeMillis();
                        logger.warn("---> begin to find start position, it will be long time for reset or first position");
                        EntryPosition position = findStartPosition(erosaConnection);
                        final EntryPosition startPosition = position;
                        if (startPosition == null) {
                            throw new PositionNotFoundException("can't find start position for " + destination);
                        }

                        ........

                        // Reconnect: finding the position may leave state on the connection, so drop it and rebuild
                        erosaConnection.reconnect();

                        final SinkFunction sinkHandler = new SinkFunction<EVENT>() {

                            private LogPosition lastPosition;

                            public boolean sink(EVENT event) {
                                try {
                                    CanalEntry.Entry entry = parseAndProfilingIfNecessary(event, false);

                                    if (!running) {
                                        return false;
                                    }

                                    if (entry != null) {
                                        exception = null; // normal data is flowing, so clear the exception
                                        transactionBuffer.add(entry);
                                        // record the corresponding position
                                        this.lastPosition = buildLastPosition(entry);
                                        // record the time of the last entry with data
                                        lastEntryTime = System.currentTimeMillis();
                                    }
                                    return running;
                                } catch (TableIdNotFoundException e) {
                                    throw e;
                                } catch (Throwable e) {
                                    if (e.getCause() instanceof TableIdNotFoundException) {
                                        throw (TableIdNotFoundException) e.getCause();
                                    }
                                    // record the position where the error occurred
                                    processSinkError(e,
                                        this.lastPosition,
                                        startPosition.getJournalName(),
                                        startPosition.getPosition());
                                    throw new CanalParseException(e); // rethrow so the upper layer handles it uniformly
                                }
                            }
                        };

                        // 5. Start dumping data
                        if (parallel) {
                            .......
                            // build stage processor
                            multiStageCoprocessor = buildMultiStageCoprocessor();
                            erosaConnection.dump(startPosition.getJournalName(),
                                        startPosition.getPosition(),
                                        multiStageCoprocessor);
                       } else {
                           erosaConnection.dump(startPosition.getJournalName(),
                                        startPosition.getPosition(),
                                        sinkHandler);
                       }
                       
    }
  • heartbeat

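Heartbeat: ensures that downstream still receives data when the upstream MySQL produces no binlog, so it can tell whether the Canal server itself is working properly.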

 

// com/alibaba/otter/canal/parse/inbound/AbstractEventParser.java
    protected TimerTask buildHeartBeatTimeTask(ErosaConnection connection) {
        return new TimerTask() {

            public void run() {
                try {
                    if (exception == null || lastEntryTime > 0) {
                        // No exception has occurred, or at least one normal entry has already arrived
                        long now = System.currentTimeMillis();
                        long inteval = (now - lastEntryTime) / 1000;
                        if (inteval >= detectingIntervalInSeconds) {
                            Header.Builder headerBuilder = Header.newBuilder();
                            headerBuilder.setExecuteTime(now);
                            Entry.Builder entryBuilder = Entry.newBuilder();
                            entryBuilder.setHeader(headerBuilder.build());
                            entryBuilder.setEntryType(EntryType.HEARTBEAT);
                            Entry entry = entryBuilder.build();
                            // Submit to the sink; for now it is not passed on to the store and is ignored inside the sink
                            consumeTheEventAndProfilingIfNecessary(Arrays.asList(entry));
                        }
                    }

                } catch (Throwable e) {
                    logger.warn("heartBeat run failed ", e);
                }
            }

        };
    }

    protected boolean consumeTheEventAndProfilingIfNecessary(List<CanalEntry.Entry> entrys) throws CanalSinkException,
                                                                                       InterruptedException {
        ......
        boolean result = eventSink.sink(entrys, (runningInfo == null) ? null : runningInfo.getAddress(), destination);
        ......
        return result;
    }
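
The idle check inside the timer task can be isolated as a small sketch. HeartbeatSketch, its fields, and the Consumer<String> sink are hypothetical stand-ins rather than Canal classes; only the condition (no error, or at least one entry seen, and idle for at least detectingIntervalInSeconds) mirrors the excerpt above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for the idle check in buildHeartBeatTimeTask; not a Canal class.
final class HeartbeatSketch {
    private final long detectingIntervalInSeconds;
    private final Consumer<String> sink;   // stands in for the EventSink
    private long lastEntryTime;            // millis of the last real entry, 0 if none yet
    private Throwable exception;           // last parse error, null if none

    HeartbeatSketch(long detectingIntervalInSeconds, Consumer<String> sink) {
        this.detectingIntervalInSeconds = detectingIntervalInSeconds;
        this.sink = sink;
    }

    // A real entry arrived: clear the error and remember when.
    void onEntry(long nowMillis) {
        exception = null;
        lastEntryTime = nowMillis;
    }

    void onError(Throwable t) {
        exception = t;
    }

    // One timer tick; returns true when a heartbeat was emitted.
    boolean tick(long nowMillis) {
        if (exception == null || lastEntryTime > 0) {
            long idleSeconds = (nowMillis - lastEntryTime) / 1000;
            if (idleSeconds >= detectingIntervalInSeconds) {
                sink.accept("HEARTBEAT@" + nowMillis);
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> received = new ArrayList<>();
        HeartbeatSketch hb = new HeartbeatSketch(3, received::add);
        hb.onEntry(10_000L);
        System.out.println(hb.tick(11_000L)); // idle for 1 s, below the 3 s interval
        System.out.println(hb.tick(13_000L)); // idle for 3 s, a heartbeat is emitted
        System.out.println(received);
    }
}
```

Passing the current time in as a parameter, instead of calling System.currentTimeMillis() inside tick, is a choice made for this sketch so the behaviour can be checked deterministically.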
  • findStartPosition

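There are four ways to obtain the start position, in priority order:

  • Read it from logPositionManager
  • Use the binlogName + offset specified when the EventParser was constructed
  • Use the timestamp specified when the EventParser was constructed
  • If none of these is set, use the latest MySQL binlog position (show master status)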
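
The priority order above can be written as a small decision function. This is a minimal sketch: StartPositionSketch, Source, and Position are hypothetical names made up for illustration, and the real findStartPositionInternal additionally distinguishes master and standby addresses.

```java
// Hypothetical sketch of the lookup order; none of these names are Canal APIs.
final class StartPositionSketch {
    enum Source { PERSISTED, CONFIGURED_FILE_OFFSET, CONFIGURED_TIMESTAMP, MASTER_STATUS }

    // Simplified position: binlog file name + offset, and/or a timestamp.
    static final class Position {
        final String journalName;
        final Long offset;
        final Long timestamp;

        Position(String journalName, Long offset, Long timestamp) {
            this.journalName = journalName;
            this.offset = offset;
            this.timestamp = timestamp;
        }
    }

    // persisted: what the position manager returned (null if nothing was stored).
    // configured: what was set when the parser was built (null if nothing was set).
    static Source choose(Position persisted, Position configured) {
        if (persisted != null) {
            return Source.PERSISTED;
        }
        if (configured != null) {
            if (configured.journalName != null && !configured.journalName.isEmpty()) {
                return Source.CONFIGURED_FILE_OFFSET;
            }
            if (configured.timestamp != null && configured.timestamp > 0L) {
                return Source.CONFIGURED_TIMESTAMP;
            }
        }
        return Source.MASTER_STATUS; // fall back to "show master status"
    }

    public static void main(String[] args) {
        Position byTime = new Position(null, null, 1_500_000_000_000L);
        System.out.println(choose(null, byTime)); // CONFIGURED_TIMESTAMP
        System.out.println(choose(null, null));   // MASTER_STATUS
    }
}
```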

 

// com/alibaba/otter/canal/parse/inbound/mysql/MysqlEventParser.java
protected EntryPosition findStartPositionInternal(ErosaConnection connection) {
        MysqlConnection mysqlConnection = (MysqlConnection) connection;
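        // Fetch the last persisted position from logPositionManager; after a canal server restart or a switch to another server, this lets it resume from where consumption stopped
        LogPosition logPosition = logPositionManager.getLatestIndexBy(destination);
        if (logPosition == null) { // no previously persisted position
            EntryPosition entryPosition = null;
            // If a position was specified when the EventParser was constructed, use it first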
            if (masterInfo != null && mysqlConnection.getConnector().getAddress().equals(masterInfo.getAddress())) {
                entryPosition = masterPosition;
            } else if (standbyInfo != null
                       && mysqlConnection.getConnector().getAddress().equals(standbyInfo.getAddress())) {
                entryPosition = standbyPosition;
            }

            if (entryPosition == null) {
                // Get the latest MySQL binlog position via show master status, e.g.:
                // EntryPosition[journalName=binlog.000005, position=20973]
                entryPosition = findEndPositionWithMasterIdAndTimestamp(mysqlConnection); // by default, consume from the current end position
            }

            // Check whether to subscribe by time
            if (StringUtils.isEmpty(entryPosition.getJournalName())) {
                // No binlogName was given but a timestamp was, so try to look up the position by timestamp
                if (entryPosition.getTimestamp() != null && entryPosition.getTimestamp() > 0L) {
                    logger.warn("prepare to find start position {}:{}:{}",
                        new Object[] { "", "", entryPosition.getTimestamp() });
                    return findByStartTimeStamp(mysqlConnection, entryPosition.getTimestamp());
                } else {
                    // Neither was given: by default, consume from the latest MySQL binlog position
                    logger.warn("prepare to find start position just show master status");
                    return findEndPositionWithMasterIdAndTimestamp(mysqlConnection);
                }
            }
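  • Position class EntryPosition

The main fields:
journalName: binlog file name
position: offset within the file
timestamp: the timestamp corresponding to the position.
The binlog file name + offset uniquely and exactly identify a position. The timestamp was added by canal itself, to display the time of a position and to rewind the binlog by time.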

public EntryPosition(String journalName, Long position){
        this(journalName, position, null);
    }
  • Get the position of the last binlog event on the current server

[email protected] [test] > show master status;
+---------------+----------+--------------+------------------+-------------------+
| File          | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+---------------+----------+--------------+------------------+-------------------+
| binlog.000005 |    20973 |              |                  |                   |
+---------------+----------+--------------+------------------+-------------------+
  • Get the position of the first binlog event on the current server

[email protected] [test] > show binlog events limit 1;
+---------------+-----+-------------+-----------+-------------+---------------------------------------+
| Log_name      | Pos | Event_type  | Server_id | End_log_pos | Info                                  |
+---------------+-----+-------------+-----------+-------------+---------------------------------------+
| binlog.000001 |   4 | Format_desc |         1 |         123 | Server ver: 5.7.20-log, Binlog ver: 4 |
+---------------+-----+-------------+-----------+-------------+---------------------------------------+
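  • Find a binlog position by timestamp

The idea is simple:

  • 1. List all binlog files currently on MySQL
  • 2. Find which binlog file contains the timestamp
  • 3. Find the offset within that file that corresponds to the timestamp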
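
The three steps can be sketched over in-memory data. Everything below is an assumption made for illustration: TimestampSearchSketch, BinlogFile, and Event are hypothetical types, the files are assumed to be listed oldest first, and "the last event at or before the timestamp" is just one plausible matching rule. Canal's own methods read real binlog events over the connection.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory model of the search; not Canal code.
final class TimestampSearchSketch {
    static final class Event {
        final long offset;
        final long timestampMillis;

        Event(long offset, long timestampMillis) {
            this.offset = offset;
            this.timestampMillis = timestampMillis;
        }
    }

    static final class BinlogFile {
        final String name;
        final List<Event> events; // in file order

        BinlogFile(String name, List<Event> events) {
            this.name = name;
            this.events = events;
        }
    }

    // Returns "file:offset" of the last event at or before target, or null if every event is newer.
    static String find(List<BinlogFile> filesOldestFirst, long targetMillis) {
        // Step 2: walk from the newest file back to the first one that starts at or before the target.
        for (int i = filesOldestFirst.size() - 1; i >= 0; i--) {
            BinlogFile file = filesOldestFirst.get(i);
            if (file.events.isEmpty() || file.events.get(0).timestampMillis > targetMillis) {
                continue; // the whole file is newer than the target
            }
            // Step 3: scan forward and keep the last offset that is not newer than the target.
            long offset = file.events.get(0).offset;
            for (Event e : file.events) {
                if (e.timestampMillis > targetMillis) {
                    break;
                }
                offset = e.offset;
            }
            return file.name + ":" + offset;
        }
        return null;
    }

    public static void main(String[] args) {
        // Step 1: the list of binlog files (hard-coded here).
        List<BinlogFile> files = Arrays.asList(
            new BinlogFile("binlog.000001", Arrays.asList(new Event(4, 1000), new Event(200, 2000))),
            new BinlogFile("binlog.000002", Arrays.asList(new Event(4, 3000), new Event(150, 4000))));
        System.out.println(find(files, 2500)); // binlog.000001:200
    }
}
```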

// com/alibaba/otter/canal/parse/inbound/mysql/MysqlEventParser.java
private EntryPosition findByStartTimeStamp(MysqlConnection mysqlConnection, Long startTimestamp)

// com/alibaba/otter/canal/parse/inbound/mysql/MysqlEventParser.java
private EntryPosition findAsPerTimestampInSpecificLogFile(MysqlConnection mysqlConnection,
                                                              final Long startTimestamp,
                                                              final EntryPosition endPosition,
                                                              final String searchBinlogFile,
                                                              final Boolean justForPositionTimestamp) 
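  • Failure to locate the binlog position

 // Locating the binlog position can fail for two reasons:
 // 1. The binlog position has been purged
 // 2. With MySQL in VIP mode, a master/standby switch has happened; check whether the serverId changed, and in that case search for a suitable binlog position by timestamp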

// com/alibaba/otter/canal/parse/inbound/mysql/MysqlEventParser.java
if (logPosition.getIdentity().getSourceAddress().equals(mysqlConnection.getConnector().getAddress())) {
                if (dumpErrorCountThreshold >= 0 && dumpErrorCount > dumpErrorCountThreshold) {
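                    // Locating the binlog position can fail for two reasons:
                    // 1. The binlog position has been purged
                    // 2. With MySQL in VIP mode, a master/standby switch has happened; check whether the serverId changed,
                    // and in that case search for a suitable binlog position by timestamp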
                    boolean case2 = (standbyInfo == null || standbyInfo.getAddress() == null)
                                    && logPosition.getPostion().getServerId() != null
                                    && !logPosition.getPostion().getServerId().equals(findServerId(mysqlConnection));
                    if (case2) {
                        long timestamp = logPosition.getPostion().getTimestamp();
                        long newStartTimestamp = timestamp - fallbackIntervalInSeconds * 1000;
                        logger.warn("prepare to find start position by last position {}:{}:{}", new Object[] { "", "",
                                logPosition.getPostion().getTimestamp() });
                        EntryPosition findPosition = findByStartTimeStamp(mysqlConnection, newStartTimestamp);
                        // reset the error counter
                        dumpErrorCount = 0;
                        return findPosition;
                    }

                    Long timestamp = logPosition.getPostion().getTimestamp();
                    if (isRdsOssMode() && (timestamp != null && timestamp > 0)) {
                        // If the binlog position no longer exists and the timestamp is non-empty, return null so the OSS binlog handling takes over
                        return null;
                    }
                }
                // all other cases
                logger.warn("prepare to find start position just last position\n {}",
                    JsonUtils.marshalToString(logPosition));
                return logPosition.getPostion();
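
The branching in this excerpt can be reduced to a pure function, which makes the three outcomes easier to see. This is a sketch under assumptions: PositionFallbackSketch and Action are hypothetical names, and the parameters are flattened versions of the fields and calls used above (for example, hasStandby stands for "standbyInfo and its address are non-null").

```java
// Hypothetical reduction of the fallback branch; not Canal code.
final class PositionFallbackSketch {
    enum Action { RESUME_FROM_PERSISTED, SEARCH_BY_TIMESTAMP, DEFER_TO_OSS }

    static Action decide(int dumpErrorCount, int dumpErrorCountThreshold, boolean hasStandby,
                         Long persistedServerId, long currentServerId,
                         boolean rdsOssMode, Long persistedTimestamp) {
        if (dumpErrorCountThreshold >= 0 && dumpErrorCount > dumpErrorCountThreshold) {
            // Case 2: no standby configured and the server id behind the address changed.
            boolean serverChanged = !hasStandby
                    && persistedServerId != null
                    && persistedServerId.longValue() != currentServerId;
            if (serverChanged) {
                return Action.SEARCH_BY_TIMESTAMP;
            }
            if (rdsOssMode && persistedTimestamp != null && persistedTimestamp > 0) {
                return Action.DEFER_TO_OSS;
            }
        }
        return Action.RESUME_FROM_PERSISTED;
    }

    // The timestamp search starts a little earlier than the persisted timestamp.
    static long fallbackStartMillis(long persistedTimestampMillis, long fallbackIntervalInSeconds) {
        return persistedTimestampMillis - fallbackIntervalInSeconds * 1000L;
    }

    public static void main(String[] args) {
        System.out.println(decide(3, 2, false, 1L, 2L, false, 1000L)); // SEARCH_BY_TIMESTAMP
        System.out.println(fallbackStartMillis(100_000L, 60));         // 40000
    }
}
```

Stepping back by fallbackIntervalInSeconds trades some duplicate events for a lower chance of skipping events around the switch.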

dump

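  • dump has three parts: fetcher + decoder + sinker
  • fetcher: receives binlog data from MySQL one event at a time
  • decoder: converts the received binlog data into a LogEvent: com/taobao/tddl/dbsync/binlog/LogEvent.java
  • sinker: puts each LogEvent into the transactionBuffer; the transactionBuffer waits until all LogEvents of a transaction have arrived before calling flush to hand the data to the EventSink. A transaction consists of start, the actual SQL statements, and end. The fetcher and decoder work per record, but the sinker works per transaction.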
// com/alibaba/otter/canal/parse/inbound/mysql/MysqlConnection.java    
public void dump(String binlogfilename, Long binlogPosition, SinkFunction func) throws IOException {
        updateSettings();
        loadBinlogChecksum();
        sendRegisterSlave();
        // Send COM_BINLOG_DUMP to MySQL with the starting binlogName and offset. See: https://dev.mysql.com/doc/internals/en/com-binlog-dump.html
        sendBinlogDump(binlogfilename, binlogPosition);
        // The fetcher does not pull data from the binlog file itself; MySQL's dump thread keeps pushing binlog events.
        // This is also how MySQL master/standby replication works: the standby's IO thread asks the master for the binlog (the COM_BINLOG_DUMP command),
        // and the master, on receiving COM_BINLOG_DUMP, uses a dedicated thread (the dump thread) to keep sending binlog events to the standby's IO thread.
        // See: http://www.orczhou.com/index.php/2011/11/how-mysql-send-the-binary-log/
        DirectLogFetcher fetcher = new DirectLogFetcher(connector.getReceiveBufferSize());
        fetcher.start(connector.getChannel());
        LogDecoder decoder = new LogDecoder(LogEvent.UNKNOWN_EVENT, LogEvent.ENUM_END_EVENT);
        LogContext context = new LogContext();
        context.setFormatDescription(new FormatDescriptionLogEvent(4, binlogChecksum));
        while (fetcher.fetch()) {
            accumulateReceivedBytes(fetcher.limit());
            LogEvent event = null;
            event = decoder.decode(fetcher, context);

            if (event == null) {
                throw new CanalParseException("parse failed");
            }

            if (!func.sink(event)) {
                break;
            }

            if (event.getSemival() == 1) {
                sendSemiAck(context.getLogPosition().getFileName(), context.getLogPosition().getPosition());
            }
        }
    }
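
Stripped of the MySQL protocol details, the loop in dump has a simple shape: fetch a raw record, decode it, hand it to the sink, and stop as soon as the sink returns false. The sketch below shows only that shape; DumpLoopSketch and its use of Iterator, Function, and Predicate are stand-ins chosen for illustration, not Canal's DirectLogFetcher, LogDecoder, or SinkFunction.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical shape of the fetch -> decode -> sink loop; not Canal code.
final class DumpLoopSketch {
    // Returns how many events were handed to the sink.
    static <R, E> int dump(Iterator<R> fetcher, Function<R, E> decoder, Predicate<E> sink) {
        int delivered = 0;
        while (fetcher.hasNext()) {
            E event = decoder.apply(fetcher.next());
            if (event == null) {
                throw new IllegalStateException("parse failed");
            }
            delivered++;
            if (!sink.test(event)) {
                break; // the sink asked to stop, e.g. because the parser is shutting down
            }
        }
        return delivered;
    }

    public static void main(String[] args) {
        Iterator<String> raw = Arrays.asList("1", "2", "3", "4").iterator();
        Function<String, Integer> decoder = Integer::valueOf;
        Predicate<Integer> sink = e -> e < 3; // pretend the sink stops once it sees 3
        System.out.println(dump(raw, decoder, sink)); // 3
    }
}
```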

decoder

com/taobao/tddl/dbsync/binlog/LogDecoder.java
/**
     * Decoding an event from binary-log buffer.
     *
     * @return <code>UnknownLogEvent</code> if event type is unknown or skipped,
     * <code>null</code> if buffer is not including a full event.
     */
    public LogEvent decode(LogBuffer buffer, LogContext context) throws IOException {
        final int limit = buffer.limit();

        if (limit >= FormatDescriptionLogEvent.LOG_EVENT_HEADER_LEN) {
            LogHeader header = new LogHeader(buffer, context.getFormatDescription());

            final int len = header.getEventLen();
            if (limit >= len) {
                LogEvent event;

                /* Checking binary-log's header */
                if (handleSet.get(header.getType())) {
                    buffer.limit(len);
                    try {
                        /* Decoding binary-log to event */
                        event = decode(buffer, header, context);
                    } catch (IOException e) {
                        if (logger.isWarnEnabled()) {
                            logger.warn("Decoding " + LogEvent.getTypeName(header.getType()) + " failed from: "
                                        + context.getLogPosition(), e);
                        }
                        throw e;
                    } finally {
                        buffer.limit(limit); /* Restore limit */
                    }
                } else {
                    /* Ignore unsupported binary-log. */
                    event = new UnknownLogEvent(header);
                }

                if (event != null) {
                    // set logFileName
                    event.getHeader().setLogFileName(context.getLogPosition().getFileName());
                    event.setSemival(buffer.semival);
                }

                /* consume this binary-log. */
                buffer.consume(len);
                return event;
            }
        }

        /* Rewind buffer's position to 0. */
        buffer.rewind();
        return null;
    }
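
The key idea in decode is the two length checks: the buffer must hold a full header before the event length can be read, and it must hold the whole event before anything is consumed; otherwise the method returns null and leaves the buffer for the next read. A minimal sketch of that pattern follows, using java.nio.ByteBuffer and a made-up frame layout (a 4-byte total length followed by a UTF-8 payload). The real binlog event header is laid out differently, and FrameDecoderSketch is a hypothetical name.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical length-prefixed framing; the real binlog header layout differs.
final class FrameDecoderSketch {
    static final int HEADER_LEN = 4; // assumed: a 4-byte total frame length

    // Returns the payload of the next complete frame, or null if the buffer
    // does not hold a full frame yet (the position is left untouched in that case).
    static String decode(ByteBuffer buffer) {
        if (buffer.remaining() >= HEADER_LEN) {
            int len = buffer.getInt(buffer.position()); // absolute read, does not advance
            if (len < HEADER_LEN) {
                throw new IllegalArgumentException("bad frame length: " + len);
            }
            if (buffer.remaining() >= len) {
                byte[] payload = new byte[len - HEADER_LEN];
                buffer.position(buffer.position() + HEADER_LEN);
                buffer.get(payload); // consume this frame
                return new String(payload, StandardCharsets.UTF_8);
            }
        }
        return null;
    }

    // Helper that builds one frame, ready for reading.
    static ByteBuffer frame(String payload) {
        byte[] bytes = payload.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(HEADER_LEN + bytes.length);
        buf.putInt(HEADER_LEN + bytes.length);
        buf.put(bytes);
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        System.out.println(decode(frame("abc"))); // abc
    }
}
```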

sinker

// com/alibaba/otter/canal/parse/inbound/AbstractEventParser.java
final SinkFunction sinkHandler = new SinkFunction<EVENT>() {

                            private LogPosition lastPosition;

                            public boolean sink(EVENT event) {
                                try {
                                    CanalEntry.Entry entry = parseAndProfilingIfNecessary(event, false);

                                    if (!running) {
                                        return false;
                                    }

                                    if (entry != null) {
                                        exception = null; // normal data is flowing, so clear the exception
                                        transactionBuffer.add(entry);
                                        // record the corresponding position
                                        this.lastPosition = buildLastPosition(entry);
                                        // record the time of the last entry with data
                                        lastEntryTime = System.currentTimeMillis();
                                    }
                                    return running;
                                } catch (TableIdNotFoundException e) {
                                    throw e;
                                } catch (Throwable e) {
                                    if (e.getCause() instanceof TableIdNotFoundException) {
                                        throw (TableIdNotFoundException) e.getCause();
                                    }
                                    // record the position where the error occurred
                                    processSinkError(e,
                                        this.lastPosition,
                                        startPosition.getJournalName(),
                                        startPosition.getPosition());
                                    throw new CanalParseException(e); // rethrow so the upper layer handles it uniformly
                                }
                            }

                        };
// com/alibaba/otter/canal/parse/inbound/AbstractEventParser.java
transactionBuffer = new EventTransactionBuffer(new TransactionFlushCallback() {
            // A transaction consists of start, the actual SQL statements, and end
            // The fetcher and decoder work per record,
            // but the sinker works per transaction
            public void flush(List<CanalEntry.Entry> transaction) throws InterruptedException {
                // calls eventSink.sink internally
                boolean successed = consumeTheEventAndProfilingIfNecessary(transaction);
                if (!running) {
                    return;
                }

                if (!successed) {
                    throw new CanalParseException("consume failed!");
                }

                LogPosition position = buildLastTransactionPosition(transaction);
                if (position != null) { // position may be null
                    logPositionManager.persistLogPosition(AbstractEventParser.this.destination, position);
                }
            }
        });
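
To make "per record in, per transaction out" concrete, here is a minimal sketch of a buffer that collects entries and invokes a flush callback only when the entry that ends a transaction arrives. TransactionBufferSketch, Type, and Entry are hypothetical names; the real EventTransactionBuffer is bounded by the configured buffer size and handles more cases than this sketch models.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical, unbounded stand-in for the transaction buffer; not Canal code.
final class TransactionBufferSketch {
    enum Type { BEGIN, ROW, END }

    static final class Entry {
        final Type type;
        final String payload;

        Entry(Type type, String payload) {
            this.type = type;
            this.payload = payload;
        }
    }

    private final List<Entry> pending = new ArrayList<>();
    private final Consumer<List<Entry>> flushCallback; // stands in for TransactionFlushCallback

    TransactionBufferSketch(Consumer<List<Entry>> flushCallback) {
        this.flushCallback = flushCallback;
    }

    // Entries arrive one at a time; the callback sees them one transaction at a time.
    void add(Entry entry) {
        pending.add(entry);
        if (entry.type == Type.END) {
            flushCallback.accept(new ArrayList<>(pending)); // hand over a copy
            pending.clear();
        }
    }

    public static void main(String[] args) {
        TransactionBufferSketch buffer =
            new TransactionBufferSketch(tx -> System.out.println("flushed " + tx.size() + " entries"));
        buffer.add(new Entry(Type.BEGIN, "begin"));
        buffer.add(new Entry(Type.ROW, "insert into t values (1)"));
        buffer.add(new Entry(Type.END, "commit")); // prints "flushed 3 entries"
    }
}
```

Persisting the position only inside the flush callback, as the excerpt above does, means a restart resumes at a transaction boundary and not in the middle of a transaction.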

Master/standby switch (Failover)

Appendix 1: How MySQL transmits the binary log

Reposted from: http://www.orczhou.com/index.php/2011/11/how-mysql-send-the-binary-log/

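"How does MySQL transmit the binary log: does the master push, or does the standby pull? How real-time is MySQL log transmission?"

In a MySQL replication setup, the standby first configures replication with CHANGE MASTER TO and then starts replicating with the start slave command. In more detail, the standby's IO thread sends the master a request to read the binlog (the COM_BINLOG_DUMP command); on receiving COM_BINLOG_DUMP, the master uses a dedicated thread (the dump thread) to keep sending binlog events to the standby's IO thread.

As soon as a new log entry is produced on the master, a broadcast is sent; on receiving the broadcast, the dump thread reads the binary log and transmits it over the network to the standby. So this is a process in which the master continuously pushes to the standby.

Once a new log entry is produced, it takes only one broadcast plus one network transfer for it to be sent to the standby, almost immediately (<1 ms). If the network between master and standby is good (for example RTT < 1 ms), the log reaches the standby with a delay of less than 2 ms. So in general, depending on RTT, the standby is very close to real time. Reference: 1. MySQL Replication Manual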
