MySQL Seconds_Behind_Master

Seconds_Behind_Master:

First, here is what the official documentation says (refman-5.7-en.a4.pdf):
This field is an indication of how "late" the slave is:
    When the slave is actively processing updates, this field shows the difference between the current timestamp on the slave and the original timestamp logged on the master for the event currently being processed on the slave.
    When no event is currently being processed on the slave, this value is 0.
    In essence, this field measures the time difference in seconds between the slave SQL thread and the slave I/O thread. 
        If the network connection between master and slave is fast, the slave I/O thread is very close to the master, so this field is a good approximation of how late the slave SQL thread is compared to the master. 
        If the network is slow, this is not a good approximation; the slave SQL thread may quite often be caught up with the slow-reading slave I/O thread, so Seconds_Behind_Master often shows a value of 0, even if the I/O thread is late compared to the master. 
        In other words, this column is useful only for fast networks.

This time difference computation works even if the master and slave do not have identical clock times, provided that the difference, computed when the slave I/O thread starts, remains constant from then on. 
Any changes—including NTP updates—can lead to clock skews that can make calculation of Seconds_Behind_Master less reliable.

In MySQL 5.7, this field is NULL (undefined or unknown) if the slave SQL thread is not running, or if the SQL thread has consumed all of the relay log and the slave I/O thread is not running. 
(In older versions of MySQL, this field was NULL if the slave SQL thread or the slave I/O thread was not running or was not connected to the master.) 
If the I/O thread is running but the relay log is exhausted, Seconds_Behind_Master is set to 0.

The value of Seconds_Behind_Master is based on the timestamps stored in events, which are preserved through replication. 
This means that if a master M1 is itself a slave of M0, any event from M1's binary log that originates from M0's binary log has M0's timestamp for that event. This enables MySQL to replicate TIMESTAMP successfully. 
However, the problem for Seconds_Behind_Master is that if M1 also receives direct updates from clients, the Seconds_Behind_Master value randomly fluctuates because sometimes the last event from M1 originates from M0 and sometimes is the result of a direct update on M1.

When using a multithreaded slave, you should keep in mind that this value is based on Exec_Master_Log_Pos, and so may not reflect the position of the most recently committed transaction.

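In short:
        1. This field shows the "difference" between event timestamps on the master and the slave; if the slave is not currently applying an event, the value is 0.
        2. In essence, it measures the time difference between the SQL thread and the I/O thread on the slave.
        3. The field is therefore useful mainly when the network is fast.
                On a fast network the I/O thread can be treated as having almost no delay, so the value approximates the delay of the SQL thread.
                On a slow network the SQL thread is often held back by the I/O thread (it has already applied everything the I/O thread has fetched), so Seconds_Behind_Master frequently reads 0 even though the I/O thread itself is behind the master.
        4. The master and slave clocks need not match: the calculation accounts for the clock difference, which is computed once when the I/O thread starts and is assumed to stay constant. Anything that changes either clock afterwards (including NTP updates) skews Seconds_Behind_Master.
        5. In MySQL 5.7, Seconds_Behind_Master is NULL if the SQL thread is not running, or if the SQL thread has applied all of the relay log and the I/O thread is not running.
            In older versions it was NULL if the SQL thread or the I/O thread was not running, or if the I/O thread was not connected to the master.
        6. If the I/O thread is running but idle and the SQL thread has applied all of the relay log, Seconds_Behind_Master = 0.
        7. With cascaded replication, events relayed through an intermediate node keep the timestamps of the original master. If that node also receives direct writes from clients, the value seen by its slaves can fluctuate widely, because the most recent event sometimes comes from upstream and sometimes from a direct write.
        8. With a multithreaded slave, Seconds_Behind_Master is based on Exec_Master_Log_Pos, so it describes a single point in the binary log and may not reflect the most recently committed transaction.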
 

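Key points in the flow of data between a MySQL master and slave:
        t1: the master starts executing the event
        t2: the master finishes executing the event and sends it to the slave
        t3: the slave finishes receiving the event and starts executing it
        t4: the slave finishes executing the event
    
t1-->t2: worker thread on the master
t2-->t3: I/O thread on the slave
t3-->t4: SQL thread on the slave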

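The way we would normally expect replication delay to be computed:
        t4 (slave) - t2 (master) - m (where m is the system clock difference between master and slave)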
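
As a small worked sketch of this formula, the function below takes the two timestamps and the clock difference as plain Unix seconds. The function name and the sign convention for m (slave clock minus master clock) are assumptions made for illustration; the text does not fix a sign.

```cpp
#include <cstdint>

// Hypothetical helper illustrating delay = t4 - t2 - m.
//   t4_slave:     when the slave finished applying the event (slave clock)
//   t2_master:    when the master finished executing the event (master clock)
//   clock_offset: m, assumed here to be "slave clock minus master clock"
std::int64_t intuitive_delay(std::int64_t t4_slave,
                             std::int64_t t2_master,
                             std::int64_t clock_offset) {
    return t4_slave - t2_master - clock_offset;
}
```

For example, if the master finished at 1000 on its own clock, the slave clock runs 3 seconds ahead, and the slave finished at 1008 on its clock, the delay under this convention is 5 seconds.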

The MySQL code that computes Seconds_Behind_Master, however, is as follows:

if ((mi->get_master_log_pos() == mi->rli->get_group_master_log_pos()) &&
    (!strcmp(mi->get_master_log_name(), mi->rli->get_group_master_log_name())))
{
  if (mi->slave_running == MYSQL_SLAVE_RUN_CONNECT)
    protocol->store(0LL);
  else
    protocol->store_null();
}
else
{
  long time_diff= ((long)(time(0) - mi->rli->last_master_timestamp) - mi->clock_diff_with_master);
  protocol->store((longlong)(mi->rli->last_master_timestamp ? max(0L, time_diff) : 0));
}

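The condition in the if branch says: if the master binlog position fetched by the I/O thread equals the master binlog position corresponding to what the SQL thread has executed from the relay log, the delay is taken to be 0 (or NULL when the I/O thread is not running and connected).
Normally the I/O thread is faster than the SQL thread. But if the network is so poor that the SQL thread has to wait for the I/O thread, the two positions can be equal, and the delay is wrongly reported as 0.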

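The else branch computes
        Seconds_Behind_Master = (time(0) - last_master_timestamp) - clock_diff_with_master
        
time(0) is the current time on the slave
last_master_timestamp is the time at which the master executed the event
clock_diff_with_master is the master/slave system clock difference described in point 4 of the summary above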

So in the ideal case (the clock difference has not changed since it was measured), the equation can be simplified to:
        Seconds_Behind_Master = (time(0) - last_master_timestamp) - clock_diff_with_master
                              = (slave_current_time - last_master_timestamp) - (slave_current_time - master_current_time)
                              = master_current_time - last_master_timestamp
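
To make the two branches easier to experiment with, here is a self-contained sketch that mirrors the excerpt above outside the server. The struct and field names (`ReplicaState`, `Sbm`, `sql_caught_up_with_io`, and so on) are hypothetical stand-ins, not MySQL's own types, and NULL is modelled with a flag.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical snapshot of the values the server code reads.
struct ReplicaState {
    bool sql_caught_up_with_io;           // I/O and SQL positions are equal
    bool io_thread_connected;             // slave_running == MYSQL_SLAVE_RUN_CONNECT
    std::int64_t now;                     // time(0) on the slave
    std::int64_t last_master_timestamp;   // 0 means "nothing being applied"
    std::int64_t clock_diff_with_master;  // slave clock minus master clock
};

// Result: either NULL or a number of seconds.
struct Sbm {
    bool is_null;
    std::int64_t value;
};

Sbm seconds_behind_master(const ReplicaState& s) {
    if (s.sql_caught_up_with_io) {
        // Positions match: 0 if the I/O thread is connected, otherwise NULL.
        if (s.io_thread_connected) return Sbm{false, 0};
        return Sbm{true, 0};
    }
    if (s.last_master_timestamp == 0) return Sbm{false, 0};
    std::int64_t diff =
        (s.now - s.last_master_timestamp) - s.clock_diff_with_master;
    // Negative differences are clamped to 0, as in the excerpt.
    return Sbm{false, std::max<std::int64_t>(0, diff)};
}
```

Note that the first branch never looks at the master's actual position, which is why a slave on a slow network can report 0 while its I/O thread is still far behind.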

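As for last_master_timestamp:
    rli->last_master_timestamp = ev->when.tv_sec + (time_t) ev->exec_time;
        Here ev->when.tv_sec is the time the event started and exec_time is how long the event took to execute on the master, but only Query_log_event and Load_log_event record exec_time (DML in row mode does not).
        In other words, last_master_timestamp equals the event header timestamp plus exec_time; if the event has no exec_time, last_master_timestamp is essentially just the event header timestamp.
        The other case is when the SQL thread is waiting for the I/O thread to fetch binlog: last_master_timestamp is then set to 0, so by the calculation above Seconds_Behind_Master is 0 and the slave is considered to have no delay.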
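
The assignment above can be illustrated with a simplified stand-in for an event. The `Event` struct and function name here are hypothetical; only the arithmetic comes from the line of server code quoted above.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for a binlog event.
struct Event {
    std::int64_t when_sec;   // event header timestamp (start time on the master)
    std::int64_t exec_time;  // execution time on the master; 0 when the
                             // event type does not record it
};

// Mirrors: last_master_timestamp = ev->when.tv_sec + ev->exec_time
std::int64_t last_master_timestamp_for(const Event& ev) {
    return ev.when_sec + ev.exec_time;
}
```

A statement that started at 1000 and ran for 30 seconds yields 1030, while an event of a type that carries no exec_time yields just its header timestamp, 1000.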
 

In parallel replication mode the rules for computing last_master_timestamp are slightly different; see https://yq.aliyun.com/articles/11032


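So to monitor replication delay, we can compare File and Position from show master status\G on the master with Master_Log_File and Read_Master_Log_Pos, and with Relay_Master_Log_File and Exec_Master_Log_Pos, from show slave status\G on the slave,
to see how far the slave's I/O thread and SQL thread have each progressed.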
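
One possible way to turn that comparison into code is sketched below. It assumes the three coordinates have already been read from the two status outputs. The type and function names are hypothetical, and the string comparison of file names only orders files correctly when they share a base name and a zero-padded numeric suffix of equal width.

```cpp
#include <string>
#include <tuple>

// One binlog coordinate as reported by the status commands.
struct BinlogPos {
    std::string file;        // e.g. "mysql-bin.000012"
    unsigned long long pos;  // byte offset within that file
};

// True if a is strictly earlier than b (see the file-name assumption above).
bool earlier(const BinlogPos& a, const BinlogPos& b) {
    return std::tie(a.file, a.pos) < std::tie(b.file, b.pos);
}

struct Progress {
    bool io_behind_master;  // I/O thread has not fetched everything yet
    bool sql_behind_io;     // SQL thread has not applied everything fetched
};

// master:   File / Position from the master
// fetched:  Master_Log_File / Read_Master_Log_Pos from the slave
// executed: Relay_Master_Log_File / Exec_Master_Log_Pos from the slave
Progress classify(const BinlogPos& master,
                  const BinlogPos& fetched,
                  const BinlogPos& executed) {
    return Progress{earlier(fetched, master), earlier(executed, fetched)};
}
```

Because the master and slave are sampled at slightly different moments, the result is an approximation; it is most useful for telling apart "the I/O thread is behind" from "the SQL thread is behind".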

Alternatively, we can use pt-heartbeat for real-time monitoring.
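
The idea behind a heartbeat-style monitor is, roughly, that a timestamp row is updated periodically on the master and the replicated copy of that row is read on the slave. The sketch below only illustrates that arithmetic; it is not pt-heartbeat's code, the function name is hypothetical, and it assumes the master and slave clocks are synchronized.

```cpp
// Lag as a heartbeat-style monitor would see it: the slave's current time
// minus the timestamp stored in the most recently replicated heartbeat row.
// Assumes master and slave clocks are synchronized (for example via NTP).
double heartbeat_lag(double slave_now, double replicated_heartbeat_ts) {
    return slave_now - replicated_heartbeat_ts;
}
```

Unlike the position comparison inside Seconds_Behind_Master, this value grows whenever the heartbeat row stops arriving, whether the I/O thread or the SQL thread is the one falling behind.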
