MySQL Seconds_Behind_Master

Seconds_Behind_Master:

First, here is what the official documentation says (refman-5.7-en.a4.pdf):
This field is an indication of how "late" the slave is:
    When the slave is actively processing updates, this field shows the difference between the current timestamp on the slave and the original timestamp logged on the master for the event currently being processed on the slave.
    When no event is currently being processed on the slave, this value is 0.
    In essence, this field measures the time difference in seconds between the slave SQL thread and the slave I/O thread. 
        If the network connection between master and slave is fast, the slave I/O thread is very close to the master, so this field is a good approximation of how late the slave SQL thread is compared to the master. 
        If the network is slow, this is not a good approximation; the slave SQL thread may quite often be caught up with the slow-reading slave I/O thread, so Seconds_Behind_Master often shows a value of 0, even if the I/O thread is late compared to the master.
        In other words, this column is useful only for fast networks.

This time difference computation works even if the master and slave do not have identical clock times, provided that the difference, computed when the slave I/O thread starts, remains constant from then on. 
Any changes—including NTP updates—can lead to clock skews that can make calculation of Seconds_Behind_Master less reliable.

In MySQL 5.7, this field is NULL (undefined or unknown) if the slave SQL thread is not running, or if the SQL thread has consumed all of the relay log and the slave I/O thread is not running.
(In older versions of MySQL, this field was NULL if the slave SQL thread or the slave I/O thread was not running or was not connected to the master.) 
If the I/O thread is running but the relay log is exhausted, Seconds_Behind_Master is set to 0.

The value of Seconds_Behind_Master is based on the timestamps stored in events, which are preserved through replication. 
This means that if a master M1 is itself a slave of M0, any event from M1's binary log that originates from M0's binary log has M0's timestamp for that event. This enables MySQL to replicate TIMESTAMP successfully. 
However, the problem for Seconds_Behind_Master is that if M1 also receives direct updates from clients, the Seconds_Behind_Master value randomly fluctuates because sometimes the last event from M1 originates from M0 and sometimes is the result of a direct update on M1.

When using a multithreaded slave, you should keep in mind that this value is based on Exec_Master_Log_Pos, and so may not reflect the position of the most recently committed transaction.

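In short:
        1. This field shows the "difference" between event timestamps on the master and on the slave; if the slave is not currently applying an event, the value is 0.
        2. In essence, it measures the time difference between the SQL thread and the I/O thread on the slave.
        3. It is therefore useful mainly when the network is fast.
                On a fast network the I/O thread can be treated as having almost no delay, so the value approximates the delay of the SQL thread.
                On a slow network the SQL thread often catches up with the slow I/O thread, so Seconds_Behind_Master often shows 0 even though the I/O thread is behind the master.
        4. The master and slave clocks do not have to match: the clock difference (computed when the I/O thread starts and assumed constant afterwards) is taken into account. Anything that changes either clock afterwards, including NTP updates, therefore skews the result.
        5. In MySQL 5.7, the value is NULL if the SQL thread is not running, or if the SQL thread has applied all of the relay log and the I/O thread is not running.
            In older versions, the value was NULL if the SQL thread or the I/O thread was not running, or if the I/O thread was not connected to the master.
        6. If the I/O thread is running but idle and the SQL thread has applied all of the relay log, Seconds_Behind_Master is 0.
        7. With cascading replication, events passing through an intermediate node keep the timestamps of the original master. If that node also takes direct writes from clients, Seconds_Behind_Master on its slaves fluctuates depending on which kind of event was applied last.
        8. With a multithreaded slave, Seconds_Behind_Master only reflects the event at Exec_Master_Log_Pos (a single point, not the overall picture), which may not be the most recently committed transaction.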
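
Point 7 can be made concrete with a small simulation. This is only a sketch: the function name lag_per_event, the event list, and every timestamp are made up for illustration, and the lag is computed with the simplified rule "slave clock minus event timestamp minus clock difference".

```python
def lag_per_event(events, slave_now, clock_diff=0):
    """Return (origin, lag) for each event as if it were applied at slave_now.

    events: list of (origin, timestamp) pairs in binlog order, where
    timestamp is the original timestamp carried by the event.
    """
    return [(origin, max(0, slave_now - ts - clock_diff)) for origin, ts in events]


# Hypothetical binlog of M1: events replicated from M0 keep M0's (older)
# timestamps, while direct client writes on M1 carry fresh ones.
m1_binlog = [("M0", 880), ("M1", 999), ("M0", 882), ("M1", 1000)]

for origin, lag in lag_per_event(m1_binlog, slave_now=1001):
    print(f"event from {origin}: reported lag {lag}s")
```

With these invented numbers the reported value jumps between roughly two minutes and a second or two from one event to the next, although the slave itself has not become faster or slower.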
 

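Key points in time as data flows from the MySQL master to the slave:
        t1: the master starts executing the event
        t2: the master finishes executing the event and sends it to the slave
        t3: the slave finishes receiving the event and starts executing it
        t4: the slave finishes executing the event

t1-->t2: worker thread on the master
t2-->t3: I/O thread on the slave
t3-->t4: SQL thread on the slave

The replication delay as we would normally compute it:
        t4 (slave) - t2 (master) - m (where m is the system clock difference between slave and master)

The MySQL code that computes Seconds_Behind_Master is as follows: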

if ((mi->get_master_log_pos() == mi->rli->get_group_master_log_pos()) &&
    (!strcmp(mi->get_master_log_name(), mi->rli->get_group_master_log_name())))
{
  if (mi->slave_running == MYSQL_SLAVE_RUN_CONNECT)
    protocol->store(0LL);
  else
    protocol->store_null();
}
else
{
  long time_diff= ((long)(time(0) - mi->rli->last_master_timestamp) - mi->clock_diff_with_master);
  protocol->store((longlong)(mi->rli->last_master_timestamp ? max(0L, time_diff) : 0));
}

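The if condition checks whether the position up to which the I/O thread has read the master's binlog equals the master binlog position corresponding to what the SQL thread has executed from the relay log. If the two are equal, the delay is reported as 0 while the I/O thread is connected, and as NULL otherwise.
Normally the I/O thread is ahead of the SQL thread. But if the network is so poor that the SQL thread has to wait for the I/O thread, the two positions can be equal and the delay is wrongly reported as 0.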

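The else branch computes:
        Seconds_Behind_Master=(time(0)-last_master_timestamp)-clock_diff_with_master

time(0) is the current time on the slave
last_master_timestamp is the time at which the event was executed on the master
clock_diff_with_master is the master/slave system clock difference described in point 4 of the summary of the official documentation above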
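
The branch logic of the excerpt above can be modelled in a few lines of Python. This is a rough sketch rather than MySQL's implementation: the function name, the tuple representation of binlog positions, and the io_connected flag (standing in for the MYSQL_SLAVE_RUN_CONNECT check) are assumptions made for illustration, and None plays the role of NULL.

```python
from typing import Optional, Tuple

Position = Tuple[str, int]  # (binlog file name, offset), hypothetical representation


def seconds_behind_master(
    io_pos: Position,             # what the I/O thread has read from the master
    sql_pos: Position,            # what the SQL thread has executed, in master coordinates
    io_connected: bool,           # stands in for slave_running == MYSQL_SLAVE_RUN_CONNECT
    now: int,                     # time(0) on the slave
    last_master_timestamp: int,
    clock_diff_with_master: int,
) -> Optional[int]:
    if io_pos == sql_pos:
        # SQL thread has caught up with the I/O thread: 0 or NULL.
        return 0 if io_connected else None
    if last_master_timestamp == 0:
        return 0
    time_diff = (now - last_master_timestamp) - clock_diff_with_master
    return max(0, time_diff)


# Slow-network case: the I/O thread itself is far behind the master, but the
# SQL thread has consumed everything it was given, so the result is 0.
print(seconds_behind_master(("binlog.000003", 1200), ("binlog.000003", 1200),
                            True, now=1000, last_master_timestamp=400,
                            clock_diff_with_master=0))
```

The example call shows why a value of 0 says nothing about how far the I/O thread is behind the master: the timestamps are never consulted once the two positions match.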

So, in the ideal case, the equation above can even be simplified as follows:
        Seconds_Behind_Master=(time(0)-last_master_timestamp)-clock_diff_with_master
                             =(slave_current_time-last_master_timestamp)-(slave_current_time-master_current_time)
                             =master_current_time-last_master_timestamp
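
A quick numeric check of this simplification, and of what happens when the "ideal case" assumption breaks. All numbers are invented; reported_lag is a hypothetical helper that only mirrors the formula, and clock_diff is taken as slave clock minus master clock, matching the derivation.

```python
def reported_lag(slave_now, last_master_timestamp, clock_diff_with_master):
    # Same arithmetic as the else branch of the excerpt.
    return max(0, (slave_now - last_master_timestamp) - clock_diff_with_master)


master_now = 1000
last_master_timestamp = 940
clock_diff = 5  # slave clock minus master clock, measured when the I/O thread started

# Ideal case: the slave clock is still exactly 5s ahead of the master.
ideal = reported_lag(master_now + 5, last_master_timestamp, clock_diff)

# The slave clock is later stepped 3s forward (for example by NTP),
# but clock_diff is not measured again.
skewed = reported_lag(master_now + 5 + 3, last_master_timestamp, clock_diff)

print(ideal, master_now - last_master_timestamp, skewed)
```

In the ideal case the result equals master_current_time - last_master_timestamp; after the clock step, the reported value is off by exactly the size of the step.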

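About last_master_timestamp:
    rli->last_master_timestamp = ev->when.tv_sec + (time_t) ev->exec_time;
        Here ev->when.tv_sec is the time the event started and exec_time is how long the event took to execute on the master. Only Query_log_event and Load_log_event record exec_time (DML logged in row format does not).
        In other words, last_master_timestamp equals the event header timestamp plus exec_time; for an event without exec_time, it is essentially just the event header timestamp.
        There is one more case: while the SQL thread is waiting for the I/O thread to fetch more binlog, last_master_timestamp is set to 0. By the calculation above Seconds_Behind_Master is then 0, and the slave is considered to have no delay.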
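
The effect of exec_time can be sketched as follows. The Event class and the numbers are hypothetical; the only thing taken from the text is the rule last_master_timestamp = header timestamp + exec_time, with exec_time left at 0 for events that do not record it.

```python
from dataclasses import dataclass


@dataclass
class Event:
    when: int           # event header timestamp in seconds (ev->when.tv_sec)
    exec_time: int = 0  # seconds spent executing on the master; 0 if not recorded


def last_master_timestamp(ev: Event) -> int:
    return ev.when + ev.exec_time


def lag(ev: Event, slave_now: int, clock_diff: int = 0) -> int:
    return max(0, slave_now - last_master_timestamp(ev) - clock_diff)


# The same 30-second change, started on the master at t=1000,
# and looked at from the slave at t=1040:
query_event = Event(when=1000, exec_time=30)  # e.g. a Query_log_event
rows_event = Event(when=1000)                 # row-format DML, no exec_time

print(lag(query_event, slave_now=1040))
print(lag(rows_event, slave_now=1040))
```

Under these assumptions the event that carries exec_time reports a 10-second delay while the one without it reports 40 seconds, so the same workload can look more delayed simply because of how it was logged.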
 

In parallel replication mode the rule for computing last_master_timestamp is slightly different; for details, see (https://yq.aliyun.com/articles/11032)


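So, to monitor replication delay, we can compare File and Position from show master status\G on the master with Master_Log_File and Read_Master_Log_Pos, and with Relay_Master_Log_File and Exec_Master_Log_Pos, from show slave status\G on the slave.
This tells us how far the slave's I/O thread and SQL thread have each progressed.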
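
A minimal sketch of that comparison, assuming the two status outputs have already been fetched and turned into dictionaries keyed by the column names. The function name compare_positions, the result keys, and the sample values are made up; no database connection is involved.

```python
def compare_positions(master_status, slave_status):
    """Compare binlog coordinates from SHOW MASTER STATUS and SHOW SLAVE STATUS."""
    master_pos = (master_status["File"], int(master_status["Position"]))
    read_pos = (slave_status["Master_Log_File"],
                int(slave_status["Read_Master_Log_Pos"]))
    exec_pos = (slave_status["Relay_Master_Log_File"],
                int(slave_status["Exec_Master_Log_Pos"]))

    def gap(ahead, behind):
        # A byte gap is only meaningful within the same binlog file.
        return ahead[1] - behind[1] if ahead[0] == behind[0] else None

    return {
        "io_thread_caught_up": read_pos == master_pos,
        "sql_thread_caught_up": exec_pos == read_pos,
        "io_gap_bytes": gap(master_pos, read_pos),
        "sql_gap_bytes": gap(read_pos, exec_pos),
    }


# Invented sample values
master = {"File": "mysql-bin.000012", "Position": 5000}
slave = {
    "Master_Log_File": "mysql-bin.000012", "Read_Master_Log_Pos": 4200,
    "Relay_Master_Log_File": "mysql-bin.000011", "Exec_Master_Log_Pos": 98000,
}
print(compare_positions(master, slave))
```

Because the two statements run on different servers at slightly different moments, a small gap on a busy master is expected; what matters is whether the gaps keep growing.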

Alternatively, we can use pt-heartbeat for real-time monitoring.
