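Dual-master replication plus keepalived is a relatively simple MySQL high-availability architecture that suits small and medium-sized MySQL clusters. This article walks through how to use keepalived to make MySQL highly available.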
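1 Overview
1.1 About keepalived
In short, keepalived makes machines highly available by managing a VIP (virtual IP). With keepalived in place, only one server serves traffic at any time, namely the one that currently holds the VIP. When that Master host goes down, the VIP automatically floats to another server.
keepalived works in a Master/Backup model built on VRRP: the VIP is declared in the configuration file and held by the Master, and when the Master goes down the VIP moves to another keepalived server automatically.
keepalived can be used to build high-availability clusters for all kinds of software. It continuously checks the state of the servers; if a server goes down or malfunctions, keepalived detects it, removes the faulty server from the system, and lets the other servers take over its work. Once the server is healthy again, keepalived automatically adds it back to the server group.
1.2 keepalived with dual masters
With its default configuration, keepalived only provides host-level high availability. To make MySQL highly available we need at least the following on top of that:
- Detect the state of the MySQL service
- Keep read_only=0 on the master node and read_only=1 on the standby node
- On switchover, have the standby wait until it has applied everything it received from the master
So keepalived needs custom scripts to provide MySQL high availability.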
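2 Environment
2.1 Database environment
A dual-master (master-master) MySQL setup was prepared beforehand; for how to build it, see the earlier article "MySQL Cluster Setup (2): Master-Master-Slave Mode".
Node information
IP | OS | Port | MySQL version | Node | Read/write | Description |
---|---|---|---|---|---|---|
10.0.0.247 | CentOS 6.5 | 3306 | 5.7.9 | Master | Read-write | Primary node |
10.0.0.248 | CentOS 6.5 | 3306 | 5.7.9 | Standby | Read-only, can be switched to read-write | Standby master node |
VIP information
Name | VIP | Type |
---|---|---|
RW-VIP | 10.0.0.237 | Read-write VIP |
Master reference configuration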
[client]
port = 3306
default-character-set=utf8mb4
socket = /data/mysql_db/test_db/mysql.sock
[mysqld]
datadir = /data/mysql_db/test_db
basedir = /usr/local/mysql57
tmpdir = /tmp
socket = /data/mysql_db/test_db/mysql.sock
pid-file = /data/mysql_db/test_db/mysql.pid
skip-external-locking = 1
skip-name-resolve = 1
port = 3306
server_id = 2473306
default-storage-engine = InnoDB
character-set-server = utf8mb4
default_password_lifetime=0
auto_increment_offset = 1
auto_increment_increment = 2
#### log ####
log_timestamps=system
log_bin = /data/mysql_log/test_db/mysql-bin
log_bin_index = /data/mysql_log/test_db/mysql-bin.index
binlog_format = row
relay_log_recovery=ON
relay_log=/data/mysql_log/test_db/mysql-relay-bin
relay_log_index=/data/mysql_log/test_db/mysql-relay-bin.index
log_error = /data/mysql_log/test_db/mysql-error.log
#### replication ####
log_slave_updates = 1
replicate_wild_ignore_table = information_schema.%,performance_schema.%,sys.%
#### semi sync replication settings #####
plugin_dir=/usr/local/mysql57/lib/plugin
plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
loose_rpl_semi_sync_master_enabled = 1
loose_rpl_semi_sync_slave_enabled = 1
Slave (Standby) reference configuration
[client]
port = 3306
default-character-set=utf8mb4
socket = /data/mysql_db/test_db/mysql.sock
[mysqld]
datadir = /data/mysql_db/test_db
basedir = /usr/local/mysql57
tmpdir = /tmp
socket = /data/mysql_db/test_db/mysql.sock
pid-file = /data/mysql_db/test_db/mysql.pid
skip-external-locking = 1
skip-name-resolve = 1
port = 3306
server_id = 2483306
default-storage-engine = InnoDB
character-set-server = utf8mb4
default_password_lifetime=0
auto_increment_offset = 2
auto_increment_increment = 2
#### log ####
log_timestamps=system
log_bin = /data/mysql_log/test_db/mysql-bin
log_bin_index = /data/mysql_log/test_db/mysql-bin.index
binlog_format = row
relay_log_recovery=ON
relay_log=/data/mysql_log/test_db/mysql-relay-bin
relay_log_index=/data/mysql_log/test_db/mysql-relay-bin.index
log_error = /data/mysql_log/test_db/mysql-error.log
#### replication ####
log_slave_updates = 1
replicate_wild_ignore_table = information_schema.%,performance_schema.%,sys.%
#### semi sync replication settings #####
plugin_dir=/usr/local/mysql57/lib/plugin
plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
loose_rpl_semi_sync_master_enabled = 1
loose_rpl_semi_sync_slave_enabled = 1
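The two configurations differ only in server_id and auto_increment_offset. As a small worked example of why offset 1 / offset 2 with an increment of 2 keeps auto-increment ids from colliding when both nodes accept writes, the sketch below lists the ids each node would hand out. The helper name auto_inc_ids is made up for illustration, and the arithmetic assumes MySQL's usual rule of offset + N * increment.

```shell
#!/bin/bash
# auto_inc_ids OFFSET INCREMENT COUNT
# Prints the first COUNT ids a node would generate:
# OFFSET, OFFSET + INCREMENT, OFFSET + 2 * INCREMENT, ...
auto_inc_ids() {
    local offset=$1 increment=$2 count=$3
    local i=0 out=""
    while [ "$i" -lt "$count" ]; do
        out="${out}${out:+ }$((offset + i * increment))"
        i=$((i + 1))
    done
    echo "$out"
}

echo "Master : $(auto_inc_ids 1 2 5)"   # Master : 1 3 5 7 9
echo "Standby: $(auto_inc_ids 2 2 5)"   # Standby: 2 4 6 8 10
```

The Master only produces odd ids and the Standby only even ids, so rows inserted on either side never clash on the primary key.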
2.2 Create the monitoring account
- This is a test environment, so the account and password are kept deliberately simple
create user monitor@'localhost' identified by 'monitor';
grant all on *.* to monitor@'localhost';
flush privileges;
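Outside a test environment, "grant all" is broader than the scripts in section 3 appear to need. The sketch below prints a narrower GRANT statement that could be used in its place. The privilege list is an assumption based on what those scripts execute on MySQL 5.7 (SUPER for set global read_only and KILL, PROCESS for reading the full process list, REPLICATION CLIENT for show slave status, FILE for into outfile), and the helper name monitor_grant_sql is made up for illustration.

```shell
#!/bin/bash
# monitor_grant_sql [USER] [HOST]
# Prints a GRANT statement with a narrower privilege set than "grant all".
# The privilege list is an assumption, not something the article verified.
monitor_grant_sql() {
    local user="${1:-monitor}"
    local host="${2:-localhost}"
    echo "GRANT SUPER, PROCESS, REPLICATION CLIENT, FILE ON *.* TO '${user}'@'${host}';"
}

# Print the statement; pipe it into the mysql client to apply it.
monitor_grant_sql monitor localhost
```

Verify the list against your own scripts before relying on it, since any extra statement you add to them may need another privilege.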
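2.3 Install keepalived
Deploy keepalived on both the Master and the Slave.
1). Install with yum
If a suitable yum repository is available, install it directly
yum install -y keepalived
2). Install from source
Download the source package from the keepalived website; version 1.2.24 is used as the example here
# Install dependencies (libssl-dev is a Debian package name and is not needed with yum)
yum install -y gcc popt-devel openssl openssl-devel libnl-devel libnfnetlink-devel
# Download the package
wget http://www.keepalived.org/software/keepalived-1.2.24.tar.gz
# Unpack and install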
tar -xvz -f keepalived-1.2.24.tar.gz
cd keepalived-1.2.24
./configure --prefix=/usr/local/keepalived
make && make install
cp /usr/local/keepalived/sbin/keepalived /usr/sbin/
cp /usr/local/keepalived/etc/rc.d/init.d/keepalived /etc/init.d/
cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/
mkdir /etc/keepalived/
cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/
3 Configure high availability
3.1 keepalived configuration
Open /etc/keepalived/keepalived.conf and add the configuration below, adjusted to your environment
global_defs {
    router_id MYSQL_MM          # identifier
    vrrp_skip_check_adv_addr
    vrrp_strict                 # strictly enforce the VRRP protocol specification
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}
vrrp_script check_mysql {
    script "/bin/sh /etc/keepalived/keepalived_mysql_check.sh"    # check script
    interval 10                 # check interval, in seconds
}
vrrp_instance MYSQL_MM {
    state BACKUP                # set both nodes to BACKUP so neither preempts after starting
    interface eth0              # NIC name; use the actual one
    virtual_router_id 243       # identifies the VRRP group; 1-255, must be the same on both nodes
    priority 100
    advert_int 1
    nopreempt                   # non-preemptive mode
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    # On the Master node you can comment out the next line so the script does not run when keepalived starts
    notify_master "/bin/sh /etc/keepalived/keepalived_mysql_start.sh"    # runs when this node becomes MASTER
    virtual_ipaddress {
        10.0.0.237
    }
    # On the Slave node you can comment out the check script below; the Slave does not need to check continuously
    track_script {
        check_mysql
    }
}
3.2 Check script
Open /etc/keepalived/keepalived_mysql_check.sh and write the check script
#!/bin/bash
# @Author: chengqm
# MySQL check script (uses bash syntax such as "function" and "for ((...))")
MyPath=$(cd "$(dirname "$0")"; pwd)
cd "$MyPath"
ThisTime=`date '+%F %T'`
log_file='/var/log/keepalived_mysql.log'
# MySQL connection settings; adjust to your environment
export MYSQL_PWD='monitor'
MYSQL_USER='monitor'
MYSQL_SOCKET="/data/mysql_db/test_db/mysql.sock"
mysql_connect="mysql -u${MYSQL_USER} -S${MYSQL_SOCKET} "
# Formatted log output
function techo() {
    message=$1
    message_level=$2
    # Default to "info" when no level is given
    if [ -z "$message_level" ];then
        message_level='info'
    fi
    echo "`date '+%F %T'` - [${message_level}] $message" >> $log_file
}
# Health check; returns 0 when MySQL is healthy
function check {
    ret=`$mysql_connect -N -e 'select 1 as value'`
    if [ $? -ne 0 ] || [ "$ret" != "1" ];then
        return 1
    else
        return 0
    fi
}
# Set read_only; returns the exit status of the mysql client, not of techo
function read_only {
    param=$1
    $mysql_connect -e "set global read_only = ${param}"
    rc=$?
    techo "Setting read_only ${param}"
    return $rc
}
# Failover
function failover {
    techo "Starting failover"
    # 1. Stop keepalived so the VIP is released
    killall keepalived
    # 2. If MySQL still responds, set it to read_only
    read_only 1
    if [ $? -eq 0 ];then
        # 3. If that worked, kill all existing connections.
        #    The file name must not contain spaces, so it uses its own timestamp.
        kill_file="/tmp/kill.txt.$(date '+%Y%m%d%H%M%S')"
        $mysql_connect -e "select concat('KILL ',id,';') from information_schema.processlist where user!='root' AND db is not null into outfile '${kill_file}';"
        if [ $? -eq 0 ];then
            $mysql_connect -e "source ${kill_file};"
        fi
    fi
    # 4. Other actions, for example shutting the machine down automatically
    techo "Failover completed, this database is closed to access"
}
# On failure, check up to 4 times
for ((i=1; i<=4; i ++))
do
    check
    if [ $? -eq 0 ];then
        techo "MySQL is ok"
        # Healthy: exit the script normally
        exit 0
    else
        techo "Connection failed $i time(s)"
        sleep 1
    fi
done
techo 'Cannot connect to the current database'
# Failover
failover
Note: this script has not been rigorously tested; adjust it to your own environment
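The retry loop at the end of the check script (try up to 4 times, sleep 1 second after each failure, then fail over) can be isolated into a small helper, which makes it easy to exercise without a database. This is a minimal sketch rather than a drop-in replacement: the name retry_check and its argument order are made up for illustration, and true/false stand in for the real check function.

```shell
#!/bin/bash
# retry_check ATTEMPTS DELAY COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds,
# or 1 once it has failed ATTEMPTS times in a row.
retry_check() {
    local attempts=$1 delay=$2 i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "Connection failed $i time(s)" >&2
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Stand-in checks so the sketch runs without MySQL.
retry_check 4 0 true  && echo "MySQL is ok"
retry_check 2 0 false || echo "would start failover here"
```

In the real script the call would look like `retry_check 4 1 check || failover`, keeping the same 4 attempts and 1 second pause.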
3.3 Script to run on promotion to Master
Open /etc/keepalived/keepalived_mysql_start.sh and write the script
#!/bin/bash
# @Author: chengqm
# Runs when keepalived becomes MASTER
MyPath=$(cd "$(dirname "$0")"; pwd)
cd "$MyPath"
ThisTime=`date '+%F %T'`
log_file='/var/log/keepalived_mysql.log'
# MySQL connection settings; adjust to your environment
export MYSQL_PWD='monitor'
MYSQL_USER='monitor'
MYSQL_SOCKET="/data/mysql_db/test_db/mysql.sock"
mysql_connect="mysql -u${MYSQL_USER} -S${MYSQL_SOCKET} "
# Formatted log output
function techo() {
    message=$1
    message_level=$2
    # Default to "info" when no level is given
    if [ -z "$message_level" ];then
        message_level='info'
    fi
    echo "`date '+%F %T'` - [${message_level}] $message" >> $log_file
}
# Health check; returns 0 when MySQL is healthy
function check {
    ret=`$mysql_connect -N -e 'select 1 as value'`
    if [ $? -ne 0 ] || [ "$ret" != "1" ];then
        return 1
    else
        return 0
    fi
}
# Collect fields from "show slave status"
function slave_info() {
    tmp_file=/tmp/slave_info.tmp
    $mysql_connect -e 'show slave status\G' > $tmp_file
    slave_sql=`grep 'Slave_SQL_Running:' $tmp_file | sed 's/\s*//g' | tr "A-Z" "a-z" | awk -F":" '{print $2}'`
    seconds_behind_master=`grep 'Seconds_Behind_Master:' $tmp_file | sed 's/\s*//g' | tr "A-Z" "a-z" | awk -F":" '{print $2}'`
    # Binlog file names are not lowercased, so they stay valid for master_pos_wait
    master_log_file=`grep 'Master_Log_File:' $tmp_file | head -1 | sed 's/\s*//g' | awk -F":" '{print $2}'`
    master_log_pos=`grep 'Read_Master_Log_Pos:' $tmp_file | sed 's/\s*//g' | tr "A-Z" "a-z" | awk -F":" '{print $2}'`
    relay_master_log_file=`grep 'Relay_Master_Log_File:' $tmp_file | sed 's/\s*//g' | awk -F":" '{print $2}'`
    exec_master_log_pos=`grep 'Exec_Master_Log_Pos:' $tmp_file | sed 's/\s*//g' | tr "A-Z" "a-z" | awk -F":" '{print $2}'`
}
# Set read_only; returns the exit status of the mysql client, not of techo
function read_only {
    param=$1
    $mysql_connect -e "set global read_only = ${param}"
    rc=$?
    techo "Setting read_only ${param}"
    return $rc
}
# Handle data synchronization
function sync_master_log() {
    # If data consistency comes first, wait until replication has caught up.
    # If service availability comes first, the code below can be commented out.
    slave_info
    if [ "$slave_sql" == "yes" ];then
        # Applied position first, then the received position we wait for
        techo "Current sync position Master ${relay_master_log_file} ${exec_master_log_pos}"
        techo "Waiting to sync to Master ${master_log_file} ${master_log_pos}"
        $mysql_connect -e "select master_pos_wait('$master_log_file', $master_log_pos);" > /dev/null
        techo "Sync finished"
    fi
}
techo "Promoting the current database to master"
check
if [ $? -ne 0 ];then
    techo "Cannot connect to the current database"
    exit 1
fi
# Wait for replication to catch up
sync_master_log
# Make the node writable
read_only 0
Note: this script has not been rigorously tested; adjust it to your own environment
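The grep | sed | tr | awk chain in slave_info can be collapsed into a single awk call that matches the field name exactly, so Master_Log_File no longer needs head -1 to avoid also matching Relay_Master_Log_File. A minimal sketch, assuming the usual "show slave status\G" layout of right-aligned "Name: value" lines; the helper name slave_field is made up for illustration, and the sample text reuses the positions from the log in section 4.

```shell
#!/bin/bash
# slave_field NAME
# Reads "show slave status\G" output on stdin and prints the value of NAME.
slave_field() {
    awk -v key="$1" '
    {
        line = $0
        sub(/^[ \t]+/, "", line)            # drop the right-alignment padding
        if (index(line, key ":") == 1) {    # line starts with exactly "NAME:"
            value = substr(line, length(key) + 2)
            sub(/^[ \t]+/, "", value)
            print value
            exit
        }
    }'
}

# Sample text shaped like the real output, so the sketch runs without MySQL.
sample='              Master_Log_File: mysql-bin.000015
          Read_Master_Log_Pos: 32338
        Relay_Master_Log_File: mysql-bin.000015
            Slave_SQL_Running: Yes
          Exec_Master_Log_Pos: 32338'

printf '%s\n' "$sample" | slave_field Master_Log_File      # mysql-bin.000015
printf '%s\n' "$sample" | slave_field Exec_Master_Log_Pos  # 32338
```

Against a live server the input would come from `$mysql_connect -e 'show slave status\G' | slave_field Slave_SQL_Running`.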
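3.4 Start keepalived
Because both nodes are configured as BACKUP, whichever keepalived starts first becomes the master. Run the following on the master node first, then on the standby node
/etc/init.d/keepalived start
Check the /var/log/messages log and confirm that keepalived reports no errors
Check the IP addresses on the Master and confirm that the VIP is bound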
[root@cluster01 shell]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether fa:16:3e:de:80:33 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.247/16 brd 10.0.255.255 scope global eth0
inet 10.0.0.237/32 scope global eth0
inet6 fe80::f816:3eff:fede:8033/64 scope link
valid_lft forever preferred_lft forever
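To script this check instead of reading the output by eye, the VIP can be searched for in the ip addr output. A minimal sketch: the helper name has_vip is made up for illustration, and it assumes the "inet ADDRESS/PREFIX" layout shown above.

```shell
#!/bin/bash
# has_vip VIP
# Reads `ip addr` output on stdin; returns 0 if VIP is bound, 1 otherwise.
has_vip() {
    # -F: match the address as a fixed string, so the dots are not wildcards
    grep -qF "inet $1/"
}

# Sample lines copied from the output above, so the sketch runs anywhere.
sample_output='    inet 10.0.0.247/16 brd 10.0.255.255 scope global eth0
    inet 10.0.0.237/32 scope global eth0'

if printf '%s\n' "$sample_output" | has_vip 10.0.0.237; then
    echo "VIP is bound on this node"
fi
```

On a real node the call would be `ip addr show eth0 | has_vip 10.0.0.237`.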
Check the log written by the MySQL check script and confirm that it is running normally
[root@cluster01 ~]# tail -f /var/log/keepalived_mysql.log
...
2019-01-28 15:04:18 - [info] MySQL is ok
2019-01-28 15:04:28 - [info] MySQL is ok
4 Failover test
Create a test table nowdate in the mytest database with only two columns, id and ctime, then insert one row per second
[root@cluster03 ~]# while true; do date;mysql -h10.0.0.237 -P3306 -umytest -e 'use mytest;insert into nowdate values (null, now());'; sleep 1;done
Mon Jan 28 15:04:26 CST 2019
Mon Jan 28 15:04:27 CST 2019
...
Kill the mysqld process on the Master
killall mysqld
Check the log on the old Master
2019-01-28 15:04:48 - [info] MySQL is ok
2019-01-28 15:04:58 - [info] Connection failed 1 time(s)
2019-01-28 15:04:59 - [info] Connection failed 2 time(s)
2019-01-28 15:05:00 - [info] Connection failed 3 time(s)
2019-01-28 15:05:01 - [info] Connection failed 4 time(s)
2019-01-28 15:05:02 - [info] Cannot connect to the current database
2019-01-28 15:05:02 - [info] Starting failover
2019-01-28 15:05:02 - [info] Setting read_only 1
2019-01-28 15:05:02 - [info] Failover completed, this database is closed to access
Check the log on the new Master
2019-01-28 15:05:04 - [info] Promoting the current database to master
2019-01-28 15:05:04 - [info] Current sync position Master mysql-bin.000015 32338
2019-01-28 15:05:04 - [info] Waiting to sync to Master mysql-bin.000015 32338
2019-01-28 15:05:04 - [info] Sync finished
2019-01-28 15:05:04 - [info] Setting read_only 0
2019-01-28 15:05:05 - [info] MySQL is ok
Check the IP addresses on the new Master and confirm that the VIP has floated over
[root@cluster02 ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether fa:16:3e:66:7e:e8 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.248/16 brd 10.0.255.255 scope global eth0
inet 10.0.0.237/32 scope global eth0
inet6 fe80::f816:3eff:fe66:7ee8/64 scope link
valid_lft forever preferred_lft forever
Check how the inserts went: the service was unavailable for roughly 12 seconds
Mon Jan 28 15:04:51 CST 2019
ERROR 2003 (HY000): Can't connect to MySQL server on '10.0.0.237' (111)
Mon Jan 28 15:04:52 CST 2019
ERROR 2003 (HY000): Can't connect to MySQL server on '10.0.0.237' (111)
Mon Jan 28 15:04:53 CST 2019
ERROR 2003 (HY000): Can't connect to MySQL server on '10.0.0.237' (111)
Mon Jan 28 15:04:54 CST 2019
ERROR 2003 (HY000): Can't connect to MySQL server on '10.0.0.237' (111)
Mon Jan 28 15:04:55 CST 2019
ERROR 2003 (HY000): Can't connect to MySQL server on '10.0.0.237' (111)
Mon Jan 28 15:04:56 CST 2019
ERROR 2003 (HY000): Can't connect to MySQL server on '10.0.0.237' (111)
Mon Jan 28 15:04:57 CST 2019
ERROR 2003 (HY000): Can't connect to MySQL server on '10.0.0.237' (111)
Mon Jan 28 15:04:58 CST 2019
ERROR 2003 (HY000): Can't connect to MySQL server on '10.0.0.237' (111)
Mon Jan 28 15:05:00 CST 2019
ERROR 2003 (HY000): Can't connect to MySQL server on '10.0.0.237' (111)
Mon Jan 28 15:05:01 CST 2019
ERROR 2003 (HY000): Can't connect to MySQL server on '10.0.0.237' (111)
Mon Jan 28 15:05:02 CST 2019
ERROR 2003 (HY000): Can't connect to MySQL server on '10.0.0.237' (111)
Mon Jan 28 15:05:03 CST 2019
Failover succeeded
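The roughly 12 seconds of downtime follows from the settings used here: the check script runs every 10 seconds (interval 10), and once a check fails it retries 4 times with a 1 second sleep before failing over. A rough worked example of the worst-case detection time is below. It is only an estimate: it ignores the time the VIP takeover and the promotion script need, and the helper name worst_case_detection is made up for illustration.

```shell
#!/bin/bash
# worst_case_detection INTERVAL RETRIES RETRY_SLEEP
# Seconds from a crash right after a successful check until failover starts:
# wait for the next check (INTERVAL), then RETRIES attempts RETRY_SLEEP apart.
worst_case_detection() {
    echo $(( $1 + $2 * $3 ))
}

echo "worst-case detection: $(worst_case_detection 10 4 1)s"   # worst-case detection: 14s
```

In the test above the last "MySQL is ok" was logged at 15:04:48, the first failure at 15:04:58 and the failover at 15:05:02, which fits inside that 14 second bound. Lowering interval or the retry count shortens the outage at the cost of more frequent checks and a higher chance of failing over on a transient error.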
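5 Summary
The advantage of dual-master plus keepalived is that it is simple to deploy; with dual masters and semi-synchronous replication, in theory no data is lost, and it suits small and medium-sized MySQL clusters. The drawbacks are also fairly obvious: if additional slave nodes are added, they do not automatically repoint replication to the new master after a switchover, and the scripts have to be written yourself, which carries some risk.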