MySQL MHA Configuration Guide
Environment:
Master: 10.100.251.221:3306
Slave1: 10.100.251.222:3306 (candidate master)
Slave2: 10.100.251.223:3306 (+ MHA Manager)
VIP: 10.100.251.228
1. Configure master-slave replication
I. Master server
1.1 Create a replication user with the REPLICATION SLAVE privilege.
mysql> grant replication slave on *.* to 'slave001'@'%' identified by 'slave001';
1.2 Edit my.cnf
vi /etc/my.cnf
Add:
server-id=221
1.3 Enable the binary log (log-bin)
log-bin=mysql-bin
1.4 Start the MySQL database
service mysql start
1.5 Get the binlog file name and offset
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000007 |      551 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1.6 Back up the databases to be replicated
mysqldump -uroot -p --lock-tables --events --triggers --routines --flush-logs --master-data=2 --databases repdb > /tmp/backup/db.sql
Or:
mysqldump --master-data=2 --single-transaction --events -R --flush-logs --triggers --databases repdb > all.sql
1.7 Check the position
[root@mytest01 backup]# grep MASTER /tmp/backup/db.sql
-- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000009', MASTER_LOG_POS=120;
[root@mytest01 data]#
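The file name and position found by the grep above are the values the slave later passes to CHANGE MASTER TO. As a minimal sketch, assuming the dump was taken with --master-data=2 so that the statement appears as a comment line in the format shown above, a small shell helper can extract both values. The function name binlog_coords is hypothetical and not part of MySQL or MHA.

```shell
#!/bin/bash
# Hypothetical helper: print "<binlog file> <position>" taken from the
# commented CHANGE MASTER TO line that mysqldump --master-data=2 writes.
# $1 is the path of the dump file.
binlog_coords() {
    grep -m1 'CHANGE MASTER TO' "$1" |
        sed -n "s/.*MASTER_LOG_FILE='\([^']*\)', *MASTER_LOG_POS=\([0-9]*\).*/\1 \2/p"
}

# Example (path taken from step 1.6):
#   binlog_coords /tmp/backup/db.sql
```

The two fields can then be read into variables, for example with `read -r file pos < <(binlog_coords /tmp/backup/db.sql)`.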
II. Slave server
2.0 Create a replication user with the REPLICATION SLAVE privilege.
mysql> grant replication slave on *.* to 'slave001'@'%' identified by 'slave001';
2.1 Edit my.cnf
vi /etc/my.cnf
Add:
server-id=222
log-bin = mysql-bin
log-bin-index = mysql-bin.index
read_only=1
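relay_log_purge=0 # Not needed with one master and one slave. Recommended with two or more slaves, so that a slave promoted to master does not automatically delete relay logs that the other slaves still need to apply.
Note: remove the default server-id=1.
Do not try to put the master connection settings in my.cnf; according to the original author, MySQL has not supported this since 5.1.7.
2.2 Start the slave database
service mysql start
2.3 Import the production data into the slave:
mysql -uroot -proot123 < /tmp/backup/db.sql
2.4 Configure replication on the slave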
mysql> change master to
master_host='10.100.251.221',
master_user='slave001',
master_password='slave001',
master_log_file='mysql-bin.000009',
master_log_pos=120;
2.5 Start the slave threads
mysql> start slave;
Running show processlist shows a process like the following:
mysql> show processlist\G
*************************** 2. row ***************************
Id: 2
User: system user
Host:
db: NULL
Command: Connect
Time: 2579
State: Has read all relay log; waiting for the slave I/O thread to update it
Info: NULL
This indicates that the slave has connected to the master and has started receiving and executing logs.
2.6 Check the slave thread status
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.100.251.221
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.0000010
Read_Master_Log_Pos: 106
Relay_Log_File: centos-relay-bin.000002
Relay_Log_Pos: 529
Relay_Master_Log_File: mysql-bin.0000010
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 106
Relay_Log_Space: 830
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
1 row in set (0.00 sec)
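Verify the configuration
Run on the slave:
mysql> show slave status\G
Waiting for master to send event
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
If both Slave_IO_Running and Slave_SQL_Running show Yes, replication is configured correctly.
PS: do not add a ; after show slave status\G, otherwise an ERROR is reported.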
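The check described above (both threads reporting Yes) can be scripted. The following is only a sketch: replication_ok is a hypothetical helper that reads the text of show slave status\G on standard input, so it assumes the field names shown in the sample output above.

```shell
#!/bin/bash
# Hypothetical helper: read SHOW SLAVE STATUS\G output on stdin and
# return 0 only if both the I/O thread and the SQL thread report "Yes".
replication_ok() {
    awk -F': *' '
        /^ *Slave_IO_Running:/  { io = $2;  sub(/[ \t\r]+$/, "", io) }
        /^ *Slave_SQL_Running:/ { sql = $2; sub(/[ \t\r]+$/, "", sql) }
        END { exit !((io == "Yes") && (sql == "Yes")) }
    '
}

# Example (credentials are placeholders):
#   mysql -uroot -p -e 'show slave status\G' | replication_ok && echo "replication OK"
```

The patterns require the colon directly after the field name, so a field such as Slave_SQL_Running_State does not interfere.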
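2. Configure SSH public-key trust
I. Manager and node IPs in this example
manager: 10.100.251.223
node1: 10.100.251.221
node2: 10.100.251.222
node3: 10.100.251.223
Note: the manager can be installed on a dedicated server. To save machines, in this example it is installed on Slave2 (10.100.251.223).
II. Configure SSH public-key trust for the root user between the manager and all nodes
Configure passwordless SSH login
Note: log in as root and work in root's home directory. The commands are shown for one server; because masterha_check_ssh tests connections between every pair of nodes, repeat them on each node.
# ssh-keygen
# ssh-copy-id -p 22 root@10.100.251.221
# ssh-copy-id -p 22 root@10.100.251.222
# ssh-copy-id -p 22 root@10.100.251.223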
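3. Install the MHA packages
I. Download the MHA packages
Download URL:
https://code.google.com/p/mysql-master-ha/wiki/Downloads?tm=2
Either the rpm package or the tarball can be used; the rpm is recommended because it is simpler to install.
II. Install MHA Node
MHA Node must be installed on the manager and on every node.
rpm installation:
# yum install perl-DBD-MySQL
# rpm -ivh mha4mysql-node-0.56-0.el5.noarch.rpm
If dependencies are missing:
# cat install.sh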
#!/bin/bash
wget http://xrl.us/cpanm --no-check-certificate
mv cpanm /usr/bin
chmod 755 /usr/bin/cpanm
cat > /root/list << EOF
DBD::mysql
EOF
for package in `cat /root/list`
do
cpanm $package
done
Tarball installation:
tar -zxf mha4mysql-node-0.56.tar.gz
cd mha4mysql-node-0.56
perl Makefile.PL
make
make install
III. Install MHA Manager
# yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes -y
# rpm -ivh mha4mysql-node-0.56-0.el5.noarch.rpm
# rpm -ivh mha4mysql-manager-0.56-0.el5.noarch.rpm
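Note:
Some of the packages above can only be installed with yum after the EPEL repository has been added.
Alternative way to install MHA Manager:
MHA Manager 0.56 source installation from the tarball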
tar -zxf mha4mysql-manager-0.56.tar.gz
cd mha4mysql-manager-0.56
perl Makefile.PL
make
make install
4. MHA Manager configuration
Parameter reference: https://code.google.com/p/mysql-master-ha/wiki/Parameters#no_master
Run the MHA Manager configuration as the root operating-system user, because it starts and stops the VIP.
# mkdir -p /etc/masterha/app1
# vi /etc/masterha/app1/app1.cnf
[server default]
manager_workdir=/etc/masterha/app1
manager_log=/etc/masterha/app1/manager.log
user=root
password=root123
ssh_user=root
repl_user=slave001
repl_password=slave001
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 10.100.251.222 -s 10.100.251.223 --user=root --master_host=mytest01 --master_ip=10.100.251.221 --master_port=3306
ping_interval=3
# Run on master failover; not needed if no VIP is configured
master_ip_failover_script=/etc/masterha/app1/master_ip_failover
#shutdown_script=/etc/masterha/power_manager
# Run on master failover; optional
report_script=/etc/masterha/app1/send_report
# Run on master switchover; not needed if no VIP is configured
master_ip_online_change_script=/etc/masterha/app1/master_ip_online_change
[server1]
hostname=10.100.251.221
port=3306
master_binlog_dir=/usr/local/mysql/data/
candidate_master=1
check_repl_delay=0
[server2]
hostname=10.100.251.222
port=3306
master_binlog_dir=/usr/local/mysql/data/
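candidate_master=1
# If the candidate master lags by more than 100 MB of relay logs, the failover to it cannot succeed; check_repl_delay=0 makes MHA ignore the size of the delayed logs.
check_repl_delay=0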
[server3]
hostname=10.100.251.223
port=3306
master_binlog_dir=/usr/local/mysql/data/
# By default MHA is unusable if this node is down; with ignore_fail=1 it keeps working when this slave is down
ignore_fail=1
# Never promote this host to master
no_master=1
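Before running the checks that follow, it can help to list which hosts the configuration file actually declares, for example to ping each one. The sketch below assumes the key=value layout shown above; mha_hosts is a hypothetical helper, not an MHA tool.

```shell
#!/bin/bash
# Hypothetical helper: print every hostname= value found in an MHA
# application config file, one per line. $1 is the config file path.
# Trailing "# ..." comments and surrounding spaces are stripped.
mha_hosts() {
    awk -F'=' '
        /^[ \t]*hostname[ \t]*=/ {
            v = $2
            sub(/[ \t]*#.*/, "", v)   # drop a trailing comment, if any
            gsub(/[ \t]/, "", v)      # drop remaining whitespace
            print v
        }
    ' "$1"
}

# Example:
#   for h in $(mha_hosts /etc/masterha/app1/app1.cnf); do ping -c1 "$h"; done
```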
II. Check the SSH configuration
# masterha_check_ssh --conf=/etc/masterha/app1/app1.cnf
Mon May 22 11:27:51 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon May 22 11:27:51 2017 - [info] Reading application default configuration from /etc/masterha/app1/app1.cnf..
Mon May 22 11:27:51 2017 - [info] Reading server configuration from /etc/masterha/app1/app1.cnf..
Mon May 22 11:27:51 2017 - [info] Starting SSH connection tests..
Mon May 22 11:27:52 2017 - [debug]
Mon May 22 11:27:51 2017 - [debug] Connecting via SSH from root@10.100.251.221(10.100.251.221:22) to root@10.100.251.222(10.100.251.222:22)..
Mon May 22 11:27:51 2017 - [debug] ok.
Mon May 22 11:27:51 2017 - [debug] Connecting via SSH from root@10.100.251.221(10.100.251.221:22) to root@10.100.251.223(10.100.251.223:22)..
Mon May 22 11:27:52 2017 - [debug] ok.
Mon May 22 11:27:52 2017 - [debug]
Mon May 22 11:27:51 2017 - [debug] Connecting via SSH from root@10.100.251.222(10.100.251.222:22) to root@10.100.251.221(10.100.251.221:22)..
Mon May 22 11:27:51 2017 - [debug] ok.
Mon May 22 11:27:51 2017 - [debug] Connecting via SSH from root@10.100.251.222(10.100.251.222:22) to root@10.100.251.223(10.100.251.223:22)..
Mon May 22 11:27:52 2017 - [debug] ok.
Mon May 22 11:27:53 2017 - [debug]
Mon May 22 11:27:52 2017 - [debug] Connecting via SSH from root@10.100.251.223(10.100.251.223:22) to root@10.100.251.221(10.100.251.221:22)..
Mon May 22 11:27:52 2017 - [debug] ok.
Mon May 22 11:27:52 2017 - [debug] Connecting via SSH from root@10.100.251.223(10.100.251.223:22) to root@10.100.251.222(10.100.251.222:22)..
Mon May 22 11:27:53 2017 - [debug] ok.
Mon May 22 11:27:53 2017 - [info] All SSH connection tests passed successfully.
Success!
III. Check the MHA configuration
[root@mytest01 data]# masterha_check_repl --conf=/etc/masterha/app1/app1.cnf
Mon May 22 13:23:53 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon May 22 13:23:53 2017 - [info] Reading application default configuration from /etc/masterha/app1/app1.cnf..
Mon May 22 13:23:53 2017 - [info] Reading server configuration from /etc/masterha/app1/app1.cnf..
Mon May 22 13:23:53 2017 - [info] MHA::MasterMonitor version 0.56.
Mon May 22 13:23:54 2017 - [info] GTID failover mode = 0
Mon May 22 13:23:54 2017 - [info] Dead Servers:
Mon May 22 13:23:54 2017 - [info] Alive Servers:
Mon May 22 13:23:54 2017 - [info] 10.100.251.221(10.100.251.221:3306)
Mon May 22 13:23:54 2017 - [info] 10.100.251.222(10.100.251.222:3306)
Mon May 22 13:23:54 2017 - [info] 10.100.251.223(10.100.251.223:3306)
Mon May 22 13:23:54 2017 - [info] Alive Slaves:
Mon May 22 13:23:54 2017 - [info] 10.100.251.222(10.100.251.222:3306) Version=5.6.28-log (oldest major version between slaves) log-bin:enabled
Mon May 22 13:23:54 2017 - [info] Replicating from 10.100.251.221(10.100.251.221:3306)
Mon May 22 13:23:54 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Mon May 22 13:23:54 2017 - [info] 10.100.251.223(10.100.251.223:3306) Version=5.6.28-log (oldest major version between slaves) log-bin:enabled
Mon May 22 13:23:54 2017 - [info] Replicating from 10.100.251.221(10.100.251.221:3306)
Mon May 22 13:23:54 2017 - [info] Not candidate for the new Master (no_master is set)
Mon May 22 13:23:54 2017 - [info] Current Alive Master: 10.100.251.221(10.100.251.221:3306)
Mon May 22 13:23:54 2017 - [info] Checking slave configurations..
Mon May 22 13:23:54 2017 - [info] Checking replication filtering settings..
Mon May 22 13:23:54 2017 - [info] binlog_do_db= , binlog_ignore_db=
Mon May 22 13:23:54 2017 - [info] Replication filtering check ok.
Mon May 22 13:23:54 2017 - [info] GTID (with auto-pos) is not supported
Mon May 22 13:23:54 2017 - [info] Starting SSH connection tests..
Mon May 22 13:23:56 2017 - [info] All SSH connection tests passed successfully.
Mon May 22 13:23:56 2017 - [info] Checking MHA Node version..
Mon May 22 13:23:56 2017 - [info] Version check ok.
Mon May 22 13:23:56 2017 - [info] Checking SSH publickey authentication settings on the current master..
Mon May 22 13:23:57 2017 - [info] HealthCheck: SSH to 10.100.251.221 is reachable.
Mon May 22 13:23:57 2017 - [info] Master MHA Node version is 0.56.
Mon May 22 13:23:57 2017 - [info] Checking recovery script configurations on 10.100.251.221(10.100.251.221:3306)..
Mon May 22 13:23:57 2017 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/usr/local/mysql/data --output_file=/var/tmp/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.000009
Mon May 22 13:23:57 2017 - [info] Connecting to root@10.100.251.221(10.100.251.221:22)..
Creating /var/tmp if not exists.. ok.
Checking output directory is accessible or not..
ok.
Binlog found at /usr/local/mysql/data, up to mysql-bin.000009
Mon May 22 13:23:57 2017 - [info] Binlog setting check done.
Mon May 22 13:23:57 2017 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Mon May 22 13:23:57 2017 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=10.100.251.222 --slave_ip=10.100.251.222 --slave_port=3306 --workdir=/var/tmp --target_version=5.6.28-log --manager_version=0.56 --relay_log_info=/usr/local/mysql/data/relay-log.info --relay_dir=/usr/local/mysql/data/ --slave_pass=xxx
Mon May 22 13:23:57 2017 - [info] Connecting to root@10.100.251.222(10.100.251.222:22)..
Checking slave recovery environment settings..
Opening /usr/local/mysql/data/relay-log.info ... ok.
Relay log found at /usr/local/mysql/data, up to mysql-relay-bin.000003
Temporary relay log file is /usr/local/mysql/data/mysql-relay-bin.000003
Testing mysql connection and privileges.. done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Mon May 22 13:23:58 2017 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=10.100.251.223 --slave_ip=10.100.251.223 --slave_port=3306 --workdir=/var/tmp --target_version=5.6.28-log --manager_version=0.56 --relay_log_info=/usr/local/mysql/data/relay-log.info --relay_dir=/usr/local/mysql/data/ --slave_pass=xxx
Mon May 22 13:23:58 2017 - [info] Connecting to root@10.100.251.223(10.100.251.223:22)..
Checking slave recovery environment settings..
Opening /usr/local/mysql/data/relay-log.info ... ok.
Relay log found at /usr/local/mysql/data, up to mysql-relay-bin.000007
Temporary relay log file is /usr/local/mysql/data/mysql-relay-bin.000007
Testing mysql connection and privileges.. done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Mon May 22 13:23:58 2017 - [info] Slaves settings check done.
Mon May 22 13:23:58 2017 - [info]
10.100.251.221(10.100.251.221:3306) (current master)
+--10.100.251.222(10.100.251.222:3306)
+--10.100.251.223(10.100.251.223:3306)
Mon May 22 13:23:58 2017 - [info] Checking replication health on 10.100.251.222..
Mon May 22 13:23:58 2017 - [info] ok.
Mon May 22 13:23:58 2017 - [info] Checking replication health on 10.100.251.223..
Mon May 22 13:23:58 2017 - [info] ok.
Mon May 22 13:23:58 2017 - [warning] master_ip_failover_script is not defined.
Mon May 22 13:23:58 2017 - [warning] shutdown_script is not defined.
Mon May 22 13:23:58 2017 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
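IV. Routine MHA Manager operations
① Check whether the following files exist and delete them if they do.
After a failover, the MHA manager service stops automatically and creates the file app1.failover.complete in the manager_workdir directory. MHA cannot be started again until this file has been removed.
# ll /etc/masterha/app1/app1.failover.complete
# ll /etc/masterha/app1/app1.failover.error
② Check the current MHA configuration:
# masterha_check_repl --conf=/etc/masterha/app1/app1.cnf
③ Start MHA: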
# nohup masterha_manager --conf=/etc/masterha/app1/app1.cnf > /etc/masterha/app1/mha_manager.log 2>&1 &
By default MHA will not start when a slave node is down. With --ignore_fail_on_start it starts even if a node is down:
# nohup masterha_manager --conf=/etc/masterha/app1/app1.cnf --ignore_fail_on_start >/etc/masterha/app1/mha_manager.log 2>&1 &
④ Check the status:
# masterha_check_status --conf=/etc/masterha/app1/app1.cnf
⑤ Check the log:
# tail -f /etc/masterha/app1/manager.log
⑥ Follow-up work after a master switch
After the master has been switched, repair the old master into a new slave and then repeat the steps above. If the old master's data files are intact, the last CHANGE MASTER command can be found as follows:
# grep "CHANGE MASTER TO MASTER" /etc/masterha/app1/manager.log | tail -1
CHANGE MASTER TO MASTER_HOST='10.100.251.222',MASTER_PORT=3306, MASTER_LOG_FILE='master-bin.000001', MASTER_LOG_POS=120,MASTER_USER='slave001', MASTER_PASSWORD='xxx';
-- finally, start the new slave
mysql> start slave;
mysql> show slave status\G
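Step ① above (removing the leftover failover marker files before restarting the manager) can be wrapped in a small function. This is a sketch only: clear_failover_markers is a hypothetical name, and the file naming <app>.failover.complete / <app>.failover.error is taken from the paths listed in step ①.

```shell
#!/bin/bash
# Hypothetical helper: delete the failover marker files that block a
# restart of masterha_manager.
# $1 = manager_workdir, $2 = application name (e.g. app1).
clear_failover_markers() {
    workdir=$1
    app=$2
    for suffix in complete error; do
        f="$workdir/$app.failover.$suffix"
        if [ -e "$f" ]; then
            rm -f -- "$f" && echo "removed $f"
        fi
    done
}

# Example:
#   clear_failover_markers /etc/masterha/app1 app1
```

It is worth reading app1.failover.error before deleting it, since it marks a failover that did not complete.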
5. Failover scenario tests
Automatic failover test
Scenario 1:
When the master dies while MHA is running, the candidate master (a slave) is automatically promoted to master.
-- shut down the MySQL master node
# service mysql stop
-- check the new master node
mysql> show master status\G
*************************** 1. row ***************************
File: master-bin.000001
Position: 330
-- check the slave node
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.100.251.222
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: binlog.000010
Read_Master_Log_Pos: 360
Relay_Log_File: mariadbtest03-relay-bin.000002
Relay_Log_Pos: 534
Relay_Master_Log_File: binlog.000010
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
--check manager.log
[root@mariadbtest03 ~]# tail -100f /etc/masterha/app1/manager.log
----- Failover Report -----
app1: MySQL Master failover mariadbtest01(10.100.251.221:3306) to mariadbtest02(10.100.251.222:3306) succeeded
Master mariadbtest01(10.100.251.221:3306) is down!
Check MHA Manager logs at mariadbtest03 for details.
Started automated(non-interactive) failover.
Invalidated master IP address on mariadbtest01(10.100.251.221:3306)
The latest slave mariadbtest02(10.100.251.222:3306) has all relay logs for recovery.
Selected mariadbtest02(10.100.251.222:3306) as a new master.
mariadbtest02(10.100.251.222:3306): OK: Applying all logs succeeded.
mariadbtest02(10.100.251.222:3306): OK: Activated master IP address.
mariadbtest03(10.100.251.223:3306): This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
mariadbtest03(10.100.251.223:3306): OK: Applying all logs succeeded. Slave started, replicating from mariadbtest02(10.100.251.222:3306)
mariadbtest02(10.100.251.222:3306): Resetting slave info succeeded.
Master failover to mariadbtest02(10.100.251.222:3306) completed successfully.
-- finally, repair the old master into a new slave
# grep "CHANGE MASTER TO MASTER" /etc/masterha/app1/manager.log | tail -1
CHANGE MASTER TO MASTER_HOST='10.100.251.222', MASTER_PORT=3306, MASTER_LOG_FILE='binlog.000010', MASTER_LOG_POS=360, MASTER_USER='slave001', MASTER_PASSWORD='xxx';
mysql> CHANGE MASTER TO MASTER_HOST='10.100.251.222', MASTER_PORT=3306, MASTER_LOG_FILE='binlog.000010', MASTER_LOG_POS=360, MASTER_USER='slave001', MASTER_PASSWORD='slave001';
Query OK, 0 rows affected, 2 warnings (0.17 sec)
mysql> start slave;
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.100.251.222
Master_User: slave001
Manual failover
Scenario 2: the master is dead but MHA was not running at the time, so the failover is done by hand.
1. Check whether the following files exist and delete them if they do.
# ll /etc/masterha/app1/app1.failover.complete
# ll /etc/masterha/app1/app1.failover.error
2. If MHA is running, stop it first: masterha_stop --conf=/etc/masterha/app1/app1.cnf
3. Check the current MHA configuration: masterha_check_repl --conf=/etc/masterha/app1/app1.cnf
4. Switch manually: masterha_master_switch --conf=/etc/masterha/app1/app1.cnf --master_state=dead --dead_master_host=10.100.251.222 --dead_master_port=3306
Continuing from step 4, the command below also specifies new_master_host and new_master_port. If new_master_host is omitted, the new master is chosen according to app1.cnf; new_master_port defaults to 3306.
# masterha_master_switch --conf=/etc/masterha/app1/app1.cnf --master_state=dead --dead_master_host=10.100.251.222 --dead_master_port=3306 --new_master_host=10.100.251.221 --new_master_port=3306
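Scenario 3
Scheduled (Online) Master Switch (manual online master switch)
The master and slaves are healthy and MHA is running. During maintenance (for example replacing host hardware, or adding/dropping columns or primary keys), the master is switched online to another host by hand.
1. If MHA is running, stop it first
masterha_stop --conf=/etc/masterha/app1/app1.cnf
2. Check the current MHA configuration
masterha_check_repl --conf=/etc/masterha/app1/app1.cnf
3. Switch manually
masterha_master_switch --master_state=alive --conf=/etc/masterha/app1/app1.cnf --orig_master_is_new_slave --running_updates_limit=3600 --interactive=0
Note: in this mode masterha_master_switch calls master_ip_online_change_script, not master_ip_failover_script. Starting and stopping the VIP can be put in that script. If no VIP script is configured, switch the VIP by hand as follows: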
ssh root@$orig_master_ip /sbin/ifconfig eth0:1 down
ssh root@$new_master_ip /sbin/ifconfig eth0:1 10.100.251.228/24
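The command below also specifies new_master_host and new_master_port. If new_master_host is omitted, the new master is chosen according to app1.cnf; new_master_port defaults to 3306.
masterha_master_switch --master_state=alive --conf=/etc/masterha/app1/app1.cnf --orig_master_is_new_slave --running_updates_limit=3600 --interactive=0 --new_master_host=10.100.251.222 --new_master_port=3306
Parameter --running_updates_limit: if a write on the current master has been running longer than this value, or Seconds_Behind_Master on any slave exceeds it, the master switch is abandoned automatically. The default is 1 second.
Parameter --interactive=0: non-interactive switch. Recommended, because it speeds up the switch considerably; when the database is not busy the switch completes in about 3 seconds.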
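The --running_updates_limit rule described above can be checked in advance on each slave. The following is an illustrative sketch and not MHA's own logic: lag_within_limit is a hypothetical helper that reads show slave status\G output on standard input and compares Seconds_Behind_Master with a limit in seconds. A NULL or missing value is treated as a failure.

```shell
#!/bin/bash
# Hypothetical pre-check: return 0 if Seconds_Behind_Master in the
# SHOW SLAVE STATUS\G text on stdin is a number not greater than $1.
lag_within_limit() {
    awk -F': *' -v limit="$1" '
        /^ *Seconds_Behind_Master:/ {
            lag = $2
            sub(/[ \t\r]+$/, "", lag)
            found = 1
        }
        END { exit !(found && (lag ~ /^[0-9]+$/) && ((lag + 0) <= (limit + 0))) }
    '
}

# Example (credentials are placeholders; 3600 matches the command above):
#   mysql -uroot -p -e 'show slave status\G' | lag_within_limit 3600 || echo "slave lags too much"
```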
6. Purge relay logs periodically
Because relay_log_purge=0 was set on every slave in step 1, the slaves must purge their relay logs periodically. It is recommended to stagger the purge times across the slaves.
crontab -e
0 5 * * * /usr/bin/purge_relay_logs --user=root --password=123456 --port=3306 --disable_relay_log_purge >> /var/lib/mysql/purge_relay.log 2>&1
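One way to stagger the purge times is to derive each slave's cron entry from its index. The sketch below is an assumption-laden illustration: purge_cron_line is a hypothetical helper, the 05:00 start time and the purge_relay_logs command line are copied from the entry above, and the default 20-minute spacing is an arbitrary choice.

```shell
#!/bin/bash
# Hypothetical helper: print a crontab line for the slave with 0-based
# index $1, starting at 05:00 and shifted by $2 minutes per slave
# (default 20, an assumed value).
purge_cron_line() {
    idx=$1
    step=${2:-20}
    total=$(( idx * step ))
    hour=$(( (5 + total / 60) % 24 ))
    minute=$(( total % 60 ))
    printf '%d %d * * * /usr/bin/purge_relay_logs --user=root --password=123456 --port=3306 --disable_relay_log_purge >> /var/lib/mysql/purge_relay.log 2>&1\n' "$minute" "$hour"
}

# Example: entry for the second slave (index 1)
#   purge_cron_line 1
```

The printed line can be pasted into crontab -e on the corresponding slave.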
7. Appendix scripts:
####### master_ip_failover:
#!/usr/bin/env perl
use strict;
use warnings FATAL =>'all';
use Getopt::Long;
my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
my $vip = '10.100.251.228/24'; # Virtual IP
my $key = "1";
my $ssh_start_vip = "/sbin/ifconfig eth1:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig eth1:$key down";
my $exit_code = 0;
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);
exit &main();
sub main {
#print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
# $orig_master_host, $orig_master_ip, $orig_master_port are passed.
# If you manage master ip address at global catalog database,
# invalidate orig_master_ip here.
my $exit_code = 1;
eval {
print "\n\n\n***************************************************************\n";
print "Disabling the VIP - $vip on old master: $orig_master_host\n";
print "***************************************************************\n\n\n\n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
# all arguments are passed.
# If you manage master ip address at global catalog database,
# activate new_master_ip here.
# You can also grant write access (create user, set read_only=0, etc) here.
my $exit_code = 10;
eval {
print "\n\n\n***************************************************************\n";
print "Enabling the VIP - $vip on new master: $new_master_host \n";
print "***************************************************************\n\n\n\n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
`ssh $ssh_user\@$orig_master_host \" $ssh_start_vip \"`;
exit 0;
}
else {
&usage();
exit 1;
}
}
# A simple system call that enable the VIP on the new master
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
##########master_ip_online_change
#!/usr/bin/env perl
use strict;
use warnings FATAL =>'all';
use Getopt::Long;
my $vip = '10.100.251.228/24'; # Virtual IP
my $key = "1";
my $ssh_start_vip = "/sbin/ifconfig eth1:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig eth1:$key down";
my $exit_code = 0;
my (
$command, $orig_master_is_new_slave, $orig_master_host,
$orig_master_ip, $orig_master_port, $orig_master_user,
$orig_master_password, $orig_master_ssh_user, $new_master_host,
$new_master_ip, $new_master_port, $new_master_user,
$new_master_password, $new_master_ssh_user,
);
GetOptions(
'command=s' => \$command,
'orig_master_is_new_slave' => \$orig_master_is_new_slave,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'orig_master_user=s' => \$orig_master_user,
'orig_master_password=s' => \$orig_master_password,
'orig_master_ssh_user=s' => \$orig_master_ssh_user,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
'new_master_user=s' => \$new_master_user,
'new_master_password=s' => \$new_master_password,
'new_master_ssh_user=s' => \$new_master_ssh_user,
);
exit &main();
sub main {
#print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
# $orig_master_host, $orig_master_ip, $orig_master_port are passed.
# If you manage master ip address at global catalog database,
# invalidate orig_master_ip here.
my $exit_code = 1;
eval {
print "\n\n\n***************************************************************\n";
print "Disabling the VIP - $vip on old master: $orig_master_host\n";
print "***************************************************************\n\n\n\n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
# all arguments are passed.
# If you manage master ip address at global catalog database,
# activate new_master_ip here.
# You can also grant write access (create user, set read_only=0, etc) here.
my $exit_code = 10;
eval {
print "\n\n\n***************************************************************\n";
print "Enabling the VIP - $vip on new master: $new_master_host \n";
print "***************************************************************\n\n\n\n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
`ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_start_vip \"`;
exit 0;
}
else {
&usage();
exit 1;
}
}
# A simple system call that enable the VIP on the new master
sub start_vip() {
`ssh $new_master_ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_online_change --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
########### send_report
#!/bin/bash
source /root/.bash_profile
orig_master_host=`echo "$1" | awk -F = '{print $2}'`
new_master_host=`echo "$2" | awk -F = '{print $2}'`
new_slave_hosts=`echo "$3" | awk -F = '{print $2}'`
subject=`echo "$4" | awk -F = '{print $2}'`
body=`echo "$5" | awk -F = '{print $2}'`
# Check whether the end of the log contains "successfully"; if it does, the switch succeeded. A mail is sent either way.
tac /etc/masterha/app1/manager.log | sed -n 2p | grep 'successfully' > /dev/null
if [ $? -eq 0 ]
then
echo -e "MHA $subject master switch succeeded\n master:$orig_master_host --> $new_master_host \n $body \n current slaves: $new_slave_hosts" | mutt -s "MySQL instance down, MHA $subject switch succeeded" -- [email protected]
else
echo -e "MHA $subject master switch failed\n master:$orig_master_host --> $new_master_host \n $body" | mutt -s "MySQL instance down, MHA $subject switch failed" -- [email protected]
fi
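The send_report script above splits each argument on "=" with awk and keeps only the second field, so a value that itself contains "=" would be cut short. As a hedged alternative sketch, shell parameter expansion keeps everything after the first "=". The helper name arg_value is hypothetical, and the --name=value argument shape is inferred from how the script parses $1 to $5.

```shell
#!/bin/bash
# Hypothetical helper: print the value part of a "--name=value" argument,
# keeping any further "=" characters that belong to the value.
arg_value() {
    printf '%s\n' "${1#*=}"
}

# Example, mirroring the assignments in send_report:
#   orig_master_host=$(arg_value "$1")
#   subject=$(arg_value "$4")
```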
環境說明:
Master:10.100.251.221:3306
Slave1:10.100.251.222:3306 (候選master)
Slave2:10.100.251.223:3306 (+MHA Manager)
vip:10.100.251.228
1、配置主從同步
一、主服務器
1.1、創建一個複製用戶,具有replication slave 權限。
mysql> grant replication slave on *.* to 'slave001'@'%' identified by 'slave001';
1.2、編輯my.cnf文件
vi /etc/my.cnf
添加:
server-id=221
1.3、並開啓log-bin二進制日誌文件
log-bin=mysql-bin
1.4、啓動mysql數據庫
service mysql start
1.5、得到binlog日誌文件名和偏移量
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000007 | 551 | | | |
+------------------+----------+--------------+------------------+-------------------+
1.6、備份要同步的數據庫
mysqldump -uroot -p --lock-tables --events --triggers --routines --flush-logs --master-data=2 --databases repdb > /tmp/backup/db.sql
或:
mysqldump --master-data=2 --single-transaction --events -R --flush-logs --triggers --databases repdb > all.sql
1.7、查看position
[root@mytest01 backup]# grep MASTER /tmp/backup/db.sql
-- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000009', MASTER_LOG_POS=120;
[root@mytest01 data]#
二、從服務器
2.0、創建一個複製用戶,具有replication slave 權限。
mysql> grant replication slave on *.* to 'slave001'@'%' identified by 'slave001';
2.1、編輯my.cnf文件
vi /etc/my.cnf
添加
server-id=222
log-bin = mysql-bin
log-bin-index = mysql-bin.index
read_only=1
relay_log_purge=0 #(一主一叢不需要此項,兩從及以上建議開次參數,防止切換爲成主庫的從庫自動刪除中繼日誌後,無法給其他從庫應用這部分日誌)
注:需要把默認的server-id=1去掉
不要嘗試把master配置屬性寫在my.cnf 中,5.1.7以後,mysql已經不支持這樣做了
2.2、啓動從數據庫
service mysql start
2.3、把生產的數據導進從服務器:
mysql -uroot -proot123</tmp/backup/db.sql
2.3、對從數據庫進行相應設置
mysql> change master to
master_host='10.100.251.221',
master_user='slave001',
master_password='slave001',
master_log_file='mysql-bin.000009',
master_log_pos=120;
2.4、啓動從服務器slave線程
mysql>start slave;
執行show processlist命令顯示以下進程:
mysql>show processlist\G
*************************** 2. row ***************************
Id: 2
User: system user
Host:
db: NULL
Command: Connect
Time: 2579
State: Has read all relay log; waiting for the slave I/O thread to update it
Info: NULL表示slave已經連接上master,開始接受並執行日誌
2.5、查看slave線程狀態
mysql>show slave status;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.100.251.221
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.0000010
Read_Master_Log_Pos: 106
Relay_Log_File: centos-relay-bin.000002
Relay_Log_Pos: 529
Relay_Master_Log_File: mysql-bin.0000010
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 106
Relay_Log_Space: 830
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
1 row in set (0.00 sec)
驗證是否配置正確
在從服務器上執行
mysql> show slave status\G
Waiting for master to send event
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
如以上二行同時爲Yes 說明配置成功
PS:show slave status\G 後不要添加 ; 號, 不然會出 ERROR
2、配置ssh公鑰互信
一、本例中manager節點和node節點ip
manager:10.100.251.223
node1:10.100.251.221
node2:10.100.251.222
node3:10.100.251.223
注:manager節點可以安裝獨立的服務器上,本例爲了節省機器,manager安裝在了主庫(10.100.251.223)上.
二. 配置manager和node各節點間的root用戶的ssh公鑰互信
配置ssh免密碼連入
注意要以root用戶登錄,在root用戶的主目錄下進行操作。--一臺服務器上操作就可以
# ssh-keygen
# ssh-copy-id -p 22 [email protected]
# ssh-copy-id -p 22 [email protected]
# ssh-copy-id -p 22 [email protected]
3、安裝 MHA 包
一. 下載MHA安裝包
下載網址:
https://code.google.com/p/mysql-master-ha/wiki/Downloads?tm=2
下載rpm包或tar all均可,建議用rpm包,因爲安裝簡單。
二. 安裝 MHA Node
在manager和node的所有節點均需安裝MHA Node。
rpm安裝方式:
# yum install perl-DBD-MySQL
# rpm -ivh mha4mysql-node-0.56-0.el5.noarch.rpm
如果有依賴:
[[email protected] ~]# cat install.sh
#!/bin/bash
wget http://xrl.us/cpanm --no-check-certificate
mv cpanm /usr/bin
chmod 755 /usr/bin/cpanm
cat > /root/list << EOF
install DBD::mysql
EOF
for package in `cat /root/list`
do
cpanm $package
done
tar all的安裝方式:
tar -zxf mha4mysql-node-0.56.tar.gz
cd mha4mysql-node-0.56
perl Makefile.PL
make
make install
三. 安裝 MHA Manager
# yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes -y
# rpm -ivh mha4mysql-node-0.56-0.el5.noarch.rpm
# rpm -ivh mha4mysql-manager-0.56-0.el5.noarch.rpm
注:
上面有些包需要先安裝附加軟件包(EPEL)才能使用yum安裝,
MHA Manager另一種安裝方式:
MHA Manager 0.56 tar all源碼安裝
tar -zxf mha4mysql-manager-0.56.tar.gz
cd mha4mysql-manager-0.56
perl Makefile.PL
make
make install
4、MHA Manager 端配置
各參數含義:https://code.google.com/p/mysql-master-ha/wiki/Parameters#no_master
MHA Manager端配置,建議使用root操作系統用戶執行,因爲涉及到vip 啓停。
#mkdir -p /etc/masterha/app1
#vi /etc/masterha/app1/app1.cnf
[server default]
manager_workdir=/etc/masterha/app1
manager_log=/etc/masterha/app1/manager.log
user=root
password=root123
ssh_user=root
repl_user=slave001
repl_password=slave001
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 10.100.251.222 -s 10.100.251.223 --user=root --master_host=mytest01 --master_ip=10.100.251.221 --master_port=3306
ping_interval=3
master_ip_failover_script=/etc/masterha/app1/master_ip_failover #master failover時執行,不配置vip時不用配
#shutdown_script=/etc/masterha/power_manager
report_script=/etc/masterha/app1/send_report #master failover時執行,可選
master_ip_online_change_script=/etc/masterha/app1/master_ip_online_change #masterswitchover時執行,不配置vip時不用配
[server1]
hostname=10.100.251.221
port=3306
master_binlog_dir=/usr/local/mysql/data/
candidate_master=1
check_repl_delay=0
[server2]
hostname=10.100.251.222
port=3306
master_binlog_dir=/usr/local/mysql/data/
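candidate_master=1
# If the candidate master lags by more than 100 MB of relay logs, MHA will not promote it and the failover cannot succeed; check_repl_delay=0 makes MHA ignore the delay.
check_repl_delay=0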
[server3]
hostname=10.100.251.223
port=3306
master_binlog_dir=/usr/local/mysql/data/
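# Without this, MHA will not fail over while this node is down; with ignore_fail=1 failover still works when this slave has failed.
ignore_fail=1
# Never promote this host to master.
no_master=1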
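To double-check which servers the configuration file actually declares, a small parser is enough. This is a sketch only: list_mha_servers is a hypothetical helper, and it assumes the simple key=value layout shown above with one [serverN] header per host and a default port of 3306.

```shell
#!/bin/bash
# Print "section hostname port" for every [serverN] block in an MHA config file.
# (Hypothetical helper; assumes plain key=value lines as in app1.cnf above.)
list_mha_servers() {
    awk -F= '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        # Print the block collected so far, then reset for the next one.
        function emit()  { if (host != "") print section, host, (port == "" ? 3306 : port); host = ""; port = "" }
        /^\[/ {
            emit()
            section = ($0 ~ /^\[server[0-9]+\]/) ? trim($0) : ""
            gsub(/\[|\]/, "", section)
            next
        }
        section != "" && trim($1) == "hostname" { host = trim($2) }
        section != "" && trim($1) == "port"     { port = trim($2) }
        END { emit() }
    ' "$1"
}

# Example: list_mha_servers /etc/masterha/app1/app1.cnf
```

For the file above this should list server1, server2 and server3 with their IP addresses, which makes a mistyped hostname easy to spot before running the checks.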
II. Check the SSH configuration
masterha_check_ssh --conf=/etc/masterha/app1/app1.cnf
#masterha_check_ssh --conf=/etc/masterha/app1/app1.cnf
Mon May 22 11:27:51 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon May 22 11:27:51 2017 - [info] Reading application default configuration from /etc/masterha/app1/app1.cnf..
Mon May 22 11:27:51 2017 - [info] Reading server configuration from /etc/masterha/app1/app1.cnf..
Mon May 22 11:27:51 2017 - [info] Starting SSH connection tests..
Mon May 22 11:27:52 2017 - [debug]
Mon May 22 11:27:51 2017 - [debug] Connecting via SSH from root@10.100.251.221(10.100.251.221:22) to root@10.100.251.222(10.100.251.222:22)..
Mon May 22 11:27:51 2017 - [debug] ok.
Mon May 22 11:27:51 2017 - [debug] Connecting via SSH from root@10.100.251.221(10.100.251.221:22) to root@10.100.251.223(10.100.251.223:22)..
Mon May 22 11:27:52 2017 - [debug] ok.
Mon May 22 11:27:52 2017 - [debug]
Mon May 22 11:27:51 2017 - [debug] Connecting via SSH from root@10.100.251.222(10.100.251.222:22) to root@10.100.251.221(10.100.251.221:22)..
Mon May 22 11:27:51 2017 - [debug] ok.
Mon May 22 11:27:51 2017 - [debug] Connecting via SSH from root@10.100.251.222(10.100.251.222:22) to root@10.100.251.223(10.100.251.223:22)..
Mon May 22 11:27:52 2017 - [debug] ok.
Mon May 22 11:27:53 2017 - [debug]
Mon May 22 11:27:52 2017 - [debug] Connecting via SSH from root@10.100.251.223(10.100.251.223:22) to root@10.100.251.221(10.100.251.221:22)..
Mon May 22 11:27:52 2017 - [debug] ok.
Mon May 22 11:27:52 2017 - [debug] Connecting via SSH from root@10.100.251.223(10.100.251.223:22) to root@10.100.251.222(10.100.251.222:22)..
Mon May 22 11:27:53 2017 - [debug] ok.
Mon May 22 11:27:53 2017 - [info] All SSH connection tests passed successfully.
Success!
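The log above shows each of the three hosts connecting to the other two, so passwordless SSH is needed for every ordered pair. A sketch that prints the key-distribution commands, assuming a key pair already exists on each host (for example from ssh-keygen); print_ssh_key_commands is a hypothetical helper name:

```shell
#!/bin/bash
# Print one ssh-copy-id command per ordered pair of distinct hosts.
# Each line names the host on which the command should be run.
# (Hypothetical helper; it only prints, it does not connect anywhere.)
print_ssh_key_commands() {
    local user=$1; shift
    local src dst
    for src in "$@"; do
        for dst in "$@"; do
            [ "$src" = "$dst" ] && continue
            echo "on $src: ssh-copy-id $user@$dst"
        done
    done
}

# Example:
#   print_ssh_key_commands root 10.100.251.221 10.100.251.222 10.100.251.223
```

With three hosts this prints six lines, matching the six connection tests in the masterha_check_ssh output.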
3. Check the MHA configuration
masterha_check_repl --conf=/etc/masterha/app1/app1.cnf
[root@mytest01 data]# masterha_check_repl --conf=/etc/masterha/app1/app1.cnf
Mon May 22 13:23:53 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon May 22 13:23:53 2017 - [info] Reading application default configuration from /etc/masterha/app1/app1.cnf..
Mon May 22 13:23:53 2017 - [info] Reading server configuration from /etc/masterha/app1/app1.cnf..
Mon May 22 13:23:53 2017 - [info] MHA::MasterMonitor version 0.56.
Mon May 22 13:23:54 2017 - [info] GTID failover mode = 0
Mon May 22 13:23:54 2017 - [info] Dead Servers:
Mon May 22 13:23:54 2017 - [info] Alive Servers:
Mon May 22 13:23:54 2017 - [info] 10.100.251.221(10.100.251.221:3306)
Mon May 22 13:23:54 2017 - [info] 10.100.251.222(10.100.251.222:3306)
Mon May 22 13:23:54 2017 - [info] 10.100.251.223(10.100.251.223:3306)
Mon May 22 13:23:54 2017 - [info] Alive Slaves:
Mon May 22 13:23:54 2017 - [info] 10.100.251.222(10.100.251.222:3306) Version=5.6.28-log (oldest major version between slaves) log-bin:enabled
Mon May 22 13:23:54 2017 - [info] Replicating from 10.100.251.221(10.100.251.221:3306)
Mon May 22 13:23:54 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Mon May 22 13:23:54 2017 - [info] 10.100.251.223(10.100.251.223:3306) Version=5.6.28-log (oldest major version between slaves) log-bin:enabled
Mon May 22 13:23:54 2017 - [info] Replicating from 10.100.251.221(10.100.251.221:3306)
Mon May 22 13:23:54 2017 - [info] Not candidate for the new Master (no_master is set)
Mon May 22 13:23:54 2017 - [info] Current Alive Master: 10.100.251.221(10.100.251.221:3306)
Mon May 22 13:23:54 2017 - [info] Checking slave configurations..
Mon May 22 13:23:54 2017 - [info] Checking replication filtering settings..
Mon May 22 13:23:54 2017 - [info] binlog_do_db= , binlog_ignore_db=
Mon May 22 13:23:54 2017 - [info] Replication filtering check ok.
Mon May 22 13:23:54 2017 - [info] GTID (with auto-pos) is not supported
Mon May 22 13:23:54 2017 - [info] Starting SSH connection tests..
Mon May 22 13:23:56 2017 - [info] All SSH connection tests passed successfully.
Mon May 22 13:23:56 2017 - [info] Checking MHA Node version..
Mon May 22 13:23:56 2017 - [info] Version check ok.
Mon May 22 13:23:56 2017 - [info] Checking SSH publickey authentication settings on the current master..
Mon May 22 13:23:57 2017 - [info] HealthCheck: SSH to 10.100.251.221 is reachable.
Mon May 22 13:23:57 2017 - [info] Master MHA Node version is 0.56.
Mon May 22 13:23:57 2017 - [info] Checking recovery script configurations on 10.100.251.221(10.100.251.221:3306)..
Mon May 22 13:23:57 2017 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/usr/local/mysql/data --output_file=/var/tmp/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.000009
Mon May 22 13:23:57 2017 - [info] Connecting to root@10.100.251.221(10.100.251.221:22)..
Creating /var/tmp if not exists.. ok.
Checking output directory is accessible or not..
ok.
Binlog found at /usr/local/mysql/data, up to mysql-bin.000009
Mon May 22 13:23:57 2017 - [info] Binlog setting check done.
Mon May 22 13:23:57 2017 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Mon May 22 13:23:57 2017 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=10.100.251.222 --slave_ip=10.100.251.222 --slave_port=3306 --workdir=/var/tmp --target_version=5.6.28-log --manager_version=0.56 --relay_log_info=/usr/local/mysql/data/relay-log.info --relay_dir=/usr/local/mysql/data/ --slave_pass=xxx
Mon May 22 13:23:57 2017 - [info] Connecting to root@10.100.251.222(10.100.251.222:22)..
Checking slave recovery environment settings..
Opening /usr/local/mysql/data/relay-log.info ... ok.
Relay log found at /usr/local/mysql/data, up to mysql-relay-bin.000003
Temporary relay log file is /usr/local/mysql/data/mysql-relay-bin.000003
Testing mysql connection and privileges.. done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Mon May 22 13:23:58 2017 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=10.100.251.223 --slave_ip=10.100.251.223 --slave_port=3306 --workdir=/var/tmp --target_version=5.6.28-log --manager_version=0.56 --relay_log_info=/usr/local/mysql/data/relay-log.info --relay_dir=/usr/local/mysql/data/ --slave_pass=xxx
Mon May 22 13:23:58 2017 - [info] Connecting to root@10.100.251.223(10.100.251.223:22)..
Checking slave recovery environment settings..
Opening /usr/local/mysql/data/relay-log.info ... ok.
Relay log found at /usr/local/mysql/data, up to mysql-relay-bin.000007
Temporary relay log file is /usr/local/mysql/data/mysql-relay-bin.000007
Testing mysql connection and privileges.. done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Mon May 22 13:23:58 2017 - [info] Slaves settings check done.
Mon May 22 13:23:58 2017 - [info]
10.100.251.221(10.100.251.221:3306) (current master)
+--10.100.251.222(10.100.251.222:3306)
+--10.100.251.223(10.100.251.223:3306)
Mon May 22 13:23:58 2017 - [info] Checking replication health on 10.100.251.222..
Mon May 22 13:23:58 2017 - [info] ok.
Mon May 22 13:23:58 2017 - [info] Checking replication health on 10.100.251.223..
Mon May 22 13:23:58 2017 - [info] ok.
Mon May 22 13:23:58 2017 - [warning] master_ip_failover_script is not defined.
Mon May 22 13:23:58 2017 - [warning] shutdown_script is not defined.
Mon May 22 13:23:58 2017 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
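When the check is run from a script, the final line of the output is the part that matters. A minimal sketch that tests saved or piped output for the success line shown above; check_repl_ok is a hypothetical helper and relies only on that message text:

```shell
#!/bin/bash
# Read masterha_check_repl output on stdin and succeed only if the last
# non-empty line is the success message from the log above.
# (Hypothetical helper; it inspects text and does not run MHA itself.)
check_repl_ok() {
    awk 'NF { last = $0 }
         END { exit (last == "MySQL Replication Health is OK." ? 0 : 1) }'
}

# Example:
#   masterha_check_repl --conf=/etc/masterha/app1/app1.cnf 2>&1 | check_repl_ok && echo "safe to start MHA"
```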
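4. Main day-to-day operations on the MHA Manager
① Check whether the following files exist and delete them if they do.
After a failover the MHA Manager service stops automatically and creates the file app1.failover.complete in the manager_workdir directory. Make sure this file is gone before starting MHA again.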
# ll /etc/masterha/app1/app1.failover.complete
# ll /etc/masterha/app1/app1.failover.error
② Check the current MHA configuration:
# masterha_check_repl --conf=/etc/masterha/app1/app1.cnf
③ Start MHA:
# nohup masterha_manager --conf=/etc/masterha/app1/app1.cnf > /etc/masterha/app1/mha_manager.log 2>&1 &
By default MHA will not start while a slave node is down. With --ignore_fail_on_start it starts even if slaves marked ignore_fail are down:
# nohup masterha_manager --conf=/etc/masterha/app1/app1.cnf --ignore_fail_on_start >/etc/masterha/app1/mha_manager.log 2>&1 &
④ Check the status:
# masterha_check_status --conf=/etc/masterha/app1/app1.cnf
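⑤ Check the log:
# tail -f /etc/masterha/app1/manager.log
⑥ Follow-up work after a failover
After the master has been switched, repair the old master so that it becomes a new slave, then repeat the steps above. If the old master's data files are intact, the most recent CHANGE MASTER statement can be found like this: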
# grep "CHANGE MASTER TO MASTER" /etc/masterha/app1/manager.log | tail -1
CHANGE MASTER TO MASTER_HOST='10.100.251.222',MASTER_PORT=3306, MASTER_LOG_FILE='master-bin.000001', MASTER_LOG_POS=120,MASTER_USER='slave001', MASTER_PASSWORD='xxx';
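-- Run the statement on the old master with the real replication password in place of 'xxx', then start the new slave
mysql> start slave;
mysql> show slave status\G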
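Step ① in the list above is easy to forget, so it can be wrapped in a small function. A sketch under the assumption that the marker files are named <app>.failover.complete and <app>.failover.error inside manager_workdir, as in this setup; clear_failover_markers is a hypothetical helper name:

```shell
#!/bin/bash
# Remove leftover failover marker files from the MHA working directory.
# $1 = manager_workdir, $2 = application name (app1 in this document).
# (Hypothetical helper; prints each file it removes.)
clear_failover_markers() {
    local workdir=$1 app=$2 f
    for f in "$workdir/$app.failover.complete" "$workdir/$app.failover.error"; do
        if [ -e "$f" ]; then
            rm -f -- "$f" && echo "removed $f"
        fi
    done
}

# Example: clear_failover_markers /etc/masterha/app1 app1
```

Running it when no marker exists prints nothing, so it is safe to call before every start.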
5. Failover scenario tests
Automatic failover test
Scenario 1:
The master dies while MHA is running; the candidate master (a slave) is automatically promoted to master.
--shutdown mysql master node
# service mysql stop
--check new master node
mysql> show master status\G
*************************** 1. row ***************************
File: master-bin.000001
Position: 330
--check slave node
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.100.251.222
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: binlog.000010
Read_Master_Log_Pos: 360
Relay_Log_File: mariadbtest03-relay-bin.000002
Relay_Log_Pos: 534
Relay_Master_Log_File: binlog.000010
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
--check manager.log
[root@mariadbtest03 ~]# tail -100f /etc/masterha/app1/manager.log
----- Failover Report -----
app1: MySQL Master failover mariadbtest01(10.100.251.221:3306) to mariadbtest02(10.100.251.222:3306) succeeded
Master mariadbtest01(10.100.251.221:3306) is down!
Check MHA Manager logs at mariadbtest03 for details.
Started automated(non-interactive) failover.
Invalidated master IP address on mariadbtest01(10.100.251.221:3306)
The latest slave mariadbtest02(10.100.251.222:3306) has all relay logs for recovery.
Selected mariadbtest02(10.100.251.222:3306) as a new master.
mariadbtest02(10.100.251.222:3306): OK: Applying all logs succeeded.
mariadbtest02(10.100.251.222:3306): OK: Activated master IP address.
mariadbtest03(10.100.251.223:3306): This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
mariadbtest03(10.100.251.223:3306): OK: Applying all logs succeeded. Slave started, replicating from mariadbtest02(10.100.251.222:3306)
mariadbtest02(10.100.251.222:3306): Resetting slave info succeeded.
Master failover to mariadbtest02(10.100.251.222:3306) completed successfully.
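The last line of the report names the new master, which is handy when a script has to act on the result. A sketch that pulls it out of the log, assuming the report line has exactly the form shown above; new_master_from_report is a hypothetical helper:

```shell
#!/bin/bash
# Print the host(ip:port) named in the most recent
# "Master failover to ... completed successfully." line of a manager log.
# (Hypothetical helper; prints nothing if no such line exists.)
new_master_from_report() {
    sed -n 's/^Master failover to \(.*\) completed successfully\.$/\1/p' "$1" | tail -n 1
}

# Example: new_master_from_report /etc/masterha/app1/manager.log
```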
-- Finally, repair the old master and turn it into a new slave
# grep "CHANGE MASTER TO MASTER" /etc/masterha/app1/manager.log | tail -1
CHANGE MASTER TO MASTER_HOST='10.100.251.222', MASTER_PORT=3306, MASTER_LOG_FILE='binlog.000010', MASTER_LOG_POS=360, MASTER_USER='slave001', MASTER_PASSWORD='xxx';
mysql>CHANGE MASTER TO MASTER_HOST='10.100.251.222', MASTER_PORT=3306, MASTER_LOG_FILE='binlog.000010', MASTER_LOG_POS=360, MASTER_USER='slave001', MASTER_PASSWORD='slave001';
Query OK, 0 rows affected, 2 warnings (0.17 sec)
mysql> start slave;
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.100.251.222
Master_User: slave001
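The log masks the replication password as 'xxx', so the statement has to be edited before it is run, as was done by hand above. A sketch that automates the substitution; change_master_statement is a hypothetical helper, and it assumes the password contains no / or & characters because it is spliced into a sed expression:

```shell
#!/bin/bash
# Print the last CHANGE MASTER statement from a manager log with the masked
# password replaced by the real one.
# $1 = log file, $2 = replication password (must not contain / or &).
# (Hypothetical helper; it prints the statement and does not execute it.)
change_master_statement() {
    local log=$1 password=$2
    sed -n 's/.*\(CHANGE MASTER TO MASTER.*\)/\1/p' "$log" |
        tail -n 1 |
        sed "s/MASTER_PASSWORD='xxx'/MASTER_PASSWORD='$password'/"
}

# Example: change_master_statement /etc/masterha/app1/manager.log slave001
```

The output can be pasted into the mysql client on the old master, followed by start slave.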
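Manual failover
Scenario 2: the master is dead but MHA was not running at the time, so the failover is done by hand.
1. Check whether the following files exist and delete them if they do.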
# ll /etc/masterha/app1/app1.failover.complete
# ll /etc/masterha/app1/app1.failover.error
2. If MHA is running, stop it first: masterha_stop --conf=/etc/masterha/app1/app1.cnf
3. Check the current MHA configuration: masterha_check_repl --conf=/etc/masterha/app1/app1.cnf
4. Switch manually: masterha_master_switch --conf=/etc/masterha/app1/app1.cnf --master_state=dead --dead_master_host=10.100.251.222 --dead_master_port=3306
# Continuing from the above
The command below specifies new_master_host and new_master_port explicitly. If new_master_host is omitted, the new master is chosen according to app1.cnf; new_master_port defaults to 3306.
# masterha_master_switch --conf=/etc/masterha/app1/app1.cnf --master_state=dead --dead_master_host=10.100.251.222 --dead_master_port=3306 --new_master_host=10.100.251.221 --new_master_port=3306
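The two command variants above differ only in the optional new-master arguments. A sketch that builds the command line either way, using only the options shown in this section and a fixed port of 3306; dead_master_switch_cmd is a hypothetical helper that prints the command rather than running it:

```shell
#!/bin/bash
# Print the masterha_master_switch command for a dead master.
# $1 = config file, $2 = dead master host, $3 = new master host (optional).
# (Hypothetical helper; review the printed command before running it.)
dead_master_switch_cmd() {
    local conf=$1 dead_host=$2 new_host=${3:-}
    local cmd="masterha_master_switch --conf=$conf --master_state=dead --dead_master_host=$dead_host --dead_master_port=3306"
    if [ -n "$new_host" ]; then
        # Pin the new master instead of letting MHA choose from the config.
        cmd="$cmd --new_master_host=$new_host --new_master_port=3306"
    fi
    echo "$cmd"
}

# Example: dead_master_switch_cmd /etc/masterha/app1/app1.cnf 10.100.251.222 10.100.251.221
```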
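Scenario 3
Scheduled (online) master switch, done by hand
The master and slaves are healthy and MHA is running. During maintenance (for example replacing host hardware, or adding/dropping columns or primary keys) the master is switched online to another host by hand.
1. If MHA is running, stop it first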
masterha_stop --conf=/etc/masterha/app1/app1.cnf
2. Check the current MHA configuration
masterha_check_repl --conf=/etc/masterha/app1/app1.cnf
3. Switch manually
masterha_master_switch --master_state=alive --conf=/etc/masterha/app1/app1.cnf --orig_master_is_new_slave --running_updates_limit=3600 --interactive=0
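Note: an online switch with masterha_master_switch calls master_ip_online_change_script, not master_ip_failover_script, so the VIP start and stop logic can go into that script. If no VIP handling is configured there, move the VIP by hand as follows: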
ssh root@$orig_master_ip /sbin/ifconfig eth0:1 down
ssh root@$new_master_ip /sbin/ifconfig eth0:1 10.100.251.228/24
The command below specifies new_master_host and new_master_port explicitly. If new_master_host is omitted, the new master is chosen according to app1.cnf; new_master_port defaults to 3306.
masterha_master_switch --master_state=alive --conf=/etc/masterha/app1/app1.cnf --orig_master_is_new_slave --running_updates_limit=3600 --interactive=0 --new_master_host=10.100.251.222 --new_master_port=3306
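--running_updates_limit: if the current master is running a write that has taken longer than this value, or Seconds_Behind_Master on any slave exceeds it, the master switch is abandoned automatically. The default is 1 second.
--interactive=0: non-interactive switch. Recommended, because it speeds up the switch considerably; on an idle database the switch completes in roughly 3 seconds.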
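The --running_updates_limit rule described above can be written as a two-part comparison. This is only an illustration of the rule as stated in this section, not MHA's own code; online_switch_allowed is a hypothetical helper and all three arguments are whole seconds:

```shell
#!/bin/bash
# Succeed when an online switch would be allowed under the rule above.
# $1 = running_updates_limit
# $2 = runtime of the longest write currently executing on the master
# $3 = largest Seconds_Behind_Master across all slaves
# (Hypothetical helper illustrating the rule; not part of MHA.)
online_switch_allowed() {
    local limit=$1 longest_write=$2 max_lag=$3
    [ "$longest_write" -le "$limit" ] && [ "$max_lag" -le "$limit" ]
}

# Example: with the limit raised to 3600 a 10-second write does not block the switch.
#   online_switch_allowed 3600 10 0 && echo "switch may proceed"
```

With the default limit of 1 second, a 5-second write or a 2-second replication lag is enough to abort, which is why the commands above raise the limit to 3600.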
7. Purge relay logs periodically
Because relay_log_purge=0 was set on every slave in step 1, each slave has to purge its relay logs periodically. Stagger the purge times across the slaves.
crontab -e
0 5 * * * /usr/bin/purge_relay_logs --user=root --password=123456 --port=3306 --disable_relay_log_purge >> /var/lib/mysql/purge_relay.log 2>&1
8. Appendix: scripts
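To stagger the purge times, each slave can simply get the next hour. A sketch that prints one crontab line per slave, reusing the command from the entry above; staggered_purge_cron is a hypothetical helper, and the one-hour spacing is an arbitrary choice for illustration:

```shell
#!/bin/bash
# Print a crontab line for each slave, one hour apart, starting at $1.
# $1 = first hour (0-23); remaining arguments = slave hosts.
# (Hypothetical helper; install each printed line on the named host.)
staggered_purge_cron() {
    local hour=$1 host
    shift
    for host in "$@"; do
        echo "$host: 0 $hour * * * /usr/bin/purge_relay_logs --user=root --password=123456 --port=3306 --disable_relay_log_purge >> /var/lib/mysql/purge_relay.log 2>&1"
        hour=$(( (hour + 1) % 24 ))
    done
}

# Example: staggered_purge_cron 5 10.100.251.222 10.100.251.223
```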
#######master_ip_failover:
#!/usr/bin/env perl
use strict;
use warnings FATAL =>'all';
use Getopt::Long;
my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
my $vip = '10.100.251.228/24'; # Virtual IP
my $key = "1";
my $ssh_start_vip = "/sbin/ifconfig eth1:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig eth1:$key down";
my $exit_code = 0;
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);
exit &main();
sub main {
#print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
# $orig_master_host, $orig_master_ip, $orig_master_port are passed.
# If you manage master ip address at global catalog database,
# invalidate orig_master_ip here.
my $exit_code = 1;
eval {
print "\n\n\n***************************************************************\n";
print "Disabling the VIP - $vip on old master: $orig_master_host\n";
print "***************************************************************\n\n\n\n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
# all arguments are passed.
# If you manage master ip address at global catalog database,
# activate new_master_ip here.
# You can also grant write access (create user, set read_only=0, etc) here.
my $exit_code = 10;
eval {
print "\n\n\n***************************************************************\n";
print "Enabling the VIP - $vip on new master: $new_master_host \n";
print "***************************************************************\n\n\n\n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
`ssh $ssh_user\@$orig_master_host \" $ssh_start_vip \"`;
exit 0;
}
else {
&usage();
exit 1;
}
}
# A simple system call that enable the VIP on the new master
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
##########master_ip_online_change
#!/usr/bin/env perl
use strict;
use warnings FATAL =>'all';
use Getopt::Long;
my $vip = '10.100.251.228/24'; # Virtual IP
my $key = "1";
my $ssh_start_vip = "/sbin/ifconfig eth1:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig eth1:$key down";
my $exit_code = 0;
my (
$command, $orig_master_is_new_slave, $orig_master_host,
$orig_master_ip, $orig_master_port, $orig_master_user,
$orig_master_password, $orig_master_ssh_user, $new_master_host,
$new_master_ip, $new_master_port, $new_master_user,
$new_master_password, $new_master_ssh_user,
);
GetOptions(
'command=s' => \$command,
'orig_master_is_new_slave' => \$orig_master_is_new_slave,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'orig_master_user=s' => \$orig_master_user,
'orig_master_password=s' => \$orig_master_password,
'orig_master_ssh_user=s' => \$orig_master_ssh_user,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
'new_master_user=s' => \$new_master_user,
'new_master_password=s' => \$new_master_password,
'new_master_ssh_user=s' => \$new_master_ssh_user,
);
exit &main();
sub main {
#print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
# $orig_master_host, $orig_master_ip, $orig_master_port are passed.
# If you manage master ip address at global catalog database,
# invalidate orig_master_ip here.
my $exit_code = 1;
eval {
print "\n\n\n***************************************************************\n";
print "Disabling the VIP - $vip on old master: $orig_master_host\n";
print "***************************************************************\n\n\n\n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
# all arguments are passed.
# If you manage master ip address at global catalog database,
# activate new_master_ip here.
# You can also grant write access (create user, set read_only=0, etc) here.
my $exit_code = 10;
eval {
print "\n\n\n***************************************************************\n";
print "Enabling the VIP - $vip on new master: $new_master_host \n";
print "***************************************************************\n\n\n\n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
`ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_start_vip \"`;
exit 0;
}
else {
&usage();
exit 1;
}
}
# A simple system call that enable the VIP on the new master
sub start_vip() {
`ssh $new_master_ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_online_change --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
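After either script has run, it is worth confirming which host holds the VIP. The scripts themselves use ifconfig; the sketch below instead reads the output of ip -o -4 addr show, which is a different tool chosen here because it prints one address per line. has_vip is a hypothetical helper:

```shell
#!/bin/bash
# Succeed if the given address appears in "ip -o -4 addr show" output on stdin.
# $1 = VIP without prefix length, e.g. 10.100.251.228
# (Hypothetical helper; the trailing "/" in the pattern avoids matching
#  addresses that merely start with the same digits.)
has_vip() {
    grep -q -F " inet $1/"
}

# Example, run on each node:
#   ip -o -4 addr show | has_vip 10.100.251.228 && echo "VIP is on this host"
```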
###########send_report
#!/bin/bash
source /root/.bash_profile
orig_master_host=`echo "$1" | awk -F = '{print $2}'`
new_master_host=`echo "$2" | awk -F = '{print $2}'`
new_slave_hosts=`echo "$3" | awk -F = '{print $2}'`
subject=`echo "$4" | awk -F = '{print $2}'`
body=`echo "$5" | awk -F = '{print $2}'`
# Check whether the end of the log contains "successfully"; if it does, the failover succeeded. A mail is sent either way.
tac /etc/masterha/app1/manager.log | sed -n 2p | grep 'successfully' > /dev/null
if [ $? -eq 0 ]
then
echo -e "MHA $subject failover succeeded\n master:$orig_master_host --> $new_master_host \n $body \n current slaves: $new_slave_hosts" | mutt -s "MySQL instance down, MHA $subject failover succeeded" -- [email protected]
else
echo -e "MHA $subject failover failed\n master:$orig_master_host --> $new_master_host \n $body" | mutt -s "MySQL instance down, MHA $subject failover failed" -- [email protected]
fi
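The script above splits each argument with awk -F = and keeps only the second field, so a value that itself contains "=" is cut short. A sketch of an alternative using shell parameter expansion, which removes everything up to the first "=" and keeps the rest; arg_value is a hypothetical helper:

```shell
#!/bin/bash
# Print the value part of a --name=value argument, keeping any further "=".
# (Hypothetical helper; ${1#*=} strips the shortest prefix ending in "=".)
arg_value() {
    printf '%s\n' "${1#*=}"
}

# Example replacement for one of the assignments above:
#   subject=$(arg_value "$4")
```

For an argument such as --subject=app1: a=b this keeps "app1: a=b", where the awk form would return only "app1: a".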