Mysql5.5部署MHA

MHA分管理節點和數據庫節點,數據庫節點由mysql主從或者主主從構成,當主庫掛掉後,管理節點會自動將從節點提升爲主節點;管理節點的角色類似於oracle數據庫中的fast start failover中的observer,但mha上層可以通過keepalive部署VIP,程序連接數據庫使用VIP,從而實現後臺數據庫的故障切換透明化

MHA節點包含三個腳本,依賴perl模塊;
save_binary_logs:保存和複製當掉的主服務器二進制日誌;
apply_diff_relay_logs:識別差異的relay log事件,並應用於其他salve服務器;
purge_relay_logs:清除relay log文件;
需要在所有mysql服務器上安裝MHA節點,MHA管理服務器也需要安裝。MHA管理節點模塊內部依賴MHA節點模塊;
MHA管理節點通過ssh連接管理mysql服務器和執行MHA節點腳本。MHA節點依賴perl的DBD::mysql模塊;

本文環境介紹,操作系統均爲rhel5.4
主庫:192.168.123.13/dg53.yang.com
備庫:192.168.123.14/dg54.yang.com
管理節點:192.168.123.15/dg55.yang.com

在開始之前,請先配置好服務器間的時間同步和名稱解析

一:在數據庫節點安裝mha node

  1. [root@dg53 ~]# rpm -ivh http://dl.fedoraproject.org/pub/epel/5/i386/epel-release-5-4.noarch.rpm  
  2. Retrieving http://dl.fedoraproject.org/pub/epel/5/i386/epel-release-5-4.noarch.rpm  
  3. warning: /var/tmp/rpm-xfer.yqwfYT: Header V3 DSA signature: NOKEY, key ID 217521f6  
  4. Preparing...                ########################################### [100%]  
  5.    1:epel-release           ########################################### [100%]  
  6.  
  7. [root@dg53 ~]# ls /etc/yum.repos.d/  
  8. base.repo  epel.repo  epel-testing.repo  rhel-debuginfo.repo  
  9. [root@dg53 ~]# yum -y install perl-DBD-MySQL  ncftp  
  10. [root@dg53 ~]#  wget http://mysql-master-ha.googlecode.com/files/mha4mysql-node-0.52.tar.gz -P /usr/local/src/tarbag/  
  11. [root@dg53 ~]# cd /usr/local/src/tarbag/  
  12. [root@dg53 tarbag]# tar -zxvpf mha4mysql-node-0.52.tar.gz -C ../software/  
  13. [root@dg53 tarbag]# cd ../software/mha4mysql-node-0.52/  
  14. [root@dg53 mha4mysql-node-0.52]# perl Makefile.PL   
  15. [root@dg53 mha4mysql-node-0.52]# make && make install 

二:管理節點
1:按照步驟一安裝mha node

2:安裝mha manager

  1. [root@dg55 ~]# yum -y install perl-Config-Tiny perl-Params-Validate perl-Log-Dispatch perl-Parallel-ForkManager  
  2. [root@dg55 ~]# wget http://mysql-master-ha.googlecode.com/files/mha4mysql-manager-0.52.tar.gz -P /usr/local/src/tarbag/  
  3. [root@dg55 ~]# cd /usr/local/src/tarbag/  
  4. [root@dg55 tarbag]# tar -zxvpf mha4mysql-manager-0.52.tar.gz -C ../software/  
  5. [root@dg55 tarbag]# cd ../software/mha4mysql-manager-0.52/  
  6. [root@dg55 mha4mysql-manager-0.52]# perl Makefile.PL  
  7. [root@dg55 mha4mysql-manager-0.52]# make && make install 

3:編輯配置文件

  1. [root@dg55 mha4mysql-manager-0.52]# mkdir /etc/masterha  
  2. [root@dg55 mha4mysql-manager-0.52]# mkdir -p /masterha/app1  
  3. [root@dg55 mha4mysql-manager-0.52]# cp samples/conf/* /etc/masterha/  
  4. [root@dg55 mha4mysql-manager-0.52]# cat /etc/masterha/app1.cnf   
  5. [server default]  
  6. manager_workdir=/masterha/app1  
  7. manager_log=/masterha/app1/manager.log  
  8. user=root 
  9. password=123456 
  10. ssh_user=root 
  11. repl_user=r_test 
  12. repl_password=123456 
  13. ping_interval=1 
  14. shutdown_script="" 
  15. #master_ip_failover_script="/usr/local/bin/master_ip_failover" 
  16. master_ip_online_change_script="" 
  17. report_script="" 
  18. [server1]  
  19. hostname=192.168.123.13  
  20. master_binlog_dir="/mydata" 
  21. candidate_master=1 
  22. [server2]  
  23. hostname=192.168.123.14  
  24. master_binlog_dir="/mydata" 
  25. candidate_master=1 

4:配置manager節點和node節點以及node節點間的ssh公鑰信任

  1. [root@dg55 ~]# ssh-keygen -t rsa  
  2. [root@dg55 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]  
  3. [root@dg55 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]  
  4.  
  5. [root@dg55 ~]# ssh 192.168.123.13 "ifconfig |grep 'inet addr' |head -1"  
  6.           inet addr:192.168.123.13  Bcast:192.168.123.255  Mask:255.255.255.0  
  7. [root@dg55 ~]# ssh 192.168.123.14 "ifconfig |grep 'inet addr' |head -1"  
  8.           inet addr:192.168.123.14  Bcast:192.168.123.255  Mask:255.255.255.0 

5:測試ssh連接

  1. [root@dg55 ~]# masterha_check_ssh --conf=/etc/masterha/app1.cnf   
  2. Wed Jun  6 11:11:25 2012 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.  
  3. Wed Jun  6 11:11:25 2012 - [info] Reading application default configurations from /etc/masterha/app1.cnf..  
  4. Wed Jun  6 11:11:25 2012 - [info] Reading server configurations from /etc/masterha/app1.cnf..  
  5. Wed Jun  6 11:11:25 2012 - [info] Starting SSH connection tests..  
  6. Wed Jun  6 11:11:25 2012 - [debug]   
  7. Wed Jun  6 11:11:25 2012 - [debug]  Connecting via SSH from [email protected](192.168.123.13) to [email protected](192.168.123.14)..  
  8. Wed Jun  6 11:11:25 2012 - [debug]   ok.  
  9. Wed Jun  6 11:11:26 2012 - [debug]   
  10. Wed Jun  6 11:11:25 2012 - [debug]  Connecting via SSH from [email protected](192.168.123.14) to [email protected](192.168.123.13)..  
  11. Wed Jun  6 11:11:26 2012 - [debug]   ok.  
  12. Wed Jun  6 11:11:26 2012 - [info] All SSH connection tests passed successfully. 

6:測試主從複製情況,默認使用root用戶連接

  1. mysql> select user,host,password from mysql.user;  
  2. +--------+----------------+-------------------------------------------+  
  3. | user   | host           | password                                  |  
  4. +--------+----------------+-------------------------------------------+  
  5. | root   | localhost      |                                           |   
  6. | root   | dg53.yang.com  |                                           |   
  7. | root   | 127.0.0.1      |                                           |   
  8. | root   | ::1            |                                           |   
  9. |        | localhost      |                                           |   
  10. |        | dg53.yang.com  |                                           |   
  11. | r_test | 192.168.123.14 | *6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9 |   
  12. +--------+----------------+-------------------------------------------+  
  13. 7 rows in set (0.08 sec)  
  14.  
  15. mysql> use mysql;  
  16. Database changed  
  17.  
  18. mysql> update user set host='192.168.123.%' where user='r_test';  
  19. Query OK, 1 row affected (0.05 sec)  
  20. Rows matched: 1  Changed: 1  Warnings: 0  
  21.  
  22. mysql> commit;  
  23. Query OK, 0 rows affected (0.01 sec)  
  24.  
  25. mysql> update user set host='192.168.123.%' where host='localhost' and user='root' and password='';  
  26. Query OK, 0 rows affected (0.00 sec)  
  27. Rows matched: 1  Changed: 1  Warnings: 0  
  28.  
  29. mysql> update user set password=PASSWORD('123456') where user='root' and host='192.168.123.%';  
  30. Query OK, 0 rows affected (0.00 sec)  
  31. Rows matched: 1  Changed: 0  Warnings: 0  
  32.  
  33. mysql> flush privileges;  
  34. Query OK, 0 rows affected (0.02 sec)  
  35.  
  36. [root@dg53 ~]# whereis mysqlbinlog  
  37. mysqlbinlog: /usr/bin/mysqlbinlog  
  38.  
  39. [root@dg53 ~]# mv /usr/bin/mysql* /tmp  
  40. [root@dg54 ~]# mv /usr/bin/mysql* /tmp  
  41. [root@dg53 ~]# ln -s /usr/local/mysql5.5.25/bin/* /usr/local/bin/  
  42. [root@dg54 ~]# ln -s /usr/local/mysql5.5.25/bin/* /usr/local/bin/  
  43.  
  44. [root@dg55 ~]# masterha_check_repl --conf=/etc/masterha/app1.cnf   
  45. Wed Jun  6 12:39:03 2012 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.  
  46. Wed Jun  6 12:39:03 2012 - [info] Reading application default configurations from /etc/masterha/app1.cnf..  
  47. Wed Jun  6 12:39:03 2012 - [info] Reading server configurations from /etc/masterha/app1.cnf..  
  48. Wed Jun  6 12:39:03 2012 - [info] MHA::MasterMonitor version 0.52.  
  49. Wed Jun  6 12:39:03 2012 - [info] Dead Servers:  
  50. Wed Jun  6 12:39:03 2012 - [info] Alive Servers:  
  51. Wed Jun  6 12:39:03 2012 - [info]   192.168.123.13(192.168.123.13:3306)  
  52. Wed Jun  6 12:39:03 2012 - [info]   192.168.123.14(192.168.123.14:3306)  
  53. Wed Jun  6 12:39:03 2012 - [info] Alive Slaves:  
  54. Wed Jun  6 12:39:03 2012 - [info]   192.168.123.14(192.168.123.14:3306)  Version=5.5.25-log (oldest major version between slaves) log-bin:enabled  
  55. Wed Jun  6 12:39:03 2012 - [info]     Replicating from 192.168.123.13(192.168.123.13:3306)  
  56. Wed Jun  6 12:39:03 2012 - [info]     Primary candidate for the new Master (candidate_master is set)  
  57. Wed Jun  6 12:39:03 2012 - [info] Current Alive Master: 192.168.123.13(192.168.123.13:3306)  
  58. Wed Jun  6 12:39:03 2012 - [info] Checking slave configurations..  
  59. Wed Jun  6 12:39:03 2012 - [warning]  read_only=1 is not set on slave 192.168.123.14(192.168.123.14:3306).  
  60. Wed Jun  6 12:39:03 2012 - [warning]  relay_log_purge=0 is not set on slave 192.168.123.14(192.168.123.14:3306).  
  61. Wed Jun  6 12:39:03 2012 - [info] Checking replication filtering settings..  
  62. Wed Jun  6 12:39:03 2012 - [info]  binlog_do_dbbbs,test, binlog_ignore_dbmysql 
  63. Wed Jun  6 12:39:03 2012 - [info]  Replication filtering check ok.  
  64. Wed Jun  6 12:39:03 2012 - [info] Starting SSH connection tests..  
  65. Wed Jun  6 12:39:05 2012 - [info] All SSH connection tests passed successfully.  
  66. Wed Jun  6 12:39:05 2012 - [info] Checking MHA Node version..  
  67. Wed Jun  6 12:39:05 2012 - [info]  Version check ok.  
  68. Wed Jun  6 12:39:05 2012 - [info] Checking SSH publickey authentication and checking recovery script configurations on the current master..  
  69. Wed Jun  6 12:39:05 2012 - [info]   Executing command: save_binary_logs --command=test --start_file=mysql-bin.000011 --start_pos=4 --binlog_dir=/mydata --output_file=/var/tmp/save_binary_logs_test --manager_version=0.52   
  70. Wed Jun  6 12:39:05 2012 - [info]   Connecting to [email protected](192.168.123.13)..   
  71.   Creating /var/tmp if not exists..    ok.  
  72.   Checking output directory is accessible or not..  
  73.    ok.  
  74.   Binlog found at /mydata, up to mysql-bin.000011  
  75. Wed Jun  6 12:39:06 2012 - [info] Master setting check done.  
  76. Wed Jun  6 12:39:06 2012 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..  
  77. Wed Jun  6 12:39:06 2012 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user=root --slave_host=192.168.123.14 --slave_ip=192.168.123.14 --slave_port=3306 --workdir=/var/tmp --target_version=5.5.25-log --manager_version=0.52 --relay_log_info=/mydata/relay-log.info  --slave_pass=xxx 
  78. Wed Jun  6 12:39:06 2012 - [info]   Connecting to [email protected](192.168.123.14)..   
  79.   Checking slave recovery environment settings..  
  80.     Opening /mydata/relay-log.info ... ok.  
  81.     Relay log found at /mydata, up to dg54-relay-bin.000019  
  82.     Temporary relay log file is /mydata/dg54-relay-bin.000019  
  83.     Testing mysql connection and privileges.. done.  
  84.     Testing mysqlbinlog output.. done.  
  85.     Cleaning up test file(s).. done.  
  86. Wed Jun  6 12:39:06 2012 - [info] Slaves settings check done.  
  87. Wed Jun  6 12:39:06 2012 - [info]   
  88. 192.168.123.13 (current master)  
  89.  +--192.168.123.14  
  90.  
  91. Wed Jun  6 12:39:06 2012 - [info] Checking replication health on 192.168.123.14..  
  92. Wed Jun  6 12:39:06 2012 - [info]  ok.  
  93. Wed Jun  6 12:39:06 2012 - [warning] master_ip_failover_script is not defined.  
  94. Wed Jun  6 12:39:06 2012 - [warning] shutdown_script is not defined.  
  95. Wed Jun  6 12:39:06 2012 - [info] Got exit code 0 (Not master dead).  
  96.  
  97. MySQL Replication Health is OK. 

7:啓動管理節點進程

  1. [root@dg55 ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf > /tmp/mha_manager.log  < /dev/null 2>&1 &  
  2. [1] 25516  
  3. [root@dg55 ~]# masterha_check_status --conf=/etc/masterha/app1.cnf   
  4. app1 (pid:25516) is running(0:PING_OK), master:192.168.123.13 

三:測試failover過程
1:當前主庫爲192.168.123.13,關閉主庫

  1. mysql> show slave hosts;  
  2. +-----------+------+------+-----------+  
  3. | Server_id | Host | Port | Master_id |  
  4. +-----------+------+------+-----------+  
  5. |         2 |      | 3306 |         1 |  
  6. +-----------+------+------+-----------+  
  7. 1 row in set (0.00 sec)  
  8.  
  9. [root@dg53 ~]# service mysqld stop  
  10. Shutting down MySQL...[  OK  ] 

2:在管理節點上觀察日誌輸出

[root@dg55 ~]# tail -f /masterha/app1/manager.log
Wed Jun  6 14:50:48 2012 - [info]
192.168.123.13 (current master)
 +--192.168.123.14

Wed Jun  6 14:50:48 2012 - [warning] master_ip_failover_script is not defined.
Wed Jun  6 14:50:48 2012 - [warning] shutdown_script is not defined.
Wed Jun  6 14:50:48 2012 - [info] Set master ping interval 1 seconds.
Wed Jun  6 14:50:48 2012 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Wed Jun  6 14:50:48 2012 - [info] Starting ping health check on 192.168.123.13(192.168.123.13:3306)..
Wed Jun  6 14:50:48 2012 - [info] Ping succeeded, sleeping until it doesn't respond..

Wed Jun  6 14:51:32 2012 - [warning] Got error on MySQL ping: 2006 (MySQL server has gone away)
Wed Jun  6 14:51:32 2012 - [info] HealthCheck: SSH to 192.168.123.13 is reachable.
Wed Jun  6 14:51:33 2012 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Wed Jun  6 14:51:33 2012 - [warning] Connection failed 1 time(s)..
Wed Jun  6 14:51:34 2012 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Wed Jun  6 14:51:34 2012 - [warning] Connection failed 2 time(s)..
Wed Jun  6 14:51:35 2012 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Wed Jun  6 14:51:35 2012 - [warning] Connection failed 3 time(s)..
Wed Jun  6 14:51:35 2012 - [warning] Master is not reachable from health checker!
Wed Jun  6 14:51:35 2012 - [warning] Master 192.168.123.13(192.168.123.13:3306) is not reachable!
Wed Jun  6 14:51:35 2012 - [warning] SSH is reachable.
Wed Jun  6 14:51:35 2012 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/masterha/app1.cnf again, and trying to connect to all servers to check server status..
Wed Jun  6 14:51:35 2012 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed Jun  6 14:51:35 2012 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
Wed Jun  6 14:51:35 2012 - [info] Reading server configurations from /etc/masterha/app1.cnf..
Wed Jun  6 14:51:35 2012 - [info] Dead Servers:
Wed Jun  6 14:51:35 2012 - [info]   192.168.123.13(192.168.123.13:3306)
Wed Jun  6 14:51:35 2012 - [info] Alive Servers:
Wed Jun  6 14:51:35 2012 - [info]   192.168.123.14(192.168.123.14:3306)
Wed Jun  6 14:51:35 2012 - [info] Alive Slaves:
Wed Jun  6 14:51:35 2012 - [info]   192.168.123.14(192.168.123.14:3306)  Version=5.5.25-log (oldest major version between slaves) log-bin:enabled
Wed Jun  6 14:51:35 2012 - [info]     Replicating from 192.168.123.13(192.168.123.13:3306)
Wed Jun  6 14:51:35 2012 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed Jun  6 14:51:35 2012 - [info] Checking slave configurations..
Wed Jun  6 14:51:35 2012 - [warning]  read_only=1 is not set on slave 192.168.123.14(192.168.123.14:3306).
Wed Jun  6 14:51:35 2012 - [warning]  relay_log_purge=0 is not set on slave 192.168.123.14(192.168.123.14:3306).
Wed Jun  6 14:51:35 2012 - [info] Checking replication filtering settings..
Wed Jun  6 14:51:35 2012 - [info]  Replication filtering check ok.
Wed Jun  6 14:51:35 2012 - [info] Master is down!
Wed Jun  6 14:51:35 2012 - [info] Terminating monitoring script.
Wed Jun  6 14:51:35 2012 - [info] Got exit code 20 (Master dead).
Wed Jun  6 14:51:35 2012 - [info] MHA::MasterFailover version 0.52.
Wed Jun  6 14:51:35 2012 - [info] Starting master failover.
Wed Jun  6 14:51:35 2012 - [info]
Wed Jun  6 14:51:35 2012 - [info] * Phase 1: Configuration Check Phase..
Wed Jun  6 14:51:35 2012 - [info]
Wed Jun  6 14:51:35 2012 - [info] Dead Servers:
Wed Jun  6 14:51:35 2012 - [info]   192.168.123.13(192.168.123.13:3306)
Wed Jun  6 14:51:35 2012 - [info] Checking master reachability via mysql(double check)..
Wed Jun  6 14:51:35 2012 - [info]  ok.
Wed Jun  6 14:51:35 2012 - [info] Alive Servers:
Wed Jun  6 14:51:35 2012 - [info]   192.168.123.14(192.168.123.14:3306)
Wed Jun  6 14:51:35 2012 - [info] Alive Slaves:
Wed Jun  6 14:51:35 2012 - [info]   192.168.123.14(192.168.123.14:3306)  Version=5.5.25-log (oldest major version between slaves) log-bin:enabled
Wed Jun  6 14:51:35 2012 - [info]     Replicating from 192.168.123.13(192.168.123.13:3306)
Wed Jun  6 14:51:35 2012 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed Jun  6 14:51:35 2012 - [info] ** Phase 1: Configuration Check Phase completed.
Wed Jun  6 14:51:35 2012 - [info]
Wed Jun  6 14:51:35 2012 - [info] * Phase 2: Dead Master Shutdown Phase..
Wed Jun  6 14:51:35 2012 - [info]
Wed Jun  6 14:51:35 2012 - [info] Forcing shutdown so that applications never connect to the current master..
Wed Jun  6 14:51:35 2012 - [warning] master_ip_failover_script is not set. Skipping invalidating dead master ip address.
Wed Jun  6 14:51:35 2012 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Wed Jun  6 14:51:35 2012 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Wed Jun  6 14:51:35 2012 - [info]
Wed Jun  6 14:51:35 2012 - [info] * Phase 3: Master Recovery Phase..
Wed Jun  6 14:51:35 2012 - [info]
Wed Jun  6 14:51:35 2012 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Wed Jun  6 14:51:35 2012 - [info]
Wed Jun  6 14:51:35 2012 - [info] The latest binary log file/position on all slaves is mysql-bin.000021:107
Wed Jun  6 14:51:35 2012 - [info] Latest slaves (Slaves that received relay log files to the latest):
Wed Jun  6 14:51:35 2012 - [info]   192.168.123.14(192.168.123.14:3306)  Version=5.5.25-log (oldest major version between slaves) log-bin:enabled
Wed Jun  6 14:51:35 2012 - [info]     Replicating from 192.168.123.13(192.168.123.13:3306)
Wed Jun  6 14:51:35 2012 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed Jun  6 14:51:35 2012 - [info] The oldest binary log file/position on all slaves is mysql-bin.000021:107
Wed Jun  6 14:51:35 2012 - [info] Oldest slaves:
Wed Jun  6 14:51:35 2012 - [info]   192.168.123.14(192.168.123.14:3306)  Version=5.5.25-log (oldest major version between slaves) log-bin:enabled
Wed Jun  6 14:51:35 2012 - [info]     Replicating from 192.168.123.13(192.168.123.13:3306)
Wed Jun  6 14:51:35 2012 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed Jun  6 14:51:35 2012 - [info]
Wed Jun  6 14:51:35 2012 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..
Wed Jun  6 14:51:35 2012 - [info]
Wed Jun  6 14:51:35 2012 - [info] Fetching dead master's binary logs..
Wed Jun  6 14:51:35 2012 - [info] Executing command on the dead master 192.168.123.13(192.168.123.13:3306): save_binary_logs --command=save --start_file=mysql-bin.000021  --start_pos=107 --binlog_dir=/mydata --output_file=/var/tmp/saved_master_binlog_from_192.168.123.13_3306_20120606145135.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.52
  Creating /var/tmp if not exists..    ok.
 Concat binary/relay logs from mysql-bin.000021 pos 107 to mysql-bin.000021 EOF into /var/tmp/saved_master_binlog_from_192.168.123.13_3306_20120606145135.binlog ..
  Dumping binlog format description event, from position 0 to 107.. ok.
  Dumping effective binlog data from /mydata/mysql-bin.000021 position 107 to tail(126).. ok.
 Concat succeeded.
Wed Jun  6 14:51:36 2012 - [info] scp from [email protected]:/var/tmp/saved_master_binlog_from_192.168.123.13_3306_20120606145135.binlog to local:/masterha/app1/saved_master_binlog_from_192.168.123.13_3306_20120606145135.binlog succeeded.
Wed Jun  6 14:51:36 2012 - [info] HealthCheck: SSH to 192.168.123.14 is reachable.
Wed Jun  6 14:51:37 2012 - [info]
Wed Jun  6 14:51:37 2012 - [info] * Phase 3.3: Determining New Master Phase..
Wed Jun  6 14:51:37 2012 - [info]
Wed Jun  6 14:51:37 2012 - [info] Finding the latest slave that has all relay logs for recovering other slaves..
Wed Jun  6 14:51:37 2012 - [info] All slaves received relay logs to the same position. No need to resync each other.
Wed Jun  6 14:51:37 2012 - [info] Searching new master from slaves..
Wed Jun  6 14:51:37 2012 - [info]  Candidate masters from the configuration file:
Wed Jun  6 14:51:37 2012 - [info]   192.168.123.14(192.168.123.14:3306)  Version=5.5.25-log (oldest major version between slaves) log-bin:enabled
Wed Jun  6 14:51:37 2012 - [info]     Replicating from 192.168.123.13(192.168.123.13:3306)
Wed Jun  6 14:51:37 2012 - [info]     Primary candidate for the new Master (candidate_master is set)
Wed Jun  6 14:51:37 2012 - [info]  Non-candidate masters:
Wed Jun  6 14:51:37 2012 - [info]  Searching from candidate_master slaves which have received the latest relay log events..
Wed Jun  6 14:51:37 2012 - [info] New master is 192.168.123.14(192.168.123.14:3306)
Wed Jun  6 14:51:37 2012 - [info] Starting master failover..
Wed Jun  6 14:51:37 2012 - [info]
From:
192.168.123.13 (current master)
 +--192.168.123.14

To:
192.168.123.14 (new master)
Wed Jun  6 14:51:37 2012 - [info]
Wed Jun  6 14:51:37 2012 - [info] * Phase 3.3: New Master Diff Log Generation Phase..
Wed Jun  6 14:51:37 2012 - [info]
Wed Jun  6 14:51:37 2012 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
Wed Jun  6 14:51:37 2012 - [info] Sending binlog..
Wed Jun  6 14:51:37 2012 - [info] scp from local:/masterha/app1/saved_master_binlog_from_192.168.123.13_3306_20120606145135.binlog to [email protected]:/var/tmp/saved_master_binlog_from_192.168.123.13_3306_20120606145135.binlog succeeded.
Wed Jun  6 14:51:37 2012 - [info]
Wed Jun  6 14:51:37 2012 - [info] * Phase 3.4: Master Log Apply Phase..
Wed Jun  6 14:51:37 2012 - [info]
Wed Jun  6 14:51:37 2012 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.
Wed Jun  6 14:51:37 2012 - [info] Starting recovery on 192.168.123.14(192.168.123.14:3306)..
Wed Jun  6 14:51:37 2012 - [info]  Generating diffs succeeded.
Wed Jun  6 14:51:37 2012 - [info] Waiting until all relay logs are applied.
Wed Jun  6 14:51:37 2012 - [info]  done.
Wed Jun  6 14:51:37 2012 - [info] Getting slave status..
Wed Jun  6 14:51:37 2012 - [info] This slave(192.168.123.14)'s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(mysql-bin.000021:107). No need to recover from Exec_Master_Log_Pos.
Wed Jun  6 14:51:37 2012 - [info] Connecting to the target slave host 192.168.123.14, running recover script..
Wed Jun  6 14:51:37 2012 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user=root --slave_host=192.168.123.14 --slave_ip=192.168.123.14  --slave_port=3306 --apply_files=/var/tmp/saved_master_binlog_from_192.168.123.13_3306_20120606145135.binlog --workdir=/var/tmp --target_version=5.5.25-log --timestamp=20120606145135 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.52 --slave_pass=xxx
Wed Jun  6 14:51:37 2012 - [info]
Applying differential binary/relay log files /var/tmp/saved_master_binlog_from_192.168.123.13_3306_20120606145135.binlog on 192.168.123.14:3306. This may take long time...
Applying log files succeeded.
Wed Jun  6 14:51:37 2012 - [info]  All relay logs were successfully applied.
Wed Jun  6 14:51:37 2012 - [info] Getting new master's binlog name and position..
Wed Jun  6 14:51:37 2012 - [info]  mysql-bin.000023:107
Wed Jun  6 14:51:37 2012 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.123.14', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000023', MASTER_LOG_POS=107, MASTER_USER='r_test', MASTER_PASSWORD='xxx';
Wed Jun  6 14:51:37 2012 - [warning] master_ip_failover_script is not set. Skipping taking over new master ip address.
Wed Jun  6 14:51:37 2012 - [info] ** Finished master recovery successfully.
Wed Jun  6 14:51:37 2012 - [info] * Phase 3: Master Recovery Phase completed.
Wed Jun  6 14:51:37 2012 - [info]
Wed Jun  6 14:51:37 2012 - [info] * Phase 4: Slaves Recovery Phase..
Wed Jun  6 14:51:37 2012 - [info]
Wed Jun  6 14:51:37 2012 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
Wed Jun  6 14:51:37 2012 - [info]
Wed Jun  6 14:51:37 2012 - [info] Generating relay diff files from the latest slave succeeded.
Wed Jun  6 14:51:37 2012 - [info]
Wed Jun  6 14:51:37 2012 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..
Wed Jun  6 14:51:37 2012 - [info]
Wed Jun  6 14:51:37 2012 - [info] All new slave servers recovered successfully.
Wed Jun  6 14:51:37 2012 - [info]
Wed Jun  6 14:51:37 2012 - [info] * Phase 5: New master cleanup phease..
Wed Jun  6 14:51:37 2012 - [info]
Wed Jun  6 14:51:37 2012 - [info] Resetting slave info on the new master..
Wed Jun  6 14:51:37 2012 - [info] Master failover to 192.168.123.14(192.168.123.14:3306) completed successfully.
Wed Jun  6 14:51:37 2012 - [info]

----- Failover Report -----

app1: MySQL Master failover 192.168.123.13 to 192.168.123.14 succeeded

Master 192.168.123.13 is down!

Check MHA Manager logs at dg55.yang.com:/masterha/app1/manager.log for details.

Started automated(non-interactive) failover.
The latest slave 192.168.123.14(192.168.123.14:3306) has all relay logs for recovery.
Selected 192.168.123.14 as a new master.
192.168.123.14: OK: Applying all logs succeeded.
Generating relay diff files from the latest slave succeeded.
192.168.123.14: Resetting slave info succeeded.
Master failover to 192.168.123.14(192.168.123.14:3306) completed successfully.

3:在原從庫192.168.123.14上查看結果

  1. mysql> show master status;  
  2. +------------------+----------+--------------+------------------+  
  3. | File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |  
  4. +------------------+----------+--------------+------------------+  
  5. | mysql-bin.000023 |      107 | bbs,test     | mysql            |  
  6. +------------------+----------+--------------+------------------+  
  7. 1 row in set (0.00 sec)  
  8.  
  9. mysql> show slave status\G;  
  10. *************************** 1. row ***************************  
  11.                Slave_IO_State:   
  12.                   Master_Host: 192.168.123.13  
  13.                   Master_User: r_test  
  14.                   Master_Port: 3306  
  15.                 Connect_Retry: 60  
  16.               Master_Log_File:   
  17.           Read_Master_Log_Pos: 4  
  18.                Relay_Log_File: dg54-relay-bin.000001  
  19.                 Relay_Log_Pos: 4  
  20.         Relay_Master_Log_File:   
  21.              Slave_IO_Running: No  
  22.             Slave_SQL_Running: No  
  23.               Replicate_Do_DB:   
  24.           Replicate_Ignore_DB:   
  25.            Replicate_Do_Table:   
  26.        Replicate_Ignore_Table:   
  27.       Replicate_Wild_Do_Table:   
  28.   Replicate_Wild_Ignore_Table:   
  29.                    Last_Errno: 0  
  30.                    Last_Error:   
  31.                  Skip_Counter: 0  
  32.           Exec_Master_Log_Pos: 0  
  33.               Relay_Log_Space: 126  
  34.               Until_Condition: None  
  35.                Until_Log_File:   
  36.                 Until_Log_Pos: 0  
  37.            Master_SSL_Allowed: No  
  38.            Master_SSL_CA_File:   
  39.            Master_SSL_CA_Path:   
  40.               Master_SSL_Cert:   
  41.             Master_SSL_Cipher:   
  42.                Master_SSL_Key:   
  43.         Seconds_Behind_Master: NULL  
  44. Master_SSL_Verify_Server_Cert: No  
  45.                 Last_IO_Errno: 0  
  46.                 Last_IO_Error:   
  47.                Last_SQL_Errno: 0  
  48.                Last_SQL_Error:   
  49.   Replicate_Ignore_Server_Ids:   
  50.              Master_Server_Id: 1  
  51. 1 row in set (0.00 sec)  
  52.  
  53. ERROR:   
  54. No query specified 

備註:本文只簡單記錄了mha的環境部署過程,更多高級的內容,例如配合keepalive使用VIP,讓客戶端連接數據庫透明化以及手動failover和master_ip_failover_script腳本,半同步複製等功能後續將繼續研究!

參考文章,感謝作者分享
http://who0168.blog.51cto.com/253401/699030
http://huoding.com/2011/12/18/139
http://www.vmcd.org/2012/04/mysql-high-availability-mha/

發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章