MHA Lab
There are four machines:
node1  ip: 172.18.25.51  (master)
node2  ip: 172.18.25.52  (slave)
node3  ip: 172.18.25.53  (slave)
ygl    ip: 172.18.1.1    (manager)
First, synchronize the clocks on every machine (ntpdate takes your NTP server's address):
ntpdate <your-ntp-server>
Then give every machine the same /etc/hosts file:
for i in 1 2 3 ;do scp /etc/hosts root@node$i:/etc ;done
[ root@node3 ~/.ssh ]# scp /etc/hosts ygl:/etc/
[ root@node2 ~ ]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.18.25.51 node1
172.18.25.52 node2
172.18.25.53 node3
172.18.1.1 ygl
The following steps require that all of these machines can log in to one another over SSH without a password:
[ root@node1 ~ ]# ssh-keygen -t rsa -P ''
[ root@node1 ~ ]# cd .ssh/
[ root@node1 ~/.ssh ]# ssh-copy-id -i ./id_rsa.pub node1
[ root@node1 ~/.ssh ]# scp authorized_keys id_rsa id_rsa.pub root@node2:/root/.ssh/
[ root@node1 ~/.ssh ]# scp authorized_keys id_rsa id_rsa.pub root@node3:/root/.ssh/
[ root@node1 ~/.ssh ]# scp authorized_keys id_rsa id_rsa.pub root@ygl:/root/.ssh/
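Before moving on, it is worth confirming that passwordless SSH really works to every host; a quick sketch, using the hostnames from the /etc/hosts file above:

```shell
# Verify passwordless SSH to every node in the cluster.
# BatchMode=yes makes ssh fail immediately instead of prompting for a password.
for h in node1 node2 node3 ygl; do
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$h" hostname \
        && echo "$h: OK" || echo "$h: FAILED"
done
```

Every host should print its hostname followed by "OK"; any "FAILED" line means the keys were not distributed correctly.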
The preliminary setup is now complete.
Next, modify the database configuration file on the master and on each slave.
node1
[ root@node1 ~ ]# vim /etc/my.cnf.d/server.cnf
[server]
server_id = 1
relay_log = relay-log
log_bin = master-log
skip_name_resolve = ON
innodb_file_per_table = ON
max_connections = 2000
node2
[ root@node2 ~ ]# vim /etc/my.cnf.d/server.cnf
[server]
server_id = 2
relay_log = relay-log
log_bin = master-log
relay_log_purge = OFF
read_only = ON
skip_name_resolve = ON
innodb_file_per_table = ON
max_connections = 2000
node3
[ root@node3 ~ ]# vim /etc/my.cnf.d/server.cnf
[server]
server_id = 3
relay_log = relay-log
log_bin = master-log
relay_log_purge = OFF
read_only = ON
skip_name_resolve = ON
innodb_file_per_table = ON
max_connections = 2000
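After restarting MariaDB on each node, you can confirm the settings actually took effect; a minimal check, assuming the root client can connect locally without a password:

```shell
# Each node should report its own server_id;
# node2 and node3 (the slaves) should report read_only = 1.
mysql -e 'SELECT @@server_id, @@read_only, @@log_bin;'
```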
Next, create the replication account on node1 so the slaves can replicate from it.
The same grant will be needed on the slaves later as well, since any slave may become the master.
node1
[ root@node1 ~ ]# systemctl start mariadb.service
MariaDB [(none)]> show master status;
+-------------------+----------+--------------+------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+-------------------+----------+--------------+------------------+
| master-log.000003 | 245 | | |
+-------------------+----------+--------------+------------------+
MariaDB [(none)]> grant replication slave,replication client on *.* to 'repluser'@'172.18.25.%' identified by 'replpass';
MariaDB [(none)]> flush privileges;
The manager logs in as a client and needs administrative privileges, so create an account for it:
MariaDB [(none)]> grant all on *.* to 'mhaadmin'@'172.18.%.%' identified by 'mhapass';
MariaDB [(none)]> flush privileges;
node2
MariaDB [(none)]> change master to master_host='172.18.25.51',master_user='repluser',master_password='replpass',master_log_file='master-log.000003',master_log_pos=245;
MariaDB [(none)]> start slave;
node3
MariaDB [(none)]> change master to master_host='172.18.25.51',master_user='repluser',master_password='replpass',master_log_file='master-log.000003',master_log_pos=245;
MariaDB [(none)]> start slave;
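Once START SLAVE has run on node2 and node3, confirm that both replication threads actually came up; a quick check to run on each slave:

```shell
# Both values must be "Yes"; anything else means replication is not working.
mysql -e 'SHOW SLAVE STATUS\G' | grep -E 'Slave_(IO|SQL)_Running:'
```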
Now it is time to configure MHA.
Install on the manager first; the manager needs both packages:
mha4mysql-manager-0.56-0.el6.noarch.rpm
mha4mysql-node-0.56-0.el6.noarch.rpm
[ root@ygl ~ ]# yum install ./*.rpm
Then send the mha4mysql-node package to node1, node2 and node3:
[ root@ygl ~ ]# for i in {1..3};do scp mha4mysql-node-0.56-0.el6.noarch.rpm node$i:/root/ ;done
[ root@node1 ~ ]# yum install mha4mysql-node-0.56-0.el6.noarch.rpm -y
[ root@node2 ~ ]# yum install mha4mysql-node-0.56-0.el6.noarch.rpm -y
[ root@node3 ~ ]# yum install mha4mysql-node-0.56-0.el6.noarch.rpm -y
Installation complete.
Next we initialize MHA. By convention the configuration files are kept under /etc/masterha/, so first create that directory
and then write the MHA configuration file:
[ root@ygl ~ ]# mkdir /etc/masterha/
[ root@ygl ~ ]# vim /etc/masterha/app1.cnf
[server default]
user=mhaadmin
password=mhapass
manager_workdir=/data/masterha/app1
manager_log=/data/masterha/app1/manager.log
remote_workdir=/data/masterha/app1
ssh_user=root
ssh_port=22
repl_user=repluser
repl_password=replpass
ping_interval=1
[server1]
hostname=172.18.25.51
candidate_master=1
[server2]
hostname=172.18.25.52
candidate_master=1
[server3]
hostname=172.18.25.53
candidate_master=1
Then run the SSH connectivity check:
[ root@ygl ~ ]# masterha_check_ssh --conf=/etc/masterha/app1.cnf
If the output ends with "All SSH connection tests passed successfully.", the check passed:
Thu Nov 23 16:37:14 2017 - [info] All SSH connection tests passed successfully.
Next, verify that the MySQL replication cluster is healthy:
[ root@ygl ~ ]# masterha_check_repl --conf=/etc/masterha/app1.cnf
If the output ends with the following line, the check passed:
MySQL Replication Health is OK.
With the environment verified, we can start MHA (note that masterha_manager runs in the foreground by default):
[ root@ygl ~ ]# masterha_manager --conf=/etc/masterha/app1.cnf
Once the manager is running, we simulate the master going down; the manager sees it immediately:
[ root@node1 ~ ]# systemctl stop mariadb
[ root@ygl ~ ]# masterha_manager --conf=/etc/masterha/app1.cnf
Thu Nov 23 16:53:06 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Nov 23 16:53:06 2017 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu Nov 23 16:53:06 2017 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Creating /data/masterha/app1 if not exists.. ok.
Checking output directory is accessible or not..
ok.
Binlog found at /var/lib/mysql, up to master-log.000003
Thu Nov 23 16:54:12 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Nov 23 16:54:12 2017 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu Nov 23 16:54:12 2017 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Checking node2 now shows that it has been promoted to master,
and node3's replication is already pointed at node2.
On the manager, running the replication check used earlier shows:
[ root@ygl ~ ]# masterha_check_repl --conf=/etc/masterha/app1.cnf
Thu Nov 23 16:59:10 2017 - [info] Dead Servers:
Thu Nov 23 16:59:10 2017 - [info] 172.18.25.51(172.18.25.51:3306)
Thu Nov 23 16:59:10 2017 - [info] Alive Servers:
Thu Nov 23 16:59:10 2017 - [info] 172.18.25.52(172.18.25.52:3306)
Thu Nov 23 16:59:10 2017 - [info] 172.18.25.53(172.18.25.53:3306)
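To confirm the repointing on node3 itself, inspect its slave status; Master_Host should now show node2's address:

```shell
# On node3: after failover, Master_Host should read 172.18.25.52 (node2)
# and both replication threads should still be "Yes".
mysql -e 'SHOW SLAVE STATUS\G' | grep -E 'Master_Host|Slave_(IO|SQL)_Running:'
```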
This section of the output shows that node1 is down.
After repairing node1 and bringing it back online, it can only rejoin as a slave of node2.
On node1:
[ root@node1 ~ ]# vim /etc/my.cnf.d/server.cnf
[server]
server_id = 1
relay_log = relay-log
log_bin = master-log
relay_log_purge = OFF
read_only = ON
skip_name_resolve = ON
innodb_file_per_table = ON
max_connections = 2000
This configuration turns node1 into a slave node.
Then start the service and connect to the local instance.
If the server died partway through its work, we should first take a backup of the new master,
restore it on node1, and only then point replication at the designated node.
Here I simply point replication at it directly.
Look up the current binlog coordinates on node2:
MariaDB [(none)]> show master status;
+-------------------+----------+--------------+------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+-------------------+----------+--------------+------------------+
| master-log.000003 | 470 | | |
+-------------------+----------+--------------+------------------+
node1
MariaDB [(none)]> change master to master_host='172.18.25.52',master_user='repluser',master_password='replpass',master_log_file='master-log.000003',master_log_pos=470;
MariaDB [(none)]> start slave;
MariaDB [(none)]> flush privileges;
Then we run the cluster check again and get:
MySQL Replication Health is OK.
Now we just restart the manager and it resumes monitoring:
[ root@ygl ~ ]# masterha_manager --conf=/etc/masterha/app1.cnf
Thu Nov 23 17:11:32 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Nov 23 17:11:32 2017 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu Nov 23 17:11:32 2017 - [info] Reading server configuration from /etc/masterha/app1.cnf..
So that it does not occupy the terminal, we run it in the background with nohup:
[ root@ygl ~ ]# nohup masterha_manager --conf=/etc/masterha/app1.cnf &> /data/masterha/app1/manager.log &
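With the manager in the background, its state can be queried at any time; masterha_check_status ships with the manager package:

```shell
# Reports whether the manager for this application is running
# and which host it currently considers the master.
masterha_check_status --conf=/etc/masterha/app1.cnf
```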
Further work
The steps above set up a basic MHA environment. For real-world use, the following items still need to be addressed:
(1) Provide an additional detection mechanism to avoid false positives when monitoring the master;
(2) Configure a virtual IP address on the master node for client access, so that after a master switchover client requests are still delivered correctly;
(3) During failover, perform a STONITH operation on the old master to avoid split-brain; this can be done via the shutdown_script parameter;
(4) When necessary, perform an online master switchover.
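Item (4), an online master switchover, is done with masterha_master_switch. A sketch, assuming node2 (172.18.25.52) is the current master and we want to promote node1 (172.18.25.51) back; stop the running masterha_manager first, since a switch will not proceed while monitoring is active:

```shell
# Planned (online) switchover: the current master is alive,
# and the old master is demoted to a slave of the new one.
masterha_master_switch --conf=/etc/masterha/app1.cnf \
    --master_state=alive \
    --new_master_host=172.18.25.51 \
    --orig_master_is_new_slave
```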