Experimental environment:
Set up MFS first; we will use it for this clustering experiment.
server1 172.25.254.1 master
server2 172.25.254.2 chunk
server3 172.25.254.3 chunk
server4 172.25.254.4 standby master
server1 and server4 are the high-availability nodes; server2 and server3 store the data.
On server1 and server4:
yum install moosefs-master-3.0.113-1.rhsystemd.x86_64.rpm -y
On server2 and server3:
yum install moosefs-chunkserver-3.0.113-1.rhsystemd.x86_64.rpm -y
Configure the startup script on server1 and server4:
vim /usr/lib/systemd/system/moosefs-master.service
systemctl daemon-reload    # reload systemd unit files
Add the -a option so the master can still start after an abnormal exit (it recovers the metadata automatically).
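As a sketch, the edited service line could look like this (the binary path and the rest of the packaged unit are assumptions, not copied from the actual RPM):

```ini
# /usr/lib/systemd/system/moosefs-master.service (excerpt; surrounding unit assumed)
[Service]
Type=forking
# -a automatically recovers the metadata after an unclean shutdown,
# so the master can start even after an abnormal exit
ExecStart=/usr/sbin/mfsmaster -a
```

Run `systemctl daemon-reload` after editing, as above.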
Configure the yum repositories (the high-availability and storage repositories) on all four hosts.
Configuring the cluster
Install the cluster software on server1 and server4:
yum install pacemaker corosync pcs -y
pacemaker    the main cluster resource manager
corosync     messaging/replication and heartbeat detection
pcs          the command-line management tool
[root@server1 3.0.113]# id hacluster
uid=189(hacluster) gid=189(haclient) groups=189(haclient)
# the hacluster user is created automatically during installation
Set up passwordless SSH between server1 and server4.
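A minimal sketch of that key exchange (hostnames as assumed above; the actual ssh-copy-id step needs the peer reachable, so it is left commented):

```shell
# Generate an RSA key pair non-interactively if none exists yet.
KEYDIR="${KEYDIR:-$HOME/.ssh}"
mkdir -p "$KEYDIR"
chmod 700 "$KEYDIR"
[ -f "$KEYDIR/id_rsa" ] || ssh-keygen -t rsa -N '' -q -f "$KEYDIR/id_rsa"

# Push the public key to the peer (run on server1; do the reverse on server4):
# ssh-copy-id root@server4
```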
Next, configure the cluster:
[root@server4 ~]# systemctl start pcsd.service
[root@server4 ~]# systemctl enable pcsd.service    # enable at boot
Created symlink from /etc/systemd/system/multi-user.target.wants/pcsd.service to /usr/lib/systemd/system/pcsd.service.
[root@server4 ~]# passwd hacluster    # give the hacluster user a password
Changing password for user hacluster.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
Authenticate the cluster hosts against each other:
[root@server1 3.0.113]# pcs cluster auth server1 server4    # currently only server1 and server4
Username: hacluster
Password:
server4: Authorized
server1: Authorized
Set up the cluster:
[root@server1 3.0.113]# pcs cluster setup --name mycluster server1 server4
Destroying cluster on nodes: server1, server4...
server1: Stopping Cluster (pacemaker)...
server4: Stopping Cluster (pacemaker)...
server1: Successfully destroyed cluster
server4: Successfully destroyed cluster
Sending 'pacemaker_remote authkey' to 'server1', 'server4'
server1: successful distribution of the file 'pacemaker_remote authkey'
server4: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
server1: Succeeded
server4: Succeeded
Synchronizing pcsd certificates on nodes server1, server4...
server4: Success
server1: Success
Restarting pcsd on the nodes in order to reload the certificates...
server4: Success
server1: Success
[root@server1 3.0.113]# pcs cluster start --all    # start the cluster
server1: Starting Cluster (corosync)...
server4: Starting Cluster (corosync)...
server1: Starting Cluster (pacemaker)...
server4: Starting Cluster (pacemaker)...
This automatically starts the two services (corosync and pacemaker) for us.
Enable the cluster at boot:
[root@server1 3.0.113]# pcs cluster enable --all
server1: Cluster Enabled
server4: Cluster Enabled
Check the status:
[root@server1 3.0.113]# pcs status
Cluster name: mycluster
WARNINGS:
No stonith devices and stonith-enabled is not false
## A stonith (fence) device forcibly reboots a host that hangs after a failure, which helps release its resources.
We don't have such a device yet, so disable stonith for now.
Stack: corosync
Current DC: server1 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Wed May 20 16:25:45 2020
Last change: Wed May 20 16:22:11 2020 by hacluster via crmd on server1
2 nodes configured
0 resources configured
Online: [ server1 server4 ]
No resources
## no resources configured yet
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
[root@server1 3.0.113]# pcs status corosync
Membership information
----------------------
Nodeid Votes Name
1 1 server1 (local)
2 1 server4
[root@server1 3.0.113]# corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
id = 172.25.254.1
status = ring 0 active with no faults
[root@server1 ~]# pcs property set stonith-enabled=false    # disable stonith
[root@server1 ~]# crm_verify -L -V    # verify the configuration; no errors
[root@server1 ~]# pcs property set no-quorum-policy=ignore    # keep running without quorum (needed for a two-node cluster)
Configuring the vip
[root@server1 ~]# pcs resource list    ## list available resource agents (there are many)
...
[root@server1 ~]# pcs resource standards    ## the four resource standards
lsb
ocf
service
systemd
Create a vip so that clients have a single entry point, since the cluster contains multiple hosts.
## add the vip resource, ocf type
[root@server1 ~]# pcs resource create vip ocf:heartbeat:IPaddr2 ip=172.25.254.100 cidr_netmask=32 op monitor interval=30s
[root@server1 ~]# pcs resource show    ## list configured resources
vip (ocf::heartbeat:IPaddr2): Started server1
[root@server1 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 52:54:00:e4:9b:44 brd ff:ff:ff:ff:ff:ff
inet 172.25.254.1/24 brd 172.25.254.255 scope global ens3
valid_lft forever preferred_lft forever
inet 172.25.254.100/32 brd 172.25.254.255 scope global ens3
valid_lft forever preferred_lft forever    ## the vip has been added
Watch it from the monitoring console:
[root@server1 ~]# crm_mon
Now test vip failover:
On server1:
[root@server1 ~]# pcs cluster stop server1
server1: Stopping Cluster (pacemaker)...
server1: Stopping Cluster (corosync)...
On server4:
[root@server4 ~]# ip a
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 52:54:00:46:90:50 brd ff:ff:ff:ff:ff:ff
inet 172.25.254.4/24 brd 172.25.254.255 scope global ens3
valid_lft forever preferred_lft forever
inet 172.25.254.100/32 brd 172.25.254.255 scope global ens3
valid_lft forever preferred_lft forever    ## the vip has failed over
Now start server1 again:
[root@server1 ~]# pcs cluster start server1
server1: Starting Cluster (corosync)...
server1: Starting Cluster (pacemaker)...
[root@server1 ~]# crm_mon
# watch in the monitoring console
The vip is still on server4: in production, switching back and forth would hurt service stability.
Configuring Apache
Install Apache on server1 and server4:
yum install httpd -y
echo server1 > /var/www/html/index.html    # default page on server1
echo server4 > /var/www/html/index.html    # default page on server4
Create the apache resource:
[root@server1 ~]# pcs resource create apache systemd:httpd op monitor interval=1min
[root@server1 ~]# pcs resource show    ## startup is controlled via systemd
vip (ocf::heartbeat:IPaddr2): Started server4
apache (systemd:httpd): Started server1
[root@server1 ~]# systemctl status httpd
● httpd.service - Cluster Controlled httpd
Loaded: loaded (/usr/lib/systemd/system/httpd.service; disabled; vendor preset: disabled)
Drop-In: /run/systemd/system/httpd.service.d
└─50-pacemaker.conf
Active: active (running) since Thu 2020-05-21 10:08:18 CST; 58s ago
The httpd service was started automatically; the 50-pacemaker.conf drop-in in the status output shows that the cluster is controlling httpd.
But at this point:
[root@server1 ~]# pcs resource show
vip (ocf::heartbeat:IPaddr2): Started server4
apache (systemd:httpd): Started server1
## vip and apache are not on the same server
[root@server1 ~]# curl 172.25.254.100
curl: (7) Failed connect to 172.25.254.100:80; Connection refused
The vip is unreachable, which is wrong, so we create a resource group
to keep the two resources together:
[root@server1 ~]# pcs resource group add apache_group vip apache
[root@server1 ~]# pcs resource show
Resource Group: apache_group
vip (ocf::heartbeat:IPaddr2): Started server4
apache (systemd:httpd): Started server4    ## apache automatically moved to server4
[root@server1 ~]# systemctl status httpd.service
● httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; disabled; vendor preset: disabled)
Active: inactive (dead)    ## apache on server1 is stopped, so it must be running on server4
[root@server1 ~]# curl 172.25.254.100
server4    ## now reachable through the vip
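Since several later steps hinge on which node a resource is running on, a tiny helper that pulls the node name out of `pcs resource show`-style output can save some squinting (a sketch that only parses text; the name `resource_node` is made up here):

```shell
# Print the node a resource is currently started on, reading
# "pcs resource show"-style lines from stdin.
resource_node() {
    # $1 = resource name; matches lines such as:
    #   vip  (ocf::heartbeat:IPaddr2):  Started server4
    awk -v r="$1" '$1 == r && /Started/ {print $NF}'
}
```

For example, `pcs resource show | resource_node vip` would print the hosting node.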
Adding a shared disk and creating the MFS resources
Add a disk to server3 to share:
/dev/sdb
Install the target software:
[root@server3 ~]# yum install targetcli.noarch -y
Configure it:
[root@server3 ~]# targetcli
/> backstores/block create my_disk1 /dev/sdb
Created block storage object my_disk1 using /dev/sdb.
/> iscsi/ create iqn.2020-05.com.example:server3    ## create the iqn
Created target iqn.2020-05.com.example:server3.
Created TPG 1.
Global pref auto_add_default_portal=true
Created default portal listening on all IPs (0.0.0.0), port 3260.
/> iscsi/iqn.2020-05.com.example:server3/tpg1/luns create /backstores/block/my_disk1
Created LUN 0.
/> iscsi/iqn.2020-05.com.example:server3/tpg1/acls create iqn.2020-05.com.example:client
Created Node ACL for iqn.2020-05.com.example:client    # allow access from this initiator name
Created mapped LUN 0.
Port 3260 is now open.
server1 and server4 act as iSCSI clients sharing server3's disk.
On both hosts, install the initiator and set the initiator name:
[root@server1 ~]# yum install iscsi-* -y
[root@server1 ~]# vim /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.2020-05.com.example:client    # change to the name allowed above
[root@server1 ~]# systemctl restart iscsid    # restart the service
[root@server1 ~]# iscsiadm -m discovery -t st -p 172.25.254.3
172.25.254.3:3260,1 iqn.2020-05.com.example:server3    # target discovered
[root@server1 ~]# iscsiadm -m node -l    # log in
Logging in to [iface: default, target: iqn.2020-05.com.example:server3, portal: 172.25.254.3,3260] (multiple)
Login to [iface: default, target: iqn.2020-05.com.example:server3, portal: 172.25.254.3,3260] successful.
[root@server1 ~]# fdisk -l
The device is now visible.
Create a single partition first, then format it:
mkfs.xfs /dev/sdb1
Then put the master's data on the device:
[root@server1 ~]# mount /dev/sdb1 /mnt/    # mount the device
[root@server1 ~]# cp -p /var/lib/mfs/* /mnt/    # copy in the data, preserving ownership
[root@server1 ~]# chown mfs.mfs /mnt/    # make the directory owned by the mfs user
[root@server1 ~]# ls /mnt
changelog.1.mfs changelog.2.mfs changelog.4.mfs changelog.5.mfs metadata.crc metadata.mfs metadata.mfs.back.1 metadata.mfs.empty stats.mfs
[root@server1 ~]# umount /mnt/
[root@server1 ~]# mount /dev/sdb1 /var/lib/mfs/    # mount on the mfs working directory
[root@server1 ~]# systemctl start moosefs-master    # mfs starts normally, so the configuration is correct
[root@server1 ~]# systemctl stop moosefs-master.service
[root@server1 ~]# umount /var/lib/mfs    # then unmount
On server4:
[root@server4 ~]# yum install -y iscsi-*
[root@server4 ~]# vim /etc/iscsi/initiatorname.iscsi
[root@server4 ~]# iscsiadm -m discovery -t st -p 172.25.254.3
172.25.254.3:3260,1 iqn.2020-05.com.example:server3
[root@server4 ~]# iscsiadm -m node -l
Logging in to [iface: default, target: iqn.2020-05.com.example:server3, portal: 172.25.254.3,3260] (multiple)
Login to [iface: default, target: iqn.2020-05.com.example:server3, portal: 172.25.254.3,3260] successful.
[root@server4 ~]# mount /dev/sdb1 /var/lib/mfs/
[root@server4 ~]# systemctl start moosefs-master
[root@server4 ~]# systemctl stop moosefs-master
[root@server4 ~]# umount /var/lib/mfs/    # after verifying a clean start, stop the service and unmount; from now on the cluster manages all of this
Now create the filesystem resource:
[root@server1 ~]# pcs resource create mfsdata ocf:heartbeat:Filesystem device=/dev/sdb1 directory=/var/lib/mfs fstype=xfs op monitor interval=30s
## name mfsdata, Filesystem agent, device /dev/sdb1, mount point /var/lib/mfs
[root@server1 ~]# pcs resource show
Resource Group: apache_group
vip (ocf::heartbeat:IPaddr2): Started server4
apache (systemd:httpd): Started server4
mfsdata (ocf::heartbeat:Filesystem): Started server1
The device was mounted automatically.
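To double-check that the Filesystem agent really mounted the device where expected, the mount table can be queried; this sketch (the helper name `mounted_device` is hypothetical) reads `/proc/mounts`-format lines from stdin:

```shell
# Print the device mounted at a given mount point, reading
# /proc/mounts-format lines from stdin.
mounted_device() {
    # $1 = mount point, e.g. /var/lib/mfs
    awk -v mp="$1" '$2 == mp {print $1}'
}

# Typical use on a live node:
# mounted_device /var/lib/mfs < /proc/mounts
```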
Create the service resource so the cluster starts moosefs-master automatically:
[root@server1 ~]# pcs resource create mfsd systemd:moosefs-master op monitor interval=1min
[root@server1 ~]# pcs resource show
Resource Group: apache_group
vip (ocf::heartbeat:IPaddr2): Started server4
apache (systemd:httpd): Started server4
mfsdata (ocf::heartbeat:Filesystem): Started server1
mfsd (systemd:moosefs-master): Started server1
The service started automatically.
Now create the real resource group, putting vip, mfsdata, and mfsd together.
[root@server1 ~]# pcs resource delete apache
Attempting to stop: apache... Stopped    # delete the apache resource; it was only for testing
[root@server1 ~]# pcs resource group add mfsgroup vip mfsdata mfsd    # create the group
[root@server1 ~]# pcs resource show    # wait a few seconds and check again; all three resources have moved to server4
Resource Group: mfsgroup
vip (ocf::heartbeat:IPaddr2): Started server4
mfsdata (ocf::heartbeat:Filesystem): Started server4
mfsd (systemd:moosefs-master): Started server4
Check on server4 as well.
Now test the cluster's high availability: stop server4 while monitoring from server1.
Everything is currently on server4.
Stop the cluster on server4:
[root@server4 ~]# pcs cluster stop server4
server4: Stopping Cluster (pacemaker)...
server4: Stopping Cluster (corosync)...
Everything switched over, and server4 went offline.
Checking the service, vip, and mount on server1:
all of it failed over, and everything on server4 was stopped.
Adding fence to the cluster
Start server4's cluster again:
[root@server4 ~]# pcs cluster start server4
server4: Starting Cluster (corosync)...
server4: Starting Cluster (pacemaker)...
All the resources are now on server1. First install fence, which controls server1 and server4: it can power hosts on and off, and automatically reboots a host after an abnormal crash.
[root@server4 ~]# yum install -y fence-virt
[root@server4 ~]# mkdir /etc/cluster    # directory for the fence key
[root@server1 ~]# yum install -y fence-virt
[root@server1 ~]# mkdir /etc/cluster
The physical host runs the fence_virtd daemon.
Install:
[root@rhel7host ~]# yum install -y fence-virtd.x86_64 fence-virtd-libvirt.x86_64 fence-virtd-multicast.x86_64
Configure it after installation:
[root@rhel7host ~]# fence_virtd -c
Module search path [/usr/lib64/fence-virt]:
Listener module [multicast]:
Multicast IP Address [225.0.0.12]:
Multicast IP Port [1229]:
Interface [virbr0]: br0    ## accept the defaults everywhere else; only the interface is changed
Key File [/etc/cluster/fence_xvm.key]:
Backend module [libvirt]:
Replace /etc/fence_virt.conf with the above [y/N]? y
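For reference, the resulting /etc/fence_virt.conf should look roughly like this (a sketch following the fence_virt.conf(5) layout; exact keys and defaults may differ between versions):

```
fence_virtd {
        listener = "multicast";
        backend = "libvirt";
}

listeners {
        multicast {
                key_file = "/etc/cluster/fence_xvm.key";
                address = "225.0.0.12";
                interface = "br0";
                port = "1229";
        }
}

backends {
        libvirt {
                uri = "qemu:///system";
        }
}
```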
[root@rhel7host ~]# mkdir /etc/cluster/    # create the key directory
[root@rhel7host ~]# cd /etc/cluster/
[root@rhel7host cluster]# dd if=/dev/urandom of=/etc/cluster/fence_xvm.key bs=128 count=1    # generate the key file
1+0 records in
1+0 records out
128 bytes (128 B) copied, 0.000221106 s, 579 kB/s
[root@rhel7host cluster]# ls
fence_xvm.key
[root@rhel7host cluster]# scp fence_xvm.key [email protected]:/etc/cluster/
[email protected]'s password:
fence_xvm.key
## send the key file to server1 and server4
[root@rhel7host cluster]# scp fence_xvm.key [email protected]:/etc/cluster/
[email protected]'s password:
fence_xvm.key
[root@rhel7host cluster]# scp fence_xvm.key [email protected]:/etc/cluster/
[email protected]'s password:
fence_xvm.key
Start the service:
[root@rhel7host cluster]# systemctl start fence_virtd.service
[root@rhel7host cluster]# netstat -ntlup |grep 1229
udp 0 0 0.0.0.0:1229 0.0.0.0:* 11612/fence_virtd
## port 1229 is open
Then add the stonith resource on server1:
[root@server1 ~]# pcs stonith create vmfence fence_xvm pcmk_host_map="server1:node1;server4:node4" op monitor interval=1min    # host mapping: hostname before the colon, virtual-machine (domain) name after it
[root@server1 ~]# pcs property set stonith-enabled=true    # enable the stonith device
[root@server1 ~]# crm_verify -L -V    # verify
[root@server1 ~]# pcs status    # check
Resource Group: mfsgroup
vip (ocf::heartbeat:IPaddr2): Started server1
mfsdata (ocf::heartbeat:Filesystem): Started server1
mfsd (systemd:moosefs-master): Started server1    # all resources on server1
vmfence (stonith:fence_xvm): Started server4    # fence running on server4
Now stop the cluster on server1:
[root@server1 ~]# pcs cluster stop server1
server1: Stopping Cluster (pacemaker)...
server1: Stopping Cluster (corosync)...
The resources switched to server4, and fence is now also on server4.
Start server1 again:
[root@server1 ~]# pcs cluster start server1
server1: Starting Cluster (corosync)...
server1: Starting Cluster (pacemaker)...
Fence moved to server1, which shows that fence never runs on the same host as the resources.
Now make server4 crash abnormally and see whether fence reboots it:
the resources all switch to server1,
and server4 is rebooted.
After server4 comes back up,
fence moves back onto server4.
That is how fence works: when a host's kernel crashes and the machine hangs, fence forcibly reboots it so that its resources are released, preventing resource contention within the cluster.
Using a MySQL resource
Delete the vip, mfsdata, and mfsd resources.
[root@server1 ~]# pcs resource delete vip
Attempting to stop: vip... Stopped
[root@server1 ~]# pcs resource delete mfsdata
Attempting to stop: mfsdata... Stopped
[root@server1 ~]# pcs resource delete mfsd
Attempting to stop: mfsd... Stopped
Only fence is left now, and the mfs services on server1 and server4 are both stopped.
Install mariadb on server1 and server4:
yum install -y mariadb-server
Delete the contents of /dev/sdb1 (note: rm -fr /mnt/* does not match hidden dot-files, so check for those too):
mount /dev/sdb1 /mnt
rm -fr /mnt/*
umount /mnt/
Mount /dev/sdb1 as the MySQL data directory.
[root@server1 ~]# mount /dev/sdb1 /var/lib/mysql/
[root@server1 ~]# chown mysql.mysql /var/lib/mysql/
[root@server1 mysql]# systemctl start mariadb.service
[root@server1 mysql]# ls
aria_log.00000001 aria_log_control ibdata1 ib_logfile0 ib_logfile1 mysql mysql.sock performance_schema test
So the contents of the MySQL data directory actually live on the shared device.
Create the resources:
[root@server1 mysql]# pcs resource create vip ocf:heartbeat:IPaddr2 ip=172.25.254.100 cidr_netmask=32 op monitor interval=30s
[root@server1 ~]# pcs resource create mysql_data ocf:heartbeat:Filesystem \
device=/dev/sdb1 directory=/var/lib/mysql fstype=xfs op monitor interval=30s
[root@server1 ~]# pcs resource create mariadb systemd:mariadb op monitor interval=1min
[root@server1 ~]# pcs resource show
vip (ocf::heartbeat:IPaddr2): Started server1
mysql_data (ocf::heartbeat:Filesystem): Started server1
mariadb (systemd:mariadb): Starting server4
The resources are not in one group yet, so we create a resource group.
[root@server1 ~]# pcs resource group add mysql_group mysql_data vip mariadb
[root@server1 ~]# pcs resource show
Resource Group: mysql_group
mysql_data (ocf::heartbeat:Filesystem): Started server1
vip (ocf::heartbeat:IPaddr2): Started server1
mariadb (systemd:mariadb): Stopping server4
[root@server1 ~]# pcs resource show
Resource Group: mysql_group
mysql_data (ocf::heartbeat:Filesystem): Started server1
vip (ocf::heartbeat:IPaddr2): Started server1
mariadb (systemd:mariadb): Started server1
## the display updates after a few seconds.
[root@server1 ~]# pcs status
Cluster name: mycluster
Full list of resources:
vmfence (stonith:fence_xvm): Started server4    # fence on server4
Resource Group: mysql_group
mysql_data (ocf::heartbeat:Filesystem): Started server1
vip (ocf::heartbeat:IPaddr2): Started server1
mariadb (systemd:mariadb): Started server1
Stop server1 and see whether server4 takes over the resources:
everything is mounted and started automatically.
After server1 comes back up, fence moves over to server1 again.