Building a Cluster on Red Hat Enterprise Linux 7: A Walkthrough
Lab environment
We use four virtual machines here:
server1 192.168.43.71 master
server2 192.168.43.72 chunk
server3 192.168.43.73 chunk
server4 192.168.43.74 standby node for the master
First, set up MFS. Here server1 and server4 serve as the high-availability (master) nodes, and server2 and server3 provide the data storage (chunkservers):
On server1 and server4:
yum install moosefs-master-3.0.113-1.rhsystemd.x86_64.rpm -y
Configure the startup unit:
vim /usr/lib/systemd/system/moosefs-master.service
ExecStart=/usr/sbin/mfsmaster start -a
The -a option makes the master recover its metadata automatically, so the service can still start after an unclean exit.
systemctl daemon-reload # reload systemd units
On server2 and server3:
yum install moosefs-chunkserver-3.0.113-1.rhsystemd.x86_64.rpm -y
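The chunkserver configuration itself is not shown here; a minimal sketch (the storage directory /mnt/chunk1 is illustrative, and it must be owned by the mfs user) would be roughly:
[root@server2 ~]# mkdir /mnt/chunk1
[root@server2 ~]# chown mfs:mfs /mnt/chunk1
[root@server2 ~]# echo /mnt/chunk1 >> /etc/mfs/mfshdd.cfg ## directory where chunks are stored
[root@server2 ~]# systemctl start moosefs-chunkserver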
Step 1: configure the yum repositories on all four hosts (the HighAvailability and ResilientStorage add-on repos)
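For reference, the two add-on repos can be defined roughly like this in a file under /etc/yum.repos.d/ (the baseurl is an assumption; point it at wherever the RHEL 7 installation media is served):
[HighAvailability]
name=HighAvailability
baseurl=http://<repo-server>/rhel7/addons/HighAvailability
gpgcheck=0
[ResilientStorage]
name=ResilientStorage
baseurl=http://<repo-server>/rhel7/addons/ResilientStorage
gpgcheck=0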
Step 2: install the cluster software
On server1 and server4, install the cluster software:
yum install pacemaker corosync pcs -y
pacemaker: the cluster resource manager
corosync: cluster messaging and heartbeat between nodes
pcs: the command-line management tool
After installation, the hacluster user is created on the system:
[root@server1 ~]# id hacluster
uid=189(hacluster) gid=189(haclient) groups=189(haclient)
Step 3: set up passwordless SSH between server1 and server4
[root@server1 ~]# ssh-keygen
[root@server1 ~]# ssh-copy-id server1
[root@server1 ~]# ssh-copy-id server4
Step 4: configure the cluster
On server1 and server4:
[root@server1 ~]# systemctl start pcsd.service
[root@server1 ~]# systemctl enable pcsd.service
Created symlink from /etc/systemd/system/multi-user.target.wants/pcsd.service to /usr/lib/systemd/system/pcsd.service.
[root@server1 ~]# passwd hacluster
Changing password for user hacluster.
New password:
BAD PASSWORD: The password is shorter than 8 characters
Retype new password:
passwd: all authentication tokens updated successfully.
Authenticate the cluster hosts to one another (our cluster hosts at this point are server1 and server4):
[root@server1 ~]# pcs cluster auth server1 server4
Username: hacluster
Password:
server4: Authorized
server1: Authorized
Step 5: configure the cluster service
[root@server1 ~]# pcs cluster setup --name mycluster server1 server4
[root@server1 ~]# pcs cluster start --all # start the cluster on all nodes
Once this is done, it automatically starts two services for us: corosync and pacemaker.
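A quick sanity check (not shown in the original) is to confirm both daemons are running:
[root@server1 ~]# systemctl is-active corosync pacemaker ## both should report "active"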
Step 6: enable the cluster at boot
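The command itself is not shown above; presumably it is the standard one:
[root@server1 ~]# pcs cluster enable --all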
Step 7: configure the VIP
[root@server1 ~]# pcs resource list ## list the available resource agents
...
[root@server1 ~]# pcs resource standards ## the agents listed above come in these four standards
lsb
ocf
service
systemd
Because the cluster contains several hosts, we create a VIP so that clients have a single, unified entry point.
## add the VIP as an ocf resource
[root@server1 ~]# pcs resource create vip ocf:heartbeat:IPaddr2 ip=192.168.43.100 cidr_netmask=32 op monitor interval=30s
[root@server1 ~]# pcs resource show ## list the configured resources
vip (ocf::heartbeat:IPaddr2): Started server1
[root@server1 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:a5:b1:7f brd ff:ff:ff:ff:ff:ff
inet 192.168.43.71/24 brd 192.168.43.255 scope global eth0
valid_lft forever preferred_lft forever
inet 192.168.43.100/32 brd 192.168.43.255 scope global eth0 ## the VIP has been added
valid_lft forever preferred_lft forever
inet6 2409:8a70:fdcd:1940:20c:29ff:fea5:b17f/64 scope global mngtmpaddr dynamic
valid_lft 258754sec preferred_lft 172354sec
inet6 fe80::20c:29ff:fea5:b17f/64 scope link
valid_lft forever preferred_lft forever
[root@server1 ~]# crm_mon # watch the cluster status in the console
Test VIP failover:
On server1:
[root@server1 ~]# pcs cluster stop server1
server1: Stopping Cluster (pacemaker)...
server1: Stopping Cluster (corosync)...
On server4:
[root@server4 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:6c:72:0c brd ff:ff:ff:ff:ff:ff
inet 192.168.43.74/24 brd 192.168.43.255 scope global eth0
valid_lft forever preferred_lft forever
inet 192.168.43.100/32 brd 192.168.43.255 scope global eth0 ## the VIP has failed over
valid_lft forever preferred_lft forever
inet6 2409:8a70:fdcd:1940:20c:29ff:fe6c:720c/64 scope global mngtmpaddr dynamic
valid_lft 259170sec preferred_lft 172770sec
inet6 fe80::20c:29ff:fe6c:720c/64 scope link
valid_lft forever preferred_lft forever
On server1:
[root@server1 ~]# pcs cluster start server1
server1: Starting Cluster (corosync)...
server1: Starting Cluster (pacemaker)...
[root@server1 ~]# crm_mon # watch the status in the console
Configure Apache
Install Apache on server1 and server4:
yum install httpd -y
Create a default page on each node so we can tell which one is serving:
echo server1 > /var/www/html/index.html # on server1
echo server4 > /var/www/html/index.html # on server4
Create the apache resource:
[root@server1 ~]# pcs resource create apache systemd:httpd op monitor interval=1min
## the service is started and controlled through systemd
[root@server1 ~]# pcs resource show
vip (ocf::heartbeat:IPaddr2): Started server4
apache (systemd:httpd): Started server1
[root@server1 ~]# systemctl status httpd
● httpd.service - Cluster Controlled httpd
Loaded: loaded (/usr/lib/systemd/system/httpd.service; disabled; vendor preset: disabled)
Drop-In: /run/systemd/system/httpd.service.d
└─50-pacemaker.conf
Active: active (running) since Sat 2020-05-23 13:23:00 CST; 38s ago # apache was started automatically
Docs: man:httpd(8)
man:apachectl(8)
Main PID: 26925 (httpd)
It is easy to see that httpd was started under cluster control.
But notice a problem at this point: the VIP is running on server4 while apache is running on server1, so the two resources sit on different nodes.
Create a resource group and put the two resources into it, so they always run together:
[root@server1 ~]# pcs resource group add apache_group vip apache
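With both resources in one group, a quick check (not part of the original) is to request the VIP from any client machine; the reply shows which node currently holds the group:
curl http://192.168.43.100 ## prints server1 or server4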
Add a shared disk and create the MFS resources
Add an extra disk on server3 to be shared (it appears as /dev/sdb below).
Install the target software.
Prerequisite: server1 and server4 act as iSCSI clients (initiators) of server3's shared disk, so make sure host-name resolution is configured on these hosts.
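If DNS is not in use, /etc/hosts entries like the following (on every node) are enough; the names and addresses come from the environment listed at the top:
192.168.43.71 server1
192.168.43.72 server2
192.168.43.73 server3
192.168.43.74 server4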
[root@server3 ~]# yum install targetcli.noarch -y
Configure it:
[root@server3 ~]# targetcli
/> backstores/block create my_disk1 /dev/sdb
Created block storage object my_disk1 using /dev/sdb.
/> iscsi/ create iqn.2020-05.com.example:server3 ## create the IQN (target name)
Created target iqn.2020-05.com.example:server3.
Created TPG 1.
Global pref auto_add_default_portal=true
Created default portal listening on all IPs (0.0.0.0), port 3260.
/> iscsi/iqn.2020-05.com.example:server3/tpg1/luns create /backstores/block/my_disk1
Created LUN 0.
/> iscsi/iqn.2020-05.com.example:server3/tpg1/acls create iqn.2020-05.com.example:client
Created Node ACL for iqn.2020-05.com.example:client # allow access for initiators using this name
Created mapped LUN 0.
Port 3260 is now open.
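One way to confirm this on server3 (not shown in the original):
[root@server3 ~]# ss -antl | grep 3260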
We now have a shared disk exported from server3; server1 and server4 act as clients and use it.
On server1:
[root@server1 ~]# yum install iscsi-* -y
[root@server1 ~]# vim /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.2020-05.com.example:client # change it to the name allowed in the ACL above
[root@server1 ~]# systemctl restart iscsid
[root@server1 ~]# iscsiadm -m discovery -t st -p 192.168.43.73 # discover the target
192.168.43.73:3260,1 iqn.2020-05.com.example:server3
[root@server1 ~]# iscsiadm -m node -l
Logging in to [iface: default, target: iqn.2020-05.com.example:server3, portal: 192.168.43.73,3260] (multiple)
Login to [iface: default, target: iqn.2020-05.com.example:server3, portal: 192.168.43.73,3260] successful. # login succeeded
This shows we now have the device. First create a partition on it.
[root@server1 ~]# fdisk -l
1. Create a single partition on the new disk, then format it with XFS:
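(The partitioning itself is not shown in the original. A non-interactive equivalent, assuming the iSCSI disk appears on server1 as /dev/sda, could be:)
[root@server1 ~]# parted -s /dev/sda mklabel msdos
[root@server1 ~]# parted -s /dev/sda mkpart primary xfs 1MiB 100%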
[root@server1 ~]# mkfs.xfs /dev/sda1
meta-data=/dev/sda1 isize=512 agcount=4, agsize=655104 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=0, sparse=0
data = bsize=4096 blocks=2620416, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=1
log =internal log bsize=4096 blocks=2560, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
2. Then put the master's data onto the device:
[root@server1 ~]# mount /dev/sda1 /mnt/ # mount the device
[root@server1 ~]# cp -p /var/lib/mfs/* /mnt/ # copy the data in
[root@server1 ~]# chown mfs.mfs /mnt/ # make it owned by the mfs user
[root@server1 ~]# ls /mnt
changelog.1.mfs changelog.5.mfs changelog.8.mfs metadata.mfs.back.1
changelog.2.mfs changelog.6.mfs metadata.crc metadata.mfs.empty
changelog.4.mfs changelog.7.mfs metadata.mfs stats.mfs
[root@server1 ~]# umount /mnt/
[root@server1 ~]# mount /dev/sda1 /var/lib/mfs/ # mount it on the mfs working directory
[root@server1 ~]# systemctl start moosefs-master # the master starts normally, so the data is correct
[root@server1 ~]# systemctl stop moosefs-master.service
[root@server1 ~]# umount /var/lib/mfs # then unmount it
On server4:
[root@server4 ~]# yum install -y iscsi-*
[root@server4 ~]# vim /etc/iscsi/initiatorname.iscsi # set the same InitiatorName=iqn.2020-05.com.example:client
[root@server4 ~]# iscsiadm -m discovery -t st -p 192.168.43.73
192.168.43.73:3260,1 iqn.2020-05.com.example:server3
[root@server4 ~]# iscsiadm -m node -l
Logging in to [iface: default, target: iqn.2020-05.com.example:server3, portal: 192.168.43.73,3260] (multiple)
Login to [iface: default, target: iqn.2020-05.com.example:server3, portal: 192.168.43.73,3260] successful.
[root@server4 ~]# mount /dev/sda1 /var/lib/mfs/
[root@server4 ~]# systemctl start moosefs-master
[root@server4 ~]# systemctl stop moosefs-master
[root@server4 ~]# umount /var/lib/mfs/
After verifying that it starts normally, stop the service and unmount; from now on the cluster will take care of all of this.
Next we create the resource:
[root@server1 ~]# pcs resource create mfsdata ocf:heartbeat:Filesystem device=/dev/sda1 directory=/var/lib/mfs fstype=xfs op monitor interval=30s
## name mfsdata, filesystem type xfs, device /dev/sda1, mount point /var/lib/mfs
[root@server1 ~]# pcs resource show
Resource Group: apache_group
vip (ocf::heartbeat:IPaddr2): Started server4
apache (systemd:httpd): Started server4
mfsdata (ocf::heartbeat:Filesystem): Started server1
The filesystem was mounted automatically.
Now put the moosefs-master service under cluster control as well:
[root@server1 ~]# pcs resource create mfsd systemd:moosefs-master op monitor interval=1min
[root@server1 ~]# pcs resource show
Resource Group: apache_group
vip (ocf::heartbeat:IPaddr2): Started server4
apache (systemd:httpd): Started server4
mfsdata (ocf::heartbeat:Filesystem): Started server1
mfsd (systemd:moosefs-master): Started server1
[root@server1 ~]# systemctl status moosefs-master.service
● moosefs-master.service - Cluster Controlled moosefs-master
Loaded: loaded (/usr/lib/systemd/system/moosefs-master.service; disabled; vendor preset: disabled)
Drop-In: /run/systemd/system/moosefs-master.service.d
└─50-pacemaker.conf
Active: active (running) since Sat 2020-05-23 15:38:16 CST; 38s ago
Process: 34660 ExecStart=/usr/sbin/mfsmaster start -a (code=exited, status=0/SUCCESS)
Main PID: 34662 (mfsmaster)
CGroup: /system.slice/moosefs-master.service
├─34662 /usr/sbin/mfsmaster start -a
└─34663 mfsmaster (data writer)
The service was started automatically.
Now create another resource group, putting vip, mfsdata and mfsd together:
[root@server1 ~]# pcs resource delete apache
Attempting to stop: apache... Stopped # delete the apache resource; it was only there for testing
[root@server1 ~]# pcs resource group add mfsgroup vip mfsdata mfsd # create the group
[root@server1 ~]# pcs resource show # wait a few seconds and check again; all three resources are now on server4
Resource Group: mfsgroup
vip (ocf::heartbeat:IPaddr2): Started server4
mfsdata (ocf::heartbeat:Filesystem): Started server4
mfsd (systemd:moosefs-master): Started server4
[root@server1 ~]#
Checking on server4 shows the same thing: the VIP, the /var/lib/mfs mount and moosefs-master are all active there.
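A simple way to confirm that the master answers on the VIP (assuming the moosefs-cli package is installed on some client; this check is not in the original):
mfscli -H 192.168.43.100 -SIN ## prints the master's general info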
Next, test the cluster's high availability: stop the cluster on server4 and watch from server1.
On server1:
[root@server1 ~]# crm_mon # monitor
On server4:
[root@server4 ~]# pcs cluster stop server4 # stop the cluster on server4
server4: Stopping Cluster (pacemaker)...
server4: Stopping Cluster (corosync)...
[root@server4 ~]#
At this point everything switches to server1 and server4 goes offline.
Check the service, the VIP and the mount on server1:
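(The output is not shown in the original; the usual checks would be:)
[root@server1 ~]# ip a | grep 192.168.43.100
[root@server1 ~]# df -h | grep /var/lib/mfs
[root@server1 ~]# systemctl status moosefs-master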
And server4? Everything has moved over to server1, and everything on server4 has been stopped.
Add fencing to the cluster
Start the cluster on server4 again:
[root@server4 ~]# pcs cluster start server4
server4: Starting Cluster (corosync)...
server4: Starting Cluster (pacemaker)...
Although server4 is back in the cluster, the resources are all still on server1. Let's install fencing first. The fence device controls server1 and server4: it can power the hosts on and off, and when a host crashes abnormally it reboots it automatically.
[root@server4 ~]# yum install -y fence-virt
[root@server4 ~]# mkdir /etc/cluster # create the directory for the fence key
[root@server1 ~]# yum install -y fence-virt
[root@server1 ~]# mkdir /etc/cluster
Below we use server5 (the machine that hosts the virtual machines) to run the fence daemon.
Install:
[root@server5 ~]# yum install -y fence-virtd.x86_64 fence-virtd-libvirt.x86_64 fence-virtd-multicast.x86_64
Configure it after installation:
[root@server5 ~]# fence_virtd -c
Module search path [/usr/lib64/fence-virt]:
Listener module [multicast]:
Multicast IP Address [225.0.0.12]:
Multicast IP Port [1229]:
Interface [virbr0]: br0 ## accept the defaults for everything else; only the interface needs to be changed
Key File [/etc/cluster/fence_xvm.key]:
Backend module [libvirt]:
Replace /etc/fence_virt.conf with the above [y/N]? y
[root@server5 ~]# mkdir /etc/cluster/ # create the directory
[root@server5 ~]# cd /etc/cluster/
[root@server5 cluster]# dd if=/dev/urandom of=/etc/cluster/fence_xvm.key bs=128 count=1 # generate the key file
1+0 records in
1+0 records out
128 bytes (128 B) copied, 0.000221106 s, 579 kB/s
[root@server5 cluster]# ls
fence_xvm.key
## send the key file to server1 and server4
[root@server5 cluster]# scp fence_xvm.key root@192.168.43.71:/etc/cluster/
root@192.168.43.71's password:
fence_xvm.key
[root@server5 cluster]# scp fence_xvm.key root@192.168.43.74:/etc/cluster/
root@192.168.43.74's password:
fence_xvm.key
Start the service:
[root@server5 cluster]# systemctl start fence_virtd.service
[root@server5 cluster]# netstat -ntlup |grep 1229
udp 0 0 0.0.0.0:1229 0.0.0.0:* 11612/fence_virtd
## port 1229 is open
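Before adding the stonith resource, it is worth checking (not shown in the original) that the cluster nodes can reach fence_virtd over multicast:
[root@server1 ~]# fence_xvm -o list ## should list the virtual machines known to the host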
Add the stonith resource on server1:
[root@server1 ~]# pcs stonith create vmfence fence_xvm pcmk_host_map="server1:node1;server4:node4" op monitor interval=1min # host map: the cluster node name before the colon, the virtual machine (domain) name after it
[root@server1 ~]# pcs property set stonith-enabled=true # enable the stonith device
[root@server1 ~]# crm_verify -L -V # verify the configuration
[root@server1 ~]# pcs status # check
Resource Group: mfsgroup
vip (ocf::heartbeat:IPaddr2): Started server1
mfsdata (ocf::heartbeat:Filesystem): Started server1
mfsd (systemd:moosefs-master): Started server1 # the resources are all on server1
vmfence (stonith:fence_xvm): Started server4 # the fence device runs on server4
Now stop the cluster on server1:
[root@server1 ~]# pcs cluster stop server1
server1: Stopping Cluster (pacemaker)...
server1: Stopping Cluster (corosync)...
The resources switch to server4, and the fence device is also on server4 (it is the only node left).
Now start server1 again:
[root@server1 ~]# pcs cluster start server1
server1: Starting Cluster (corosync)...
server1: Starting Cluster (pacemaker)...
The fence device moves to server1, which shows that fence tends not to run on the same host as the resources.
Now make server4 crash abnormally and see whether fence reboots it:
[root@server4 ~]# echo c > /proc/sysrq-trigger
All the resources switch to server1, and server4 is rebooted. After server4 comes back up, the fence device moves back onto server4.
This is how fencing works: when a host's kernel crashes and it hangs, fence forcibly reboots it so that it releases its resources, preventing resource contention (split brain) in the cluster.