Clusters from Scratch (Part 4)
Convert to Active/Active

Table of Contents
8.1. Requirements
8.2. Install a Cluster Filesystem - GFS2
8.3. Setup Pacemaker-GFS2 Integration
8.3.1. Add the DLM service
8.3.2. Add the GFS2 service
8.4. Create a GFS2 Filesystem
8.4.1. Preparation
8.4.2. Create and Populate a GFS2 Partition
8.5. Reconfigure the Cluster for GFS2
8.6. Reconfigure Pacemaker for Active/Active
8.6.1. Testing Recovery
8.1. Requirements

The primary requirement for an Active/Active cluster is that the data be available, and in sync, on both machines. Pacemaker makes no requirement on how this is achieved; you could use a SAN, but since DRBD supports multiple Primaries, we can use that here as well.

The only hitch is that we need to use a cluster-aware filesystem (the ext4 filesystem we used earlier is not one). Both OCFS2 and GFS2 are supported; here, on Fedora 13, we will use GFS2.
8.2. Install a Cluster Filesystem - GFS2

The first thing to do is install GFS2 on each node:
[root@pcmk-1 ~]# yum install -y gfs2-utils gfs-pcmk
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package gfs-pcmk.x86_64 0:3.0.5-2.fc12 set to be updated
--> Processing Dependency: libSaCkpt.so.3(OPENAIS_CKPT_B.01.01)(64bit) for package: gfs-
pcmk-3.0.5-2.fc12.x86_64
--> Processing Dependency: dlm-pcmk for package: gfs-pcmk-3.0.5-2.fc12.x86_64
--> Processing Dependency: libccs.so.3()(64bit) for package: gfs-pcmk-3.0.5-2.fc12.x86_64
--> Processing Dependency: libdlmcontrol.so.3()(64bit) for package: gfs-pcmk-3.0.5-2.fc12.x86_64
--> Processing Dependency: liblogthread.so.3()(64bit) for package: gfs-pcmk-3.0.5-2.fc12.x86_64
--> Processing Dependency: libSaCkpt.so.3()(64bit) for package: gfs-pcmk-3.0.5-2.fc12.x86_64
---> Package gfs2-utils.x86_64 0:3.0.5-2.fc12 set to be updated
--> Running transaction check
---> Package clusterlib.x86_64 0:3.0.5-2.fc12 set to be updated
---> Package dlm-pcmk.x86_64 0:3.0.5-2.fc12 set to be updated
---> Package openaislib.x86_64 0:1.1.0-1.fc12 set to be updated
--> Finished Dependency Resolution
Dependencies Resolved
===========================================================================================
 Package                Arch               Version                   Repository        Size
===========================================================================================
Installing:
 gfs-pcmk               x86_64             3.0.5-2.fc12              custom           101 k
 gfs2-utils             x86_64             3.0.5-2.fc12              custom           208 k
Installing for dependencies:
 clusterlib             x86_64             3.0.5-2.fc12              custom            65 k
 dlm-pcmk               x86_64             3.0.5-2.fc12              custom            93 k
 openaislib             x86_64             1.1.0-1.fc12              fedora            76 k
Transaction Summary
===========================================================================================
Install       5 Package(s)
Upgrade       0 Package(s)
Total download size: 541 k
Downloading Packages:
(1/5): clusterlib-3.0.5-2.fc12.x86_64.rpm                                |  65 kB     00:00
(2/5): dlm-pcmk-3.0.5-2.fc12.x86_64.rpm                                  |  93 kB     00:00
(3/5): gfs-pcmk-3.0.5-2.fc12.x86_64.rpm                                  | 101 kB     00:00
(4/5): gfs2-utils-3.0.5-2.fc12.x86_64.rpm                                | 208 kB     00:00
(5/5): openaislib-1.1.0-1.fc12.x86_64.rpm                                |  76 kB     00:00
-------------------------------------------------------------------------------------------
Total                                                            992 kB/s | 541 kB     00:00
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing     : clusterlib-3.0.5-2.fc12.x86_64                                       1/5
  Installing     : openaislib-1.1.0-1.fc12.x86_64                                       2/5
  Installing     : dlm-pcmk-3.0.5-2.fc12.x86_64                                         3/5
  Installing     : gfs-pcmk-3.0.5-2.fc12.x86_64                                         4/5
  Installing     : gfs2-utils-3.0.5-2.fc12.x86_64                                       5/5
Installed:
  gfs-pcmk.x86_64 0:3.0.5-2.fc12                     gfs2-utils.x86_64 0:3.0.5-2.fc12
Dependency Installed:
  clusterlib.x86_64 0:3.0.5-2.fc12   dlm-pcmk.x86_64 0:3.0.5-2.fc12
  openaislib.x86_64 0:1.1.0-1.fc12
Complete!
[root@pcmk-1 x86_64]#
Warning
If this step fails, it is likely that your version/distribution does not ship the
"Pacemaker" versions of dlm_controld and/or gfs_controld. Normally these files would
be called dlm_controld.pcmk and gfs_controld.pcmk and live in the /usr/sbin directory.
If you cannot locate an installation source for these files, you will need to install
a package called cman and reconfigure Corosync to use it as outlined in Appendix C, Using
CMAN for Cluster Membership and Quorum.
When using CMAN, you can skip Section 8.3, "Setup Pacemaker-GFS2 Integration", where dlm-clone
and gfs-clone are created, and proceed directly to Section 8.4, "Create a GFS2 Filesystem".
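A quick way to tell which situation you are in is to look for the daemons themselves. This is only a sketch; the paths are the ones named in the warning above, so adjust them for your distribution:

```shell
# Check for the Pacemaker variants of the control daemons.
# The paths below are assumptions taken from the warning above.
missing=0
for f in /usr/sbin/dlm_controld.pcmk /usr/sbin/gfs_controld.pcmk; do
    if [ -x "$f" ]; then
        echo "found:   $f"
    else
        # If either is absent, consider the CMAN route from Appendix C
        echo "missing: $f"
        missing=1
    fi
done
```

If both files are reported as found, continue with the next section as written.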
8.3. Setup Pacemaker-GFS2 Integration

GFS2 needs two services to be running. The first is the user-space interface to the kernel's distributed lock manager (DLM). The DLM is used to coordinate which node(s) can access a given file, and it integrates with Pacemaker to obtain node membership¹ and fencing capabilities.

¹ The list of nodes the cluster considers to be available.

The second service is GFS2's own control daemon, which also integrates with Pacemaker to obtain node membership data.

8.3.1. Add the DLM service

The DLM control daemon needs to run on all active cluster nodes, so we use the shell's interactive mode to create a cloned resource:
[root@pcmk-1 ~]# crm
crm(live)# cib new stack-glue
INFO: stack-glue shadow CIB created
crm(stack-glue)# configure primitive dlm ocf:pacemaker:controld op monitor interval=120s
crm(stack-glue)# configure clone dlm-clone dlm meta interleave=true
crm(stack-glue)# configure show xml
crm(stack-glue)# configure show
node pcmk-1
node pcmk-2
primitive WebData ocf:linbit:drbd \
        params drbd_resource="wwwdata" \
        op monitor interval="60s"
primitive WebFS ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="ext4"
primitive WebSite ocf:heartbeat:apache \
        params configfile="/etc/httpd/conf/httpd.conf" \
        op monitor interval="1min"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.122.101" cidr_netmask="32" \
        op monitor interval="30s"
primitive dlm ocf:pacemaker:controld \
        op monitor interval="120s"
ms WebDataClone WebData \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone dlm-clone dlm \
        meta interleave="true"
location prefer-pcmk-1 WebSite 50: pcmk-1
colocation WebSite-with-WebFS inf: WebSite WebFS
colocation fs_on_drbd inf: WebFS WebDataClone:Master
colocation website-with-ip inf: WebSite ClusterIP
order WebFS-after-WebData inf: WebDataClone:promote WebFS:start
order WebSite-after-WebFS inf: WebFS WebSite
order apache-after-ip inf: ClusterIP WebSite
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"
Note

TODO: Explain the meaning of the interleave option

Review the configuration for errors, then quit the shell and watch the cluster's response:
crm(stack-glue)# cib commit stack-glue
INFO: commited 'stack-glue' shadow CIB to the cluster
crm(stack-glue)# quit
bye
[root@pcmk-1 ~]# crm_mon
============
Last updated: Thu Sep  3 20:49:54 2009
Stack: openais
Current DC: pcmk-2 - partition with quorum
Version: 1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f
2 Nodes configured, 2 expected votes
5 Resources configured.
============
Online: [ pcmk-1 pcmk-2 ]
WebSite (ocf::heartbeat:apache):        Started pcmk-2
Master/Slave Set: WebDataClone
        Masters: [ pcmk-1 ]
        Slaves: [ pcmk-2 ]
ClusterIP        (ocf::heartbeat:IPaddr):        Started pcmk-2
Clone Set: dlm-clone
        Started: [ pcmk-2 pcmk-1 ]
WebFS   (ocf::heartbeat:Filesystem):    Started pcmk-2
8.3.2. Add the GFS2 service

Once the DLM is active, we can add the GFS2 control daemon.

Use the crm shell to create the gfs-control cluster resource:
[root@pcmk-1 ~]# crm
crm(live)# cib new gfs-glue --force
INFO: gfs-glue shadow CIB created
crm(gfs-glue)# configure primitive gfs-control ocf:pacemaker:controld params daemon=gfs_controld.pcmk args="-g 0" op monitor interval=120s
crm(gfs-glue)# configure clone gfs-clone gfs-control meta interleave=true
Now ensure that Pacemaker only starts the gfs-control service on nodes that also have a copy of the dlm service already running:
crm(gfs-glue)# configure colocation gfs-with-dlm INFINITY: gfs-clone dlm-clone
crm(gfs-glue)# configure order start-gfs-after-dlm mandatory: dlm-clone gfs-clone
Review the configuration for errors, then quit the shell and watch the cluster's response:
crm(gfs-glue)# configure show
node pcmk-1
node pcmk-2
primitive WebData ocf:linbit:drbd \
        params drbd_resource="wwwdata" \
        op monitor interval="60s"
primitive WebFS ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="ext4"
primitive WebSite ocf:heartbeat:apache \
        params configfile="/etc/httpd/conf/httpd.conf" \
        op monitor interval="1min"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.122.101" cidr_netmask="32" \
        op monitor interval="30s"
primitive dlm ocf:pacemaker:controld \
        op monitor interval="120s"
primitive gfs-control ocf:pacemaker:controld \
        params daemon="gfs_controld.pcmk" args="-g 0" \
        op monitor interval="120s"
ms WebDataClone WebData \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone dlm-clone dlm \
        meta interleave="true"
clone gfs-clone gfs-control \
        meta interleave="true"
location prefer-pcmk-1 WebSite 50: pcmk-1
colocation WebSite-with-WebFS inf: WebSite WebFS
colocation fs_on_drbd inf: WebFS WebDataClone:Master
colocation gfs-with-dlm inf: gfs-clone dlm-clone
colocation website-with-ip inf: WebSite ClusterIP
order WebFS-after-WebData inf: WebDataClone:promote WebFS:start
order WebSite-after-WebFS inf: WebFS WebSite
order apache-after-ip inf: ClusterIP WebSite
order start-gfs-after-dlm inf: dlm-clone gfs-clone
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"
crm(gfs-glue)# cib commit gfs-glue
INFO: commited 'gfs-glue' shadow CIB to the cluster
crm(gfs-glue)# quit
bye
[root@pcmk-1 ~]# crm_mon
============
Last updated: Thu Sep  3 20:49:54 2009
Stack: openais
Current DC: pcmk-2 - partition with quorum
Version: 1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f
2 Nodes configured, 2 expected votes
6 Resources configured.
============
Online: [ pcmk-1 pcmk-2 ]
WebSite (ocf::heartbeat:apache):        Started pcmk-2
Master/Slave Set: WebDataClone
        Masters: [ pcmk-1 ]
        Slaves: [ pcmk-2 ]
ClusterIP        (ocf::heartbeat:IPaddr):        Started pcmk-2
Clone Set: dlm-clone
        Started: [ pcmk-2 pcmk-1 ]
Clone Set: gfs-clone
        Started: [ pcmk-2 pcmk-1 ]
WebFS   (ocf::heartbeat:Filesystem):    Started pcmk-1
8.4. Create a GFS2 Filesystem

8.4.1. Preparation

Before we do anything to the existing partition, we need to make sure it is unmounted. We do this by telling the cluster to stop the WebFS resource. This ensures that any other resources that use WebFS are not only stopped, but stopped in the correct order.
[root@pcmk-1 ~]# crm_resource --resource WebFS --set-parameter target-role --meta --parameter-value Stopped
[root@pcmk-1 ~]# crm_mon
============
Last updated: Thu Sep  3 15:18:06 2009
Stack: openais
Current DC: pcmk-1 - partition with quorum
Version: 1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f
2 Nodes configured, 2 expected votes
6 Resources configured.
============
Online: [ pcmk-1 pcmk-2 ]
Master/Slave Set: WebDataClone
        Masters: [ pcmk-1 ]
        Slaves: [ pcmk-2 ]
ClusterIP        (ocf::heartbeat:IPaddr):        Started pcmk-1
Clone Set: dlm-clone
        Started: [ pcmk-2 pcmk-1 ]
Clone Set: gfs-clone
        Started: [ pcmk-2 pcmk-1 ]
Note

Note that both Apache and WebFS have been stopped.
8.4.2. Create and Populate a GFS2 Partition

Now that the cluster stack and integration pieces are running smoothly, we can create a GFS2 partition.

Warning

This will erase all previous content stored on the DRBD device. Ensure you have a copy of any important data.

We need to specify a number of additional parameters when creating the GFS2 partition.

First we must use the -p option to specify that we want to use the kernel's DLM. Next we use -j to indicate that the filesystem should reserve enough space for two journals (one for each node that will access the filesystem).

Lastly, we use -t to specify the lock table name. The format for this field is clustername:fsname. For the fsname we just need something unique and descriptive, and since we haven't changed the cluster name, we use the default (pcmk).

To specify an alternate name for the cluster, locate the configuration section containing name: pacemaker and add the following option as shown below:
clustername: myname
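In a typical corosync.conf this option sits inside the service block that loads Pacemaker. The following is only a sketch of the surrounding context (the ver value and exact layout are assumptions; check your own file):

```
service {
        # the block that loads the Pacemaker plugin
        name: pacemaker
        ver: 0
        # our addition: override the default cluster name
        clustername: myname
}
```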
Then format the device. This only needs to be done once, from whichever node currently has the DRBD device in the Primary role:
[root@pcmk-1 ~]# mkfs.gfs2 -p lock_dlm -j 2 -t pcmk:web /dev/drbd1
This will destroy any data on /dev/drbd1.
It appears to contain: data
Are you sure you want to proceed? [y/n] y
Device:                    /dev/drbd1
Blocksize:                 4096
Device Size                1.00 GB (131072 blocks)
Filesystem Size:           1.00 GB (131070 blocks)
Journals:                  2
Resource Groups:           2
Locking Protocol:          "lock_dlm"
Lock Table:                "pcmk:web"
UUID:                      6B776F46-177B-BAF8-2C2B-292C0E078613
[root@pcmk-1 ~]#
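As an aside, the value passed to -t (echoed back above as the Lock Table) is nothing exotic: it is simply the cluster name and the filesystem name joined by a colon. A sketch using this guide's names:

```shell
# Build the lock table name that mkfs.gfs2 expects for its -t option
# (format: clustername:fsname). "pcmk" is the default cluster name and
# "web" is the filesystem name chosen above.
clustername=pcmk
fsname=web
locktable="${clustername}:${fsname}"
echo "$locktable"
```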
Then we (re)populate the new filesystem with data (web pages). For now we'll create a new variation on our home page:
[root@pcmk-1 ~]# mount /dev/drbd1 /mnt/
[root@pcmk-1 ~]# cat <<-END >/mnt/index.html
<html>
<body>My Test Site - GFS2</body>
</html>
END
[root@pcmk-1 ~]# umount /dev/drbd1
[root@pcmk-1 ~]# drbdadm verify wwwdata
[root@pcmk-1 ~]#
8.5. Reconfigure the Cluster for GFS2
[root@pcmk-1 ~]# crm
crm(live)# cib new GFS2
INFO: GFS2 shadow CIB created
crm(GFS2)# configure delete WebFS
crm(GFS2)# configure primitive WebFS ocf:heartbeat:Filesystem params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="gfs2"
Now that we've recreated the resource, we also need to recreate all the constraints that referenced it, since the shell automatically removes any constraints involving WebFS when the resource is deleted:
crm(GFS2)# configure colocation WebSite-with-WebFS inf: WebSite WebFS
crm(GFS2)# configure colocation fs_on_drbd inf: WebFS WebDataClone:Master
crm(GFS2)# configure order WebFS-after-WebData inf: WebDataClone:promote WebFS:start
crm(GFS2)# configure order WebSite-after-WebFS inf: WebFS WebSite
crm(GFS2)# configure colocation WebFS-with-gfs-control INFINITY: WebFS gfs-clone
crm(GFS2)# configure order start-WebFS-after-gfs-control mandatory: gfs-clone WebFS
crm(GFS2)# configure show
node pcmk-1
node pcmk-2
primitive WebData ocf:linbit:drbd \
        params drbd_resource="wwwdata" \
        op monitor interval="60s"
primitive WebFS ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="gfs2"
primitive WebSite ocf:heartbeat:apache \
        params configfile="/etc/httpd/conf/httpd.conf" \
        op monitor interval="1min"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.122.101" cidr_netmask="32" \
        op monitor interval="30s"
primitive dlm ocf:pacemaker:controld \
        op monitor interval="120s"
primitive gfs-control ocf:pacemaker:controld \
        params daemon="gfs_controld.pcmk" args="-g 0" \
        op monitor interval="120s"
ms WebDataClone WebData \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone dlm-clone dlm \
        meta interleave="true"
clone gfs-clone gfs-control \
        meta interleave="true"
colocation WebFS-with-gfs-control inf: WebFS gfs-clone
colocation WebSite-with-WebFS inf: WebSite WebFS
colocation fs_on_drbd inf: WebFS WebDataClone:Master
colocation gfs-with-dlm inf: gfs-clone dlm-clone
colocation website-with-ip inf: WebSite ClusterIP
order WebFS-after-WebData inf: WebDataClone:promote WebFS:start
order WebSite-after-WebFS inf: WebFS WebSite
order apache-after-ip inf: ClusterIP WebSite
order start-WebFS-after-gfs-control inf: gfs-clone WebFS
order start-gfs-after-dlm inf: dlm-clone gfs-clone
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"
Review the configuration for errors, then quit the shell and watch the cluster's response:
crm(GFS2)# cib commit GFS2
INFO: commited 'GFS2' shadow CIB to the cluster
crm(GFS2)# quit
bye
[root@pcmk-1 ~]# crm_mon
============
Last updated: Thu Sep  3 20:49:54 2009
Stack: openais
Current DC: pcmk-2 - partition with quorum
Version: 1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f
2 Nodes configured, 2 expected votes
6 Resources configured.
============
Online: [ pcmk-1 pcmk-2 ]
WebSite (ocf::heartbeat:apache):        Started pcmk-2
Master/Slave Set: WebDataClone
        Masters: [ pcmk-1 ]
        Slaves: [ pcmk-2 ]
ClusterIP        (ocf::heartbeat:IPaddr):        Started pcmk-2
Clone Set: dlm-clone
        Started: [ pcmk-2 pcmk-1 ]
Clone Set: gfs-clone
        Started: [ pcmk-2 pcmk-1 ]
WebFS   (ocf::heartbeat:Filesystem):    Started pcmk-1
8.6. Reconfigure Pacemaker for Active/Active

Almost everything is in place. Recent versions of DRBD are capable of operating in Primary/Primary mode, and the filesystem we're now using is cluster-aware. All we need to do is reconfigure the cluster to take advantage of this.

This is going to involve a number of changes, so we'll again use interactive mode:
[root@pcmk-1 ~]# crm
[root@pcmk-1 ~]# cib new active
There's no point making the services active in both locations if we can't reach them, so let's first clone the IP address. Cloned IPaddr2 resources use an iptables rule to ensure that each request is only processed by one of the clone instances. The additional meta options tell the cluster how many instances of the clone we want (one "request bucket" per node), and that if one node fails, the remaining node should handle them all. Otherwise those requests would simply be discarded.
[root@pcmk-1 ~]# configure clone WebIP ClusterIP \
        meta globally-unique="true" clone-max="2" clone-node-max="2"
Now we must tell the cluster how to decide which requests are handled by which node. To do this, we set the clusterip_hash parameter.

Open the ClusterIP resource:

[root@pcmk-1 ~]# configure edit ClusterIP

And add the following to the params line:

clusterip_hash="sourceip"

So that the complete definition looks like:
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.122.101" cidr_netmask="32" clusterip_hash="sourceip" \
        op monitor interval="30s"
Here is the full transcript:
[root@pcmk-1 ~]# crm
crm(live)# cib new active
INFO: active shadow CIB created
crm(active)# configure clone WebIP ClusterIP \
        meta globally-unique="true" clone-max="2" clone-node-max="2"
crm(active)# configure show
node pcmk-1
node pcmk-2
primitive WebData ocf:linbit:drbd \
        params drbd_resource="wwwdata" \
        op monitor interval="60s"
primitive WebFS ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="gfs2"
primitive WebSite ocf:heartbeat:apache \
        params configfile="/etc/httpd/conf/httpd.conf" \
        op monitor interval="1min"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.122.101" cidr_netmask="32" clusterip_hash="sourceip" \
        op monitor interval="30s"
primitive dlm ocf:pacemaker:controld \
        op monitor interval="120s"
primitive gfs-control ocf:pacemaker:controld \
        params daemon="gfs_controld.pcmk" args="-g 0" \
        op monitor interval="120s"
ms WebDataClone WebData \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone WebIP ClusterIP \
        meta globally-unique="true" clone-max="2" clone-node-max="2"
clone dlm-clone dlm \
        meta interleave="true"
clone gfs-clone gfs-control \
        meta interleave="true"
colocation WebFS-with-gfs-control inf: WebFS gfs-clone
colocation WebSite-with-WebFS inf: WebSite WebFS
colocation fs_on_drbd inf: WebFS WebDataClone:Master
colocation gfs-with-dlm inf: gfs-clone dlm-clone
colocation website-with-ip inf: WebSite WebIP
order WebFS-after-WebData inf: WebDataClone:promote WebFS:start
order WebSite-after-WebFS inf: WebFS WebSite
order apache-after-ip inf: WebIP WebSite
order start-WebFS-after-gfs-control inf: gfs-clone WebFS
order start-gfs-after-dlm inf: dlm-clone gfs-clone
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"
Notice how any constraints that referenced ClusterIP have been updated to use WebIP instead. This is an additional benefit of using the crm shell.

Next we need to convert the filesystem and Apache resources into clones. Again, the shell will automatically update any relevant constraints.

crm(active)# configure clone WebFSClone WebFS
crm(active)# configure clone WebSiteClone WebSite

The last step is to tell the cluster that it is now allowed to promote both instances to be Primary (aka. Master):

crm(active)# configure edit WebDataClone

Change master-max to 2:
crm(active)# configure show
node pcmk-1
node pcmk-2
primitive WebData ocf:linbit:drbd \
        params drbd_resource="wwwdata" \
        op monitor interval="60s"
primitive WebFS ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="gfs2"
primitive WebSite ocf:heartbeat:apache \
        params configfile="/etc/httpd/conf/httpd.conf" \
        op monitor interval="1min"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.122.101" cidr_netmask="32" clusterip_hash="sourceip" \
        op monitor interval="30s"
primitive dlm ocf:pacemaker:controld \
        op monitor interval="120s"
primitive gfs-control ocf:pacemaker:controld \
        params daemon="gfs_controld.pcmk" args="-g 0" \
        op monitor interval="120s"
ms WebDataClone WebData \
        meta master-max="2" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone WebFSClone WebFS
clone WebIP ClusterIP \
        meta globally-unique="true" clone-max="2" clone-node-max="2"
clone WebSiteClone WebSite
clone dlm-clone dlm \
        meta interleave="true"
clone gfs-clone gfs-control \
        meta interleave="true"
colocation WebFS-with-gfs-control inf: WebFSClone gfs-clone
colocation WebSite-with-WebFS inf: WebSiteClone WebFSClone
colocation fs_on_drbd inf: WebFSClone WebDataClone:Master
colocation gfs-with-dlm inf: gfs-clone dlm-clone
colocation website-with-ip inf: WebSiteClone WebIP
order WebFS-after-WebData inf: WebDataClone:promote WebFSClone:start
order WebSite-after-WebFS inf: WebFSClone WebSiteClone
order apache-after-ip inf: WebIP WebSiteClone
order start-WebFS-after-gfs-control inf: gfs-clone WebFSClone
order start-gfs-after-dlm inf: dlm-clone gfs-clone
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"
Review the configuration for errors, then quit the shell and watch the cluster's response:
crm(active)# cib commit active
INFO: commited 'active' shadow CIB to the cluster
crm(active)# quit
bye
[root@pcmk-1 ~]# crm_mon
============
Last updated: Thu Sep  3 21:37:27 2009
Stack: openais
Current DC: pcmk-2 - partition with quorum
Version: 1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f
2 Nodes configured, 2 expected votes
6 Resources configured.
============
Online: [ pcmk-1 pcmk-2 ]
Master/Slave Set: WebDataClone
        Masters: [ pcmk-1 pcmk-2 ]
Clone Set: dlm-clone
        Started: [ pcmk-2 pcmk-1 ]
Clone Set: gfs-clone
        Started: [ pcmk-2 pcmk-1 ]
Clone Set: WebIP
        Started: [ pcmk-1 pcmk-2 ]
Clone Set: WebFSClone
        Started: [ pcmk-1 pcmk-2 ]
Clone Set: WebSiteClone
        Started: [ pcmk-1 pcmk-2 ]
8.6.1. Testing Recovery

Note

TODO: Put one node into standby to demonstrate failover

Chapter 9. Configure STONITH
Table of Contents
9.1. Why You Need STONITH
9.2. What STONITH Device Should You Use
9.3. Configuring STONITH
9.3.1. Example
9.1. Why You Need STONITH

STONITH is an acronym for Shoot-The-Other-Node-In-The-Head. It protects your data from being corrupted by rogue nodes or unintended concurrent access.

Just because a node is unresponsive doesn't mean it has stopped accessing your data. The only way to be 100% sure your data is safe is to use STONITH, so we can be certain a node is truly offline before allowing the data to be accessed from another node.

STONITH also has a role to play in the event that a clustered service cannot be stopped. In that case, the cluster uses STONITH to force the node offline, thereby making it safe to start the service elsewhere.
9.2. What STONITH Device Should You Use

It is crucial that the STONITH device allows the cluster to differentiate between a node failure and a network failure.

A common mistake is to choose a remote power switch (such as the IPMI controller built into many motherboards) as the STONITH device. In that case, the cluster cannot tell whether the node is really offline or simply unreachable over the network.

Likewise, any device that relies on the machine being active (such as the SSH-based "devices" used during testing) is inappropriate.
9.3. Configuring STONITH

1. Find the correct driver: stonith -L

2. Since every device is different, the parameters needed to configure it will also differ. To
   find out which parameters the device requires: stonith -t {type} -n

   Hopefully the developers chose names that make sense; if not, you can query for some
   additional information by finding an active cluster node and running:

   lrmadmin -M stonith {type} pacemaker

   The output should be XML-formatted text containing additional parameter descriptions.

3. Create a file called stonith.xml containing a primitive resource with a class of stonith, a
   type of {type}, and a parameter for each of the values returned in step 2.

4. If the device can shoot multiple nodes and supports multiple simultaneous connections, then
   create a clone from the primitive resource.

5. Upload it into the CIB using cibadmin: cibadmin -C -o resources --xml-file stonith.xml
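For step 3, stonith.xml might look something like the following sketch. The resource id and the particular nvpairs shown are placeholders; the real parameter list is whatever step 2 reported for your device:

```xml
<!-- A sketch of stonith.xml; substitute your own device type and values -->
<primitive id="rsa-fencing" class="stonith" type="external/ibmrsa">
  <instance_attributes id="rsa-fencing-params">
    <!-- one nvpair per parameter reported by "stonith -t {type} -n" -->
    <nvpair id="rsa-fencing-hostname" name="hostname" value="pcmk-1 pcmk-2"/>
    <nvpair id="rsa-fencing-ipaddr" name="ipaddr" value="192.168.122.31"/>
  </instance_attributes>
</primitive>
```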
9.3.1. Example

Assuming we have an IBM BladeCenter containing our two nodes, and that the management interface is active on 192.168.122.31, we would choose the external/ibmrsa driver and obtain the following list of parameters:

[root@pcmk-1 ~]# stonith -t external/ibmrsa -n
hostname  ipaddr  userid  passwd  type

Assuming we know the username and password for the management interface, we would create a STONITH resource with the shell:
[root@pcmk-1 ~]# crm
crm(live)# cib new stonith
INFO: stonith shadow CIB created
crm(stonith)# configure primitive rsa-fencing stonith::external/ibmrsa \
        params hostname="pcmk-1 pcmk-2" ipaddr=192.168.122.31 userid=mgmt passwd=abc123 type=ibm \
        op monitor interval="60s"
crm(stonith)# configure clone Fencing rsa-fencing
And finally, since we disabled it earlier, we must re-enable STONITH:
crm(stonith)# configure property stonith-enabled="true"
crm(stonith)# configure show
node pcmk-1
node pcmk-2
primitive WebData ocf:linbit:drbd \
        params drbd_resource="wwwdata" \
        op monitor interval="60s"
primitive WebFS ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="gfs2"
primitive WebSite ocf:heartbeat:apache \
        params configfile="/etc/httpd/conf/httpd.conf" \
        op monitor interval="1min"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.122.101" cidr_netmask="32" clusterip_hash="sourceip" \
        op monitor interval="30s"
primitive dlm ocf:pacemaker:controld \
        op monitor interval="120s"
primitive gfs-control ocf:pacemaker:controld \
        params daemon="gfs_controld.pcmk" args="-g 0" \
        op monitor interval="120s"
primitive rsa-fencing stonith::external/ibmrsa \
        params hostname="pcmk-1 pcmk-2" ipaddr=192.168.122.31 userid=mgmt passwd=abc123 type=ibm \
        op monitor interval="60s"
ms WebDataClone WebData \
        meta master-max="2" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone Fencing rsa-fencing
clone WebFSClone WebFS
clone WebIP ClusterIP \
        meta globally-unique="true" clone-max="2" clone-node-max="2"
clone WebSiteClone WebSite
clone dlm-clone dlm \
        meta interleave="true"
clone gfs-clone gfs-control \
        meta interleave="true"
colocation WebFS-with-gfs-control inf: WebFSClone gfs-clone
colocation WebSite-with-WebFS inf: WebSiteClone WebFSClone
colocation fs_on_drbd inf: WebFSClone WebDataClone:Master
colocation gfs-with-dlm inf: gfs-clone dlm-clone
colocation website-with-ip inf: WebSiteClone WebIP
order WebFS-after-WebData inf: WebDataClone:promote WebFSClone:start
order WebSite-after-WebFS inf: WebFSClone WebSiteClone
order apache-after-ip inf: WebIP WebSiteClone
order start-WebFS-after-gfs-control inf: gfs-clone WebFSClone
order start-gfs-after-dlm inf: dlm-clone gfs-clone
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="true" \
        no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"
Appendix A. Configuration Recap

Table of Contents
A.1. Final Cluster Configuration
A.2. Node List
A.3. Cluster Options
A.4. Resources
A.4.1. Default Options
A.4.2. Fencing
A.4.3. Service Address
A.4.4. Distributed Lock Manager
A.4.5. GFS Control Daemon
A.4.6. DRBD - Shared Storage
A.4.7. Cluster Filesystem
A.4.8. Apache

A.1. Final Cluster Configuration
[root@pcmk-1 ~]# crm configure show
node pcmk-1
node pcmk-2
primitive WebData ocf:linbit:drbd \
        params drbd_resource="wwwdata" \
        op monitor interval="60s"
primitive WebFS ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="gfs2"
primitive WebSite ocf:heartbeat:apache \
        params configfile="/etc/httpd/conf/httpd.conf" \
        op monitor interval="1min"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.122.101" cidr_netmask="32" clusterip_hash="sourceip" \
        op monitor interval="30s"
primitive dlm ocf:pacemaker:controld \
        op monitor interval="120s"
primitive gfs-control ocf:pacemaker:controld \
        params daemon="gfs_controld.pcmk" args="-g 0" \
        op monitor interval="120s"
primitive rsa-fencing stonith::external/ibmrsa \
        params hostname="pcmk-1 pcmk-2" ipaddr=192.168.122.31 userid=mgmt passwd=abc123 type=ibm \
        op monitor interval="60s"
ms WebDataClone WebData \
        meta master-max="2" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone Fencing rsa-fencing
clone WebFSClone WebFS
clone WebIP ClusterIP \
        meta globally-unique="true" clone-max="2" clone-node-max="2"
clone WebSiteClone WebSite
clone dlm-clone dlm \
        meta interleave="true"
clone gfs-clone gfs-control \
        meta interleave="true"
colocation WebFS-with-gfs-control inf: WebFSClone gfs-clone
colocation WebSite-with-WebFS inf: WebSiteClone WebFSClone
colocation fs_on_drbd inf: WebFSClone WebDataClone:Master
colocation gfs-with-dlm inf: gfs-clone dlm-clone
colocation website-with-ip inf: WebSiteClone WebIP
order WebFS-after-WebData inf: WebDataClone:promote WebFSClone:start
order WebSite-after-WebFS inf: WebFSClone WebSiteClone
order apache-after-ip inf: WebIP WebSiteClone
order start-WebFS-after-gfs-control inf: gfs-clone WebFSClone
order start-gfs-after-dlm inf: dlm-clone gfs-clone
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="true" \
        no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"
A.2. Node List

The list of cluster nodes is automatically populated by the cluster.

node pcmk-1
node pcmk-2
A.3. Cluster Options

This is where the cluster automatically stores some information about the cluster:

1. dc-version - the version of Pacemaker used on the DC (including the source-code hash)

2. cluster-infrastructure - the cluster infrastructure being used (heartbeat or openais/corosync)

3. expected-quorum-votes - the maximum number of cluster members expected

and where the admin can set options that control the way the cluster operates:

1. stonith-enabled=true - make use of STONITH

2. no-quorum-policy=ignore - ignore loss of quorum and continue to host resources

property $id="cib-bootstrap-options" \
        dc-version="1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="true" \
        no-quorum-policy="ignore"
A.4. Resources

A.4.1. Default Options

Here we configure cluster options that apply to every resource:

1. resource-stickiness - how strongly a resource prefers to stay where it is

rsc_defaults $id="rsc-options" \
        resource-stickiness="100"
A.4.2. Fencing

Note

TODO: Add text here

primitive rsa-fencing stonith::external/ibmrsa \
        params hostname="pcmk-1 pcmk-2" ipaddr=192.168.122.31 userid=mgmt passwd=abc123 type=ibm \
        op monitor interval="60s"
clone Fencing rsa-fencing
A.4.3. Service Address

Users of the services provided by the cluster need an unchanging address with which to access them. Additionally, we cloned the address so that it is active on both nodes. An iptables rule (created as part of the resource agent) ensures that each request is only processed by one of the two clone instances. The additional meta options tell the cluster that we want two instances of the clone (one "request bucket" per node), and that if one node fails, the remaining node should handle both.

primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.122.101" cidr_netmask="32" clusterip_hash="sourceip" \
        op monitor interval="30s"
clone WebIP ClusterIP \
        meta globally-unique="true" clone-max="2" clone-node-max="2"
Note
TODO: The RA should check for globally-unique=true when cloned
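For reference, the kind of rule the IPaddr2 agent installs uses iptables' CLUSTERIP target. The following is an illustrative sketch only, not the exact command the agent runs; in particular, the agent manages the rule and the multicast MAC itself, and the MAC shown here is a made-up example:

```
iptables -I INPUT -d 192.168.122.101 -i eth0 -j CLUSTERIP --new \
        --hashmode sourceip --clustermac 01:00:5E:00:00:01 \
        --total-nodes 2 --local-node 1
```

On the other node the same rule is added with --local-node 2; the kernel then hashes each client's source address to decide which node answers a given request.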
A.4.4. Distributed Lock Controller
Cluster filesystems such as GFS2 require a lock manager. This service starts the daemon that provides user-space applications (such as the GFS2 daemon) with access to the in-kernel lock manager. Since we need it to run on every active node in the cluster, we make it a clone.
primitive dlm ocf:pacemaker:controld \
        op monitor interval="120s"
clone dlm-clone dlm \
        meta interleave="true"
Note
TODO: Confirm interleave is no longer needed
A.4.5. GFS Control Daemon
GFS2 also needs a user-space/kernel bridge running on every node, so here we have another clone. This time, however, we must also specify that it can only run on machines that are also running the DLM (a colocation constraint) and that it can only be started after the DLM (an order constraint). Additionally, the gfs-control clone should only care about the DLM instances it is paired with, so we also set the interleave option.
primitive gfs-control ocf:pacemaker:controld \
        params daemon="gfs_controld.pcmk" args="-g 0" \
        op monitor interval="120s"
clone gfs-clone gfs-control \
? ? ? ? meta interleave="true"
colocation gfs-with-dlm inf: gfs-clone dlm-clone
order start-gfs-after-dlm inf: dlm-clone gfs-clone
A.4.6. DRBD - Shared Storage
Here we define the DRBD service and specify which DRBD resource (from drbd.conf) it should manage. We configure it as a master/slave resource and, in order to be active/active, allow both instances to be promoted to master by setting master-max=2. We also set the notify option so that the cluster will tell the DRBD agent when a peer node's state changes.
primitive WebData ocf:linbit:drbd \
        params drbd_resource="wwwdata" \
        op monitor interval="60s"
ms WebDataClone WebData \
        meta master-max="2" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
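Note that master-max=2 is only half of the story: the DRBD resource itself must also permit dual-primary operation. In the drbd.conf syntax of this era, that means a fragment roughly like the following for the wwwdata resource (shown only as a reminder of the DRBD configuration step; the rest of the resource definition is elided):

```
resource wwwdata {
        net {
                allow-two-primaries;
        }
        ...
}
```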
A.4.7. Cluster Filesystem
The cluster filesystem ensures that files are read and written correctly. We need to specify the block device (provided by DRBD) that we want mounted and that we are using GFS2. Again this is a clone, because it is intended to be active on both nodes. The additional constraints ensure that it runs only on nodes with active gfs-control and drbd instances.
primitive WebFS ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="gfs2"
clone WebFSClone WebFS
colocation WebFS-with-gfs-control inf: WebFSClone gfs-clone
colocation fs_on_drbd inf: WebFSClone WebDataClone:Master
order WebFS-after-WebData inf: WebDataClone:promote WebFSClone:start
order start-WebFS-after-gfs-control inf: gfs-clone WebFSClone
A.4.8. Apache
Lastly we have the actual service, Apache. All we need to do is tell the cluster where to find its main configuration file and restrict it to nodes that have the filesystem mounted and an available IP address.
primitive WebSite ocf:heartbeat:apache \
        params configfile="/etc/httpd/conf/httpd.conf" \
        op monitor interval="1min"
clone WebSiteClone WebSite
colocation WebSite-with-WebFS inf: WebSiteClone WebFSClone
colocation website-with-ip inf: WebSiteClone WebIP
order apache-after-ip inf: WebIP WebSiteClone
order WebSite-after-WebFS inf: WebFSClone WebSiteClone
Appendix B. Sample Corosync Configuration
Example B.1. Sample corosync.conf for a two-node cluster
# Please read the Corosync.conf.5 manual page
compatibility: whitetank
totem {
        version: 2

        # How long before declaring a token lost (ms)
        token:          5000

        # How many token retransmits before forming a new configuration
        token_retransmits_before_loss_const: 10

        # How long to wait for join messages in the membership protocol (ms)
        join:           1000

        # How long to wait for consensus to be achieved before starting a new
        # round of membership configuration (ms)
        consensus:      6000

        # Turn off the virtual synchrony filter
        vsftype:        none

        # Number of messages that may be sent by one processor on receipt of the token
        max_messages:   20

        # Stagger sending the node join messages by 1..send_join ms
        send_join: 45

        # Limit generated nodeids to 31-bits (positive signed integers)
        clear_node_high_bit: yes

        # Disable encryption
        secauth:        off

        # How many threads to use for encryption/decryption
        threads:        0

        # Optionally assign a fixed node id (integer)
        # nodeid:       1234

        interface {
                ringnumber: 0
                # The following values need to be set based on your environment
                bindnetaddr: 192.168.122.0
                mcastaddr: 226.94.1.1
                mcastport: 4000
        }
}
logging {
        debug: off
        fileline: off
        to_syslog: yes
        to_stderr: off
        syslog_facility: daemon
        timestamp: on
}

amf {
        mode: disabled
}
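The token and consensus values in the sample above are related: corosync's documented default is consensus = 1.2 × token, and the sample keeps exactly that ratio (5000 ms × 1.2 = 6000 ms). A minimal sketch of the arithmetic, with the values copied from the sample:

```python
# Totem timing values from the sample corosync.conf above.
token_ms = 5000       # totem.token
consensus_ms = 6000   # totem.consensus

# corosync's documented default is consensus = 1.2 * token;
# dropping consensus below that is generally unsafe.
minimum_ms = 1.2 * token_ms
assert consensus_ms >= minimum_ms, "consensus should be >= 1.2 * token"
print("consensus/token ratio:", consensus_ms / token_ms)
```

If you raise token (for example, on a congested network), remember to scale consensus with it.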
Appendix C. Using CMAN for Cluster Membership and Quorum
Table of Contents
C.1. Background ...................................................................... 79
C.2. Adding CMAN Support ............................................................ 79
C.2.1. Adding CMAN Support - cluster.conf ....................................... 79
C.2.2. Adding CMAN Support - corosync.conf ...................................... 80
C.1. Background
CMAN v3 [1] is a Corosync plugin that monitors the names and number of active cluster nodes in order to deliver membership and quorum information to clients (such as the Pacemaker daemons).
In a traditional Corosync-Pacemaker cluster, a Pacemaker plugin is loaded to provide membership and quorum information. The motivation for wanting to use CMAN for this instead is to ensure all elements of the cluster stack are making decisions based on the same membership and quorum data. [2]
CMAN has been around longer than Pacemaker and is part of the Red Hat cluster stack, so
it is available and supported by many distributions and other pieces of software (such as
OCFS2 and GFS2). For this reason it makes sense to support it.
C.2. Adding CMAN Support
Warning
Be sure to disable the Pacemaker plugin before continuing with this section. In most
cases, this can be achieved by removing /etc/corosync/service.d/pcmk and stopping
Corosync.
C.2.1. Adding CMAN Support - cluster.conf
The preferred approach for enabling CMAN is to configure cluster.conf and use the /etc/init.d/cman script to start Corosync. It is far easier to maintain and automatically starts the pieces necessary for using GFS2.
You can find some documentation on Installing CMAN and Creating a Basic Cluster Configuration File [3] at the Red Hat website. However, please ignore the parts about Fencing, Failover Domains, or HA Services and anything to do with rgmanager and fenced. All of these continue to be handled by Pacemaker in the normal manner.

[1] http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html-single/Cluster_Suite_Overview/index.html#s2-clumembership-overview-CSO
[2] A failure to do this can lead to what is called internal split-brain - a situation where different parts of the stack disagree about whether some nodes are alive or dead - which quickly leads to unnecessary down-time and/or data corruption.
[3] http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/s1-creating-cluster-cli-CA.html
Example C.1. Sample cluster.conf for a two-node cluster
<?xml version="1.0"?>
<cluster config_version="1" name="beekhof">
<fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
<clusternodes>
<clusternode name="pcmk-1" nodeid="1">
<fence/>
</clusternode>
<clusternode name="pcmk-2" nodeid="2">
<fence/>
</clusternode>
</clusternodes>
<cman/>
<fencedevices/>
<rm/>
</cluster>
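As a quick, illustrative sanity check (this is not a replacement for the cluster stack's own validation tooling), a cluster.conf like the example above can be parsed with any XML library to confirm that it is well-formed and that every clusternode carries a unique nodeid:

```python
# Parse a two-node cluster.conf (the example above, trimmed) and verify
# that node names and ids are present and unique.
import xml.etree.ElementTree as ET

cluster_conf = """<?xml version="1.0"?>
<cluster config_version="1" name="beekhof">
  <clusternodes>
    <clusternode name="pcmk-1" nodeid="1"><fence/></clusternode>
    <clusternode name="pcmk-2" nodeid="2"><fence/></clusternode>
  </clusternodes>
  <cman/>
  <fencedevices/>
  <rm/>
</cluster>"""

root = ET.fromstring(cluster_conf)
nodes = root.findall("./clusternodes/clusternode")
nodeids = [n.get("nodeid") for n in nodes]
assert len(nodes) == 2 and len(set(nodeids)) == 2, "expected two unique nodeids"
print("cluster", root.get("name"), "with nodes", [n.get("name") for n in nodes])
```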
C.2.2. Adding CMAN Support - corosync.conf
The alternative is to add the necessary cman configuration elements to corosync.conf. We
recommend you place these directives in /etc/corosync/service.d/cman as they will differ
between machines.
If you choose this approach, you would continue to start and stop Corosync with its init script as previously described in this document.
Example C.2. Sample corosync.conf extensions for a two-node cluster
[root@pcmk-1 ~]# cat <<-END >>/etc/corosync/service.d/cman
cluster {
name: beekhof
clusternodes {
clusternode {
votes: 1
nodeid: 1
name: pcmk-1
}
clusternode {
votes: 1
nodeid: 2
name: pcmk-2
}
}
cman {
expected_votes: 2
cluster_id: 123
nodename: `uname -n`
two_node: 1
max_queued: 10
}
}
service {
name: corosync_cman
ver: 0
}
quorum {
provider: quorum_cman
}
END
Warning
Verify that nodename was set appropriately on each host.
Appendix D. Further Reading
Project Website
http://www.clusterlabs.org
Cluster Commands
A comprehensive guide to cluster commands, written by Novell, can be found at:
http://www.novell.com/documentation/sles11/book_sleha/index.html?page=/documentation/sles11/book_sleha/data/book_sleha.html
Corosync
http://www.corosync.org
Appendix E. Revision History
Revision 1  Mon May 17 2010  Andrew Beekhof  [email protected]
Import from Pages.app
Index
F
feedback
    contact information for this manual, xi