Clusters from Scratch (Part 4)

Converting to Active/Active

Table of Contents

8.1. Requirements
8.2. Install a Cluster Filesystem - GFS2
8.3. Setup Pacemaker-GFS2 Integration
8.3.1. Add the DLM service
8.3.2. Add the GFS2 service
8.4. Create a GFS2 Filesystem
8.4.1. Preparation
8.4.2. Create and Populate a GFS2 Partition
8.5. Reconfigure the Cluster for GFS2
8.6. Reconfigure Pacemaker for Active/Active
8.6.1. Testing Recovery

8.1. Requirements

The primary requirement for an Active/Active cluster is that the data required for your services is available, simultaneously, on both machines. Pacemaker makes no requirement on how this is achieved; you could use a SAN, but since DRBD supports multiple Primaries, we can use that here too.

The only hitch is that we need to use a cluster-aware filesystem (the ext4 filesystem we used earlier is not one of those). Both OCFS2 and GFS2 are supported; on Fedora 13 we will use GFS2.
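Note that the DRBD resource itself must also permit dual-Primary operation before both nodes can be promoted; without it, DRBD will refuse to promote the second node. A minimal sketch of the relevant drbd.conf fragment, assuming DRBD 8.3-era syntax and the wwwdata resource used throughout this guide (consult the documentation for your DRBD version):

resource wwwdata {
        net {
                # Allow both nodes to hold the Primary role simultaneously
                allow-two-primaries;
        }
        # ... the existing disk/device/address settings stay as before ...
}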

8.2. Install a Cluster Filesystem - GFS2

The first thing we need to do is install GFS2 on each node.

[root@pcmk-1 ~]# yum install -y gfs2-utils gfs-pcmk

Setting up Install Process

Resolving Dependencies

--> Running transaction check

---> Package gfs-pcmk.x86_64 0:3.0.5-2.fc12 set to be updated

--> Processing Dependency: libSaCkpt.so.3(OPENAIS_CKPT_B.01.01)(64bit) for package: gfs-pcmk-3.0.5-2.fc12.x86_64

--> Processing Dependency: dlm-pcmk for package: gfs-pcmk-3.0.5-2.fc12.x86_64

--> Processing Dependency: libccs.so.3()(64bit) for package: gfs-pcmk-3.0.5-2.fc12.x86_64

--> Processing Dependency: libdlmcontrol.so.3()(64bit) for package: gfs-pcmk-3.0.5-2.fc12.x86_64

--> Processing Dependency: liblogthread.so.3()(64bit) for package: gfs-pcmk-3.0.5-2.fc12.x86_64

--> Processing Dependency: libSaCkpt.so.3()(64bit) for package: gfs-pcmk-3.0.5-2.fc12.x86_64

---> Package gfs2-utils.x86_64 0:3.0.5-2.fc12 set to be updated

--> Running transaction check

---> Package clusterlib.x86_64 0:3.0.5-2.fc12 set to be updated

---> Package dlm-pcmk.x86_64 0:3.0.5-2.fc12 set to be updated

---> Package openaislib.x86_64 0:1.1.0-1.fc12 set to be updated

--> Finished Dependency Resolution

Dependencies Resolved

===========================================================================================
 Package                Arch               Version                   Repository        Size
===========================================================================================
Installing:
 gfs-pcmk               x86_64             3.0.5-2.fc12              custom           101 k
 gfs2-utils             x86_64             3.0.5-2.fc12              custom           208 k
Installing for dependencies:
 clusterlib             x86_64             3.0.5-2.fc12              custom            65 k
 dlm-pcmk               x86_64             3.0.5-2.fc12              custom            93 k
 openaislib             x86_64             1.1.0-1.fc12              fedora            76 k

Transaction Summary
===========================================================================================
Install       5 Package(s)
Upgrade       0 Package(s)

Total download size: 541 k
Downloading Packages:
(1/5): clusterlib-3.0.5-2.fc12.x86_64.rpm                                |  65 kB     00:00
(2/5): dlm-pcmk-3.0.5-2.fc12.x86_64.rpm                                  |  93 kB     00:00
(3/5): gfs-pcmk-3.0.5-2.fc12.x86_64.rpm                                  | 101 kB     00:00
(4/5): gfs2-utils-3.0.5-2.fc12.x86_64.rpm                                | 208 kB     00:00
(5/5): openaislib-1.1.0-1.fc12.x86_64.rpm                                |  76 kB     00:00
-------------------------------------------------------------------------------------------
Total                                                            992 kB/s | 541 kB     00:00
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing     : clusterlib-3.0.5-2.fc12.x86_64                                       1/5
  Installing     : openaislib-1.1.0-1.fc12.x86_64                                       2/5
  Installing     : dlm-pcmk-3.0.5-2.fc12.x86_64                                         3/5
  Installing     : gfs-pcmk-3.0.5-2.fc12.x86_64                                         4/5
  Installing     : gfs2-utils-3.0.5-2.fc12.x86_64                                       5/5

Installed:
  gfs-pcmk.x86_64 0:3.0.5-2.fc12                    gfs2-utils.x86_64 0:3.0.5-2.fc12

Dependency Installed:
  clusterlib.x86_64 0:3.0.5-2.fc12   dlm-pcmk.x86_64 0:3.0.5-2.fc12
  openaislib.x86_64 0:1.1.0-1.fc12

Complete!
[root@pcmk-1 x86_64]#

Warning

If this step fails, it is likely that your version/distribution does not ship the "Pacemaker" versions of dlm_controld and/or gfs_controld. Normally these files would be called dlm_controld.pcmk and gfs_controld.pcmk and live in the /usr/sbin directory.

If you cannot locate an installation source for these files, you will need to install a package called cman and reconfigure Corosync to use it as outlined in Appendix C, Using CMAN for Cluster Membership and Quorum.

When using CMAN, you can skip Section 8.3, "Setup Pacemaker-GFS2 Integration", where dlm-clone and gfs-clone are created, and proceed directly to Section 8.4, "Create a GFS2 Filesystem".

8.3. Setup Pacemaker-GFS2 Integration

GFS2 needs two services to be running. The first is the user-space interface to the kernel's distributed lock manager (DLM). The DLM is used to coordinate which node(s) can process a particular file, and integrates with Pacemaker to obtain node membership (the list of nodes the cluster considers to be available) and fencing capabilities.

The second service is GFS2's own control daemon, which also integrates with Pacemaker to obtain node membership data.

8.3.1. Add the DLM service

The DLM control daemon needs to run on all active cluster nodes, so we use the shell's interactive mode to create it as a cloned resource.

[root@pcmk-1 ~]# crm

crm(live)# cib new stack-glue

INFO: stack-glue shadow CIB created

crm(stack-glue)# configure primitive dlm ocf:pacemaker:controld op monitor interval=120s

crm(stack-glue)# configure clone dlm-clone dlm meta interleave=true

crm(stack-glue)# configure show xml

crm(stack-glue)# configure show

node pcmk-1

node pcmk-2

primitive WebData ocf:linbit:drbd \
        params drbd_resource="wwwdata" \
        op monitor interval="60s"
primitive WebFS ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="ext4"
primitive WebSite ocf:heartbeat:apache \
        params configfile="/etc/httpd/conf/httpd.conf" \
        op monitor interval="1min"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.122.101" cidr_netmask="32" \
        op monitor interval="30s"
primitive dlm ocf:pacemaker:controld \
        op monitor interval="120s"
ms WebDataClone WebData \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone dlm-clone dlm \
        meta interleave="true"
location prefer-pcmk-1 WebSite 50: pcmk-1
colocation WebSite-with-WebFS inf: WebSite WebFS
colocation fs_on_drbd inf: WebFS WebDataClone:Master
colocation website-with-ip inf: WebSite ClusterIP
order WebFS-after-WebData inf: WebDataClone:promote WebFS:start
order WebSite-after-WebFS inf: WebFS WebSite
order apache-after-ip inf: ClusterIP WebSite
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"

Note

TODO: Explain the meaning of the interleave option

Review the configuration before uploading it to the cluster, quitting the shell and watching the cluster's response.

crm(stack-glue)# cib commit stack-glue

INFO: commited 'stack-glue' shadow CIB to the cluster

crm(stack-glue)# quit

bye

[root@pcmk-1 ~]# crm_mon

============

Last updated: Thu Sep  3 20:49:54 2009
Stack: openais
Current DC: pcmk-2 - partition with quorum
Version: 1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f
2 Nodes configured, 2 expected votes
5 Resources configured.
============

Online: [ pcmk-1 pcmk-2 ]

WebSite (ocf::heartbeat:apache):        Started pcmk-2
Master/Slave Set: WebDataClone
        Masters: [ pcmk-1 ]
        Slaves: [ pcmk-2 ]
ClusterIP        (ocf::heartbeat:IPaddr):        Started pcmk-2
Clone Set: dlm-clone
        Started: [ pcmk-2 pcmk-1 ]
WebFS   (ocf::heartbeat:Filesystem):    Started pcmk-2

8.3.2. Add the GFS2 service

Once the DLM is active, we can add the GFS2 control daemon.

Use the crm shell to create the gfs-control cluster resource:

[root@pcmk-1 ~]# crm

crm(live)# cib new gfs-glue --force

INFO: gfs-glue shadow CIB created

crm(gfs-glue)# configure primitive gfs-control ocf:pacemaker:controld params daemon=gfs_controld.pcmk args="-g 0" op monitor interval=120s

crm(gfs-glue)# configure clone gfs-clone gfs-control meta interleave=true

Now ensure Pacemaker only starts the gfs-control service on nodes that are also running a copy of the dlm service:

crm(gfs-glue)# configure colocation gfs-with-dlm INFINITY: gfs-clone dlm-clone

crm(gfs-glue)# configure order start-gfs-after-dlm mandatory: dlm-clone gfs-clone

Review the configuration before uploading it to the cluster, quitting the shell and watching the cluster's response.

crm(gfs-glue)# configure show

node pcmk-1

node pcmk-2

primitive WebData ocf:linbit:drbd \
        params drbd_resource="wwwdata" \
        op monitor interval="60s"
primitive WebFS ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="ext4"
primitive WebSite ocf:heartbeat:apache \
        params configfile="/etc/httpd/conf/httpd.conf" \
        op monitor interval="1min"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.122.101" cidr_netmask="32" \
        op monitor interval="30s"
primitive dlm ocf:pacemaker:controld \
        op monitor interval="120s"
primitive gfs-control ocf:pacemaker:controld \
        params daemon="gfs_controld.pcmk" args="-g 0" \
        op monitor interval="120s"
ms WebDataClone WebData \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone dlm-clone dlm \
        meta interleave="true"
clone gfs-clone gfs-control \
        meta interleave="true"
location prefer-pcmk-1 WebSite 50: pcmk-1
colocation WebSite-with-WebFS inf: WebSite WebFS
colocation fs_on_drbd inf: WebFS WebDataClone:Master
colocation gfs-with-dlm inf: gfs-clone dlm-clone
colocation website-with-ip inf: WebSite ClusterIP
order WebFS-after-WebData inf: WebDataClone:promote WebFS:start
order WebSite-after-WebFS inf: WebFS WebSite
order apache-after-ip inf: ClusterIP WebSite
order start-gfs-after-dlm inf: dlm-clone gfs-clone
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"

crm(gfs-glue)# cib commit gfs-glue

INFO: commited 'gfs-glue' shadow CIB to the cluster

crm(gfs-glue)# quit

bye

[root@pcmk-1 ~]# crm_mon

============

Last updated: Thu Sep  3 20:49:54 2009
Stack: openais
Current DC: pcmk-2 - partition with quorum
Version: 1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f
2 Nodes configured, 2 expected votes
6 Resources configured.
============

Online: [ pcmk-1 pcmk-2 ]

WebSite (ocf::heartbeat:apache):        Started pcmk-2
Master/Slave Set: WebDataClone
        Masters: [ pcmk-1 ]
        Slaves: [ pcmk-2 ]
ClusterIP        (ocf::heartbeat:IPaddr):        Started pcmk-2
Clone Set: dlm-clone
        Started: [ pcmk-2 pcmk-1 ]
Clone Set: gfs-clone
        Started: [ pcmk-2 pcmk-1 ]
WebFS   (ocf::heartbeat:Filesystem):    Started pcmk-1
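Both control daemons should now be running on every node. As a quick sanity check (plain shell; the .pcmk daemon names come from the warning in Section 8.2):

[root@pcmk-1 ~]# ps -e | grep -e dlm_controld -e gfs_controld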

8.4. Create a GFS2 Filesystem

8.4.1. Preparation

Before we do anything to an existing partition, we need to make sure it is unmounted. We do this by telling the cluster to stop the WebFS resource. This will ensure that other resources (in our case, Apache) using WebFS are not only stopped, but stopped in the correct order.

[root@pcmk-1 ~]# crm_resource --resource WebFS --set-parameter target-role --meta --parameter-value Stopped

[root@pcmk-1 ~]# crm_mon

============
Last updated: Thu Sep  3 15:18:06 2009
Stack: openais
Current DC: pcmk-1 - partition with quorum
Version: 1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f
2 Nodes configured, 2 expected votes
6 Resources configured.
============

Online: [ pcmk-1 pcmk-2 ]

Master/Slave Set: WebDataClone
        Masters: [ pcmk-1 ]
        Slaves: [ pcmk-2 ]
ClusterIP        (ocf::heartbeat:IPaddr):        Started pcmk-1
Clone Set: dlm-clone
        Started: [ pcmk-2 pcmk-1 ]
Clone Set: gfs-clone
        Started: [ pcmk-2 pcmk-1 ]

Note

Note that both Apache and WebFS have been stopped.

8.4.2. Create and Populate a GFS2 Partition

Now that the cluster stack and integration pieces are running smoothly, we can create a GFS2 partition.

Warning

This will erase all previous content stored on the DRBD device. Ensure you have a copy of any important data.

We need to specify a number of additional parameters when creating the GFS2 partition.

First we must use the -p option to specify that we want to use the kernel's DLM. Next we use -j to indicate that it should reserve enough space for two journals (one per node that will access the filesystem).

Last, we use -t to specify the lock table name. The format for this field is clustername:fsname. For fsname, we just need to pick something unique and descriptive, and since we haven't changed it, we use the default cluster name: pcmk.

To specify an alternate name for the cluster, locate the service section of the Corosync configuration containing name: pacemaker and insert the following line anywhere inside that block:

clustername: myname
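For orientation, a sketch of how the edited service block in /etc/corosync/corosync.conf might then look (this assumes the stock Pacemaker plugin declaration used when the cluster was first set up; "myname" is purely illustrative):

service {
        # Load the Pacemaker plugin
        name: pacemaker
        ver: 0

        # Assumed placement: overrides the default cluster name ("pcmk")
        clustername: myname
}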

Then run the following command from the node on which DRBD is currently Primary (pcmk-1 in our case):

[root@pcmk-1 ~]# mkfs.gfs2 -p lock_dlm -j 2 -t pcmk:web /dev/drbd1

This will destroy any data on /dev/drbd1.

It appears to contain: data

Are you sure you want to proceed? [y/n] y

Device:                    /dev/drbd1
Blocksize:                 4096
Device Size                1.00 GB (131072 blocks)
Filesystem Size:           1.00 GB (131070 blocks)
Journals:                  2
Resource Groups:           2
Locking Protocol:          "lock_dlm"
Lock Table:                "pcmk:web"
UUID:                      6B776F46-177B-BAF8-2C2B-292C0E078613

[root@pcmk-1 ~]#

Then (re)populate the new filesystem with data (web pages). For now we'll create another variation of our home page.

[root@pcmk-1 ~]# mount /dev/drbd1 /mnt/

[root@pcmk-1 ~]# cat <<-END >/mnt/index.html

<html>

<body>My Test Site - GFS2</body>

</html>

END

[root@pcmk-1 ~]# umount /dev/drbd1

[root@pcmk-1 ~]# drbdadm verify wwwdata

[root@pcmk-1 ~]#

8.5. Reconfigure the Cluster for GFS2

[root@pcmk-1 ~]# crm

crm(live)# cib new GFS2

INFO: GFS2 shadow CIB created

crm(GFS2)# configure delete WebFS

crm(GFS2)# configure primitive WebFS ocf:heartbeat:Filesystem params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="gfs2"

Now that we've recreated the resource, we also need to recreate all the constraints that referenced it, since the shell automatically removes any constraints that reference a resource when it is deleted.

crm(GFS2)# configure colocation WebSite-with-WebFS inf: WebSite WebFS

crm(GFS2)# configure colocation fs_on_drbd inf: WebFS WebDataClone:Master

crm(GFS2)# configure order WebFS-after-WebData inf: WebDataClone:promote WebFS:start

crm(GFS2)# configure order WebSite-after-WebFS inf: WebFS WebSite

crm(GFS2)# configure colocation WebFS-with-gfs-control INFINITY: WebFS gfs-clone

crm(GFS2)# configure order start-WebFS-after-gfs-control mandatory: gfs-clone WebFS

crm(GFS2)# configure show

node pcmk-1

node pcmk-2

primitive WebData ocf:linbit:drbd \
        params drbd_resource="wwwdata" \
        op monitor interval="60s"
primitive WebFS ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="gfs2"
primitive WebSite ocf:heartbeat:apache \
        params configfile="/etc/httpd/conf/httpd.conf" \
        op monitor interval="1min"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.122.101" cidr_netmask="32" \
        op monitor interval="30s"
primitive dlm ocf:pacemaker:controld \
        op monitor interval="120s"
primitive gfs-control ocf:pacemaker:controld \
        params daemon="gfs_controld.pcmk" args="-g 0" \
        op monitor interval="120s"
ms WebDataClone WebData \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone dlm-clone dlm \
        meta interleave="true"
clone gfs-clone gfs-control \
        meta interleave="true"
colocation WebFS-with-gfs-control inf: WebFS gfs-clone
colocation WebSite-with-WebFS inf: WebSite WebFS
colocation fs_on_drbd inf: WebFS WebDataClone:Master
colocation gfs-with-dlm inf: gfs-clone dlm-clone
colocation website-with-ip inf: WebSite ClusterIP
order WebFS-after-WebData inf: WebDataClone:promote WebFS:start
order WebSite-after-WebFS inf: WebFS WebSite
order apache-after-ip inf: ClusterIP WebSite
order start-WebFS-after-gfs-control inf: gfs-clone WebFS
order start-gfs-after-dlm inf: dlm-clone gfs-clone
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"

Review the configuration before uploading it to the cluster, quitting the shell and watching the cluster's response.

crm(GFS2)# cib commit GFS2

INFO: commited 'GFS2' shadow CIB to the cluster

crm(GFS2)# quit

bye

[root@pcmk-1 ~]# crm_mon

============

Last updated: Thu Sep  3 20:49:54 2009
Stack: openais
Current DC: pcmk-2 - partition with quorum
Version: 1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f
2 Nodes configured, 2 expected votes
6 Resources configured.
============

Online: [ pcmk-1 pcmk-2 ]

WebSite (ocf::heartbeat:apache):        Started pcmk-2
Master/Slave Set: WebDataClone
        Masters: [ pcmk-1 ]
        Slaves: [ pcmk-2 ]
ClusterIP        (ocf::heartbeat:IPaddr):        Started pcmk-2
Clone Set: dlm-clone
        Started: [ pcmk-2 pcmk-1 ]
Clone Set: gfs-clone
        Started: [ pcmk-2 pcmk-1 ]
WebFS (ocf::heartbeat:Filesystem): Started pcmk-1

8.6. Reconfigure Pacemaker for Active/Active

Almost everything is in place. Recent versions of DRBD can operate in Primary/Primary mode, and our filesystem is now cluster-aware. All we need to do is reconfigure the cluster to take advantage of this.

This is going to involve a number of changes, so we'll again use interactive mode.

[root@pcmk-1 ~]# crm

[root@pcmk-1 ~]# cib new active

There's no point making the services active in both locations if we can't reach them, so let's first clone the IP address. Cloned IPaddr2 resources use an iptables rule to ensure that each request is only processed by one of the two clone instances. The additional meta options tell the cluster how many instances of the clone we want (one "request bucket" per node), and that if all other nodes fail, the remaining node should hold all of them. Otherwise the requests would simply be discarded.

[root@pcmk-1 ~]# configure clone WebIP ClusterIP \
        meta globally-unique="true" clone-max="2" clone-node-max="2"

Now we must tell the ClusterIP how to decide which requests are processed by which hosts. To do this we must specify the clusterip_hash parameter.

Open the ClusterIP resource:

[root@pcmk-1 ~]# configure edit ClusterIP

And add the following to the params line:

clusterip_hash="sourceip"

So that the complete definition looks like:

primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.122.101" cidr_netmask="32" clusterip_hash="sourceip" \
        op monitor interval="30s"

Here is the full transcript:

[root@pcmk-1 ~]# crm 

crm(live)# cib new active

INFO: active shadow CIB created

crm(active)# configure clone WebIP ClusterIP \
        meta globally-unique="true" clone-max="2" clone-node-max="2"

crm(active)# configure show

node pcmk-1

node pcmk-2

primitive WebData ocf:linbit:drbd \
        params drbd_resource="wwwdata" \
        op monitor interval="60s"
primitive WebFS ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="gfs2"
primitive WebSite ocf:heartbeat:apache \
        params configfile="/etc/httpd/conf/httpd.conf" \
        op monitor interval="1min"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.122.101" cidr_netmask="32" clusterip_hash="sourceip" \
        op monitor interval="30s"
primitive dlm ocf:pacemaker:controld \
        op monitor interval="120s"
primitive gfs-control ocf:pacemaker:controld \
        params daemon="gfs_controld.pcmk" args="-g 0" \
        op monitor interval="120s"
ms WebDataClone WebData \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone WebIP ClusterIP \
        meta globally-unique="true" clone-max="2" clone-node-max="2"
clone dlm-clone dlm \
        meta interleave="true"
clone gfs-clone gfs-control \
        meta interleave="true"
colocation WebFS-with-gfs-control inf: WebFS gfs-clone
colocation WebSite-with-WebFS inf: WebSite WebFS
colocation fs_on_drbd inf: WebFS WebDataClone:Master
colocation gfs-with-dlm inf: gfs-clone dlm-clone
colocation website-with-ip inf: WebSite WebIP
order WebFS-after-WebData inf: WebDataClone:promote WebFS:start
order WebSite-after-WebFS inf: WebFS WebSite
order apache-after-ip inf: WebIP WebSite
order start-WebFS-after-gfs-control inf: gfs-clone WebFS
order start-gfs-after-dlm inf: dlm-clone gfs-clone
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"

Notice how any constraints that referenced ClusterIP have been updated to use WebIP instead. This is an additional benefit of using the crm shell.

Next we need to convert the filesystem and Apache resources into clones. Again, the shell will automatically update any relevant constraints.

crm(active)# configure clone WebFSClone WebFS

crm(active)# configure clone WebSiteClone WebSite

The last step is to tell the cluster that it is now allowed to promote both instances to be Primary (aka. Master).

crm(active)# configure edit WebDataClone

Change master-max to 2.

crm(active)# configure show

node pcmk-1

node pcmk-2

primitive WebData ocf:linbit:drbd \
        params drbd_resource="wwwdata" \
        op monitor interval="60s"
primitive WebFS ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="gfs2"
primitive WebSite ocf:heartbeat:apache \
        params configfile="/etc/httpd/conf/httpd.conf" \
        op monitor interval="1min"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.122.101" cidr_netmask="32" clusterip_hash="sourceip" \
        op monitor interval="30s"
primitive dlm ocf:pacemaker:controld \
        op monitor interval="120s"
primitive gfs-control ocf:pacemaker:controld \
        params daemon="gfs_controld.pcmk" args="-g 0" \
        op monitor interval="120s"
ms WebDataClone WebData \
        meta master-max="2" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone WebFSClone WebFS
clone WebIP ClusterIP \
        meta globally-unique="true" clone-max="2" clone-node-max="2"
clone WebSiteClone WebSite
clone dlm-clone dlm \
        meta interleave="true"
clone gfs-clone gfs-control \
        meta interleave="true"
colocation WebFS-with-gfs-control inf: WebFSClone gfs-clone
colocation WebSite-with-WebFS inf: WebSiteClone WebFSClone
colocation fs_on_drbd inf: WebFSClone WebDataClone:Master
colocation gfs-with-dlm inf: gfs-clone dlm-clone
colocation website-with-ip inf: WebSiteClone WebIP
order WebFS-after-WebData inf: WebDataClone:promote WebFSClone:start
order WebSite-after-WebFS inf: WebFSClone WebSiteClone
order apache-after-ip inf: WebIP WebSiteClone
order start-WebFS-after-gfs-control inf: gfs-clone WebFSClone
order start-gfs-after-dlm inf: dlm-clone gfs-clone
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"

Review the configuration before uploading it to the cluster, quitting the shell and watching the cluster's response.

crm(active)# cib commit active

INFO: commited 'active' shadow CIB to the cluster

crm(active)# quit

bye

[root@pcmk-1 ~]# crm_mon

============

Last updated: Thu Sep  3 21:37:27 2009
Stack: openais
Current DC: pcmk-2 - partition with quorum
Version: 1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f
2 Nodes configured, 2 expected votes
6 Resources configured.
============

Online: [ pcmk-1 pcmk-2 ]

Master/Slave Set: WebDataClone
        Masters: [ pcmk-1 pcmk-2 ]
Clone Set: dlm-clone
        Started: [ pcmk-2 pcmk-1 ]
Clone Set: gfs-clone
        Started: [ pcmk-2 pcmk-1 ]
Clone Set: WebIP
        Started: [ pcmk-1 pcmk-2 ]
Clone Set: WebFSClone
        Started: [ pcmk-1 pcmk-2 ]
Clone Set: WebSiteClone
        Started: [ pcmk-1 pcmk-2 ]

8.6.1. Testing Recovery

Note

TODO: Put one node into standby to demonstrate failover
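In the meantime, a minimal sketch of such a test, assuming the crm shell's standard node commands (run crm_mon on the other node to watch the resources move):

[root@pcmk-1 ~]# crm node standby pcmk-1   # everything should keep running on pcmk-2 alone
[root@pcmk-1 ~]# crm node online pcmk-1    # pcmk-1 rejoins and the cloned resources start on it again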

Chapter 9. Configure STONITH

Table of Contents

9.1. Why You Need STONITH
9.2. What STONITH Device Should You Use
9.3. Configuring STONITH
9.3.1. Example

9.1. Why You Need STONITH

STONITH is an acronym for Shoot-The-Other-Node-In-The-Head. It protects your data from being corrupted by rogue nodes or concurrent access.

Just because a node is unresponsive doesn't mean it has stopped accessing your data. The only way to be 100% sure your data is safe is to use STONITH to be certain the node is truly offline before allowing the data to be accessed from another node.

STONITH also has a role to play in the event that a clustered service cannot be stopped. In this case, the cluster uses STONITH to force the whole node offline, thereby making it safe to start the service elsewhere.

9.2. What STONITH Device Should You Use

The important point is that the STONITH device must allow the cluster to differentiate between a node failure and a network failure.

A common mistake is to choose a remote power switch (such as the IPMI controller built into many motherboards) as the STONITH device. In that case, the cluster cannot tell whether the node is really offline or merely unreachable over the network.

Likewise, any device that relies on the machine being active (such as the SSH-based "devices" used during testing) is inappropriate.

9.3. Configuring STONITH

1. Find the correct STONITH driver: stonith -L

2. Since every device is different, the parameters needed to configure it will also vary. To find out which parameters a device requires, run: stonith -t {type} -n

Hopefully the developers chose names that make sense; if not, you can query for some additional information by running the following on an active machine:

lrmadmin -M stonith {type} pacemaker

The output should be XML-formatted text containing more detailed parameter descriptions.

3. Create a file called stonith.xml containing a primitive resource with a class of stonith, a type equal to {type}, and a parameter for each of the values returned in step 2.

4. If the device can shoot multiple nodes and supports multiple simultaneous connections, create a clone out of the primitive.

5. Upload the configuration into the CIB using cibadmin: cibadmin -C -o resources --xml-file stonith.xml
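For illustration, a hypothetical stonith.xml for the IBM RSA device configured in the next section might look roughly like this (a sketch only: the id values are made up, and the nvpair names simply mirror the parameters reported by stonith -t external/ibmrsa -n):

<clone id="Fencing">
  <primitive id="rsa-fencing" class="stonith" type="external/ibmrsa">
    <instance_attributes id="rsa-fencing-params">
      <nvpair id="rsa-hostname" name="hostname" value="pcmk-1 pcmk-2"/>
      <nvpair id="rsa-ipaddr" name="ipaddr" value="192.168.122.31"/>
      <nvpair id="rsa-userid" name="userid" value="mgmt"/>
      <nvpair id="rsa-passwd" name="passwd" value="abc123"/>
      <nvpair id="rsa-type" name="type" value="ibm"/>
    </instance_attributes>
    <operations>
      <op id="rsa-fencing-monitor" name="monitor" interval="60s"/>
    </operations>
  </primitive>
</clone>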

9.3.1. Example

Assume we have an IBM BladeCenter containing our two nodes, and that the management interface is active at 192.168.122.31. We would choose the external/ibmrsa driver and obtain the following list of parameters:

[root@pcmk-1 ~]# stonith -t external/ibmrsa -n

hostname  ipaddr  userid  passwd  type

Assuming we know the username and password for the management interface, we would create a STONITH resource with the shell:

[root@pcmk-1 ~]# crm 

crm(live)# cib new stonith

INFO: stonith shadow CIB created

crm(stonith)# configure primitive rsa-fencing stonith::external/ibmrsa \
        params hostname="pcmk-1 pcmk-2" ipaddr=192.168.122.31 userid=mgmt passwd=abc123 type=ibm \
        op monitor interval="60s"

crm(stonith)# configure clone Fencing rsa-fencing

And finally, since we disabled it earlier, we need to re-enable STONITH:

crm(stonith)# configure property stonith-enabled="true"

crm(stonith)# configure show

node pcmk-1

node pcmk-2

primitive WebData ocf:linbit:drbd \
        params drbd_resource="wwwdata" \
        op monitor interval="60s"
primitive WebFS ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="gfs2"
primitive WebSite ocf:heartbeat:apache \
        params configfile="/etc/httpd/conf/httpd.conf" \
        op monitor interval="1min"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.122.101" cidr_netmask="32" clusterip_hash="sourceip" \
        op monitor interval="30s"
primitive dlm ocf:pacemaker:controld \
        op monitor interval="120s"
primitive gfs-control ocf:pacemaker:controld \
        params daemon="gfs_controld.pcmk" args="-g 0" \
        op monitor interval="120s"
primitive rsa-fencing stonith::external/ibmrsa \
        params hostname="pcmk-1 pcmk-2" ipaddr=192.168.122.31 userid=mgmt passwd=abc123 type=ibm \
        op monitor interval="60s"
ms WebDataClone WebData \
        meta master-max="2" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone Fencing rsa-fencing
clone WebFSClone WebFS
clone WebIP ClusterIP \
        meta globally-unique="true" clone-max="2" clone-node-max="2"
clone WebSiteClone WebSite
clone dlm-clone dlm \
        meta interleave="true"
clone gfs-clone gfs-control \
        meta interleave="true"
colocation WebFS-with-gfs-control inf: WebFSClone gfs-clone
colocation WebSite-with-WebFS inf: WebSiteClone WebFSClone
colocation fs_on_drbd inf: WebFSClone WebDataClone:Master
colocation gfs-with-dlm inf: gfs-clone dlm-clone
colocation website-with-ip inf: WebSiteClone WebIP
order WebFS-after-WebData inf: WebDataClone:promote WebFSClone:start
order WebSite-after-WebFS inf: WebFSClone WebSiteClone
order apache-after-ip inf: WebIP WebSiteClone
order start-WebFS-after-gfs-control inf: gfs-clone WebFSClone
order start-gfs-after-dlm inf: dlm-clone gfs-clone
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="true" \
        no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"

Appendix A. Configuration Recap

Table of Contents

A.1. Final Cluster Configuration
A.2. Node List
A.3. Cluster Options
A.4. Resources
A.4.1. Default Options
A.4.2. Fencing
A.4.3. Service Address
A.4.4. Distributed Lock Manager
A.4.5. GFS Control Daemon
A.4.6. DRBD - Shared Storage
A.4.7. Cluster Filesystem
A.4.8. Apache

A.1. Final Cluster Configuration

[root@pcmk-1 ~]# crm configure show

node pcmk-1

node pcmk-2

primitive WebData ocf:linbit:drbd \
        params drbd_resource="wwwdata" \
        op monitor interval="60s"
primitive WebFS ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="gfs2"
primitive WebSite ocf:heartbeat:apache \
        params configfile="/etc/httpd/conf/httpd.conf" \
        op monitor interval="1min"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.122.101" cidr_netmask="32" clusterip_hash="sourceip" \
        op monitor interval="30s"
primitive dlm ocf:pacemaker:controld \
        op monitor interval="120s"
primitive gfs-control ocf:pacemaker:controld \
        params daemon="gfs_controld.pcmk" args="-g 0" \
        op monitor interval="120s"
primitive rsa-fencing stonith::external/ibmrsa \
        params hostname="pcmk-1 pcmk-2" ipaddr=192.168.122.31 userid=mgmt passwd=abc123 type=ibm \
        op monitor interval="60s"
ms WebDataClone WebData \
        meta master-max="2" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone Fencing rsa-fencing
clone WebFSClone WebFS
clone WebIP ClusterIP \
        meta globally-unique="true" clone-max="2" clone-node-max="2"
clone WebSiteClone WebSite
clone dlm-clone dlm \
        meta interleave="true"
clone gfs-clone gfs-control \
        meta interleave="true"
colocation WebFS-with-gfs-control inf: WebFSClone gfs-clone
colocation WebSite-with-WebFS inf: WebSiteClone WebFSClone
colocation fs_on_drbd inf: WebFSClone WebDataClone:Master
colocation gfs-with-dlm inf: gfs-clone dlm-clone
colocation website-with-ip inf: WebSiteClone WebIP
order WebFS-after-WebData inf: WebDataClone:promote WebFSClone:start
order WebSite-after-WebFS inf: WebFSClone WebSiteClone
order apache-after-ip inf: WebIP WebSiteClone
order start-WebFS-after-gfs-control inf: gfs-clone WebFSClone
order start-gfs-after-dlm inf: dlm-clone gfs-clone
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="true" \
        no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"

A.2. Node List

The list of cluster nodes is automatically populated by the cluster.

node pcmk-1

node pcmk-2

A.3. Cluster Options

This is where the cluster automatically stores some information about the cluster:

1. dc-version - the version of Pacemaker used on the DC (including the upstream source-code hash)
2. cluster-infrastructure - the cluster infrastructure being used (heartbeat or openais/corosync)
3. expected-quorum-votes - the maximum number of nodes expected to be part of the cluster

and where the admin can set options that control the way the cluster operates:

1. stonith-enabled=true - make use of STONITH
2. no-quorum-policy=ignore - ignore loss of quorum and continue to host resources

property $id="cib-bootstrap-options" \
        dc-version="1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="true" \
        no-quorum-policy="ignore"

A.4. Resources

A.4.1. Default Options

Here we configure cluster options that apply to every resource:

1. resource-stickiness - how strongly a resource prefers to stay where it is

rsc_defaults $id="rsc-options" \
        resource-stickiness="100"

A.4.2. Fencing

Note

TODO: Add text here

primitive rsa-fencing stonith::external/ibmrsa \
        params hostname="pcmk-1 pcmk-2" ipaddr=192.168.122.31 userid=mgmt passwd=abc123 type=ibm \
        op monitor interval="60s"
clone Fencing rsa-fencing

A.4.3. Service Address

Users of the services provided by the cluster require an unchanging address with which to access them. Additionally, we cloned the address so it will be active on both nodes. An iptables rule (created as part of the resource agent) ensures that each request is only processed by one of the two clone instances. The additional meta options tell the cluster that we want two instances of the clone (one "request bucket" per node), and that if one node fails, the remaining node should handle both buckets.

primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.122.101" cidr_netmask="32" clusterip_hash="sourceip" \
        op monitor interval="30s"
clone WebIP ClusterIP \
        meta globally-unique="true" clone-max="2" clone-node-max="2"

Note

TODO: The RA should check for globally-unique=true when cloned

A.4.4. Distributed Lock Manager

Cluster filesystems like GFS2 require a lock manager. This service starts the daemon that provides user-space applications (such as the GFS2 daemon) with access to the in-kernel lock manager. Since we need it to be available on all nodes in the cluster, we have it cloned.

primitive dlm ocf:pacemaker:controld \
        op monitor interval="120s"
clone dlm-clone dlm \
        meta interleave="true"

Note

TODO: Confirm interleave is no longer needed

A.4.5. GFS Control Daemon

GFS2 also needs a user-space to kernel bridge that runs on every node. So here we have another clone. However, this time we must also specify that it can only run on machines that are also running the DLM (colocation constraint) and that it can only be started after the DLM is running (order constraint). Additionally, the gfs-control clone should only care about the DLM instances it is paired with, so we also set the interleave option.

primitive gfs-control ocf:pacemaker:controld \
        params daemon="gfs_controld.pcmk" args="-g 0" \
        op monitor interval="120s"
clone gfs-clone gfs-control \
        meta interleave="true"
colocation gfs-with-dlm inf: gfs-clone dlm-clone
order start-gfs-after-dlm inf: dlm-clone gfs-clone

A.4.6. DRBD - Shared Storage

Here we define the DRBD service and specify which DRBD resource (from drbd.conf) it should manage. We make it a master/slave resource and, in order to have an active/active setup, allow both instances to be promoted to master by setting master-max=2. We also set the notify option so that the cluster will tell the DRBD agent when its peer changes state.

primitive WebData ocf:linbit:drbd \
        params drbd_resource="wwwdata" \
        op monitor interval="60s"
ms WebDataClone WebData \
        meta master-max="2" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"

A.4.7. Cluster Filesystem

The cluster filesystem ensures that files are read and written correctly. We need to specify the block device (provided by DRBD), where we want it mounted, and that we are using GFS2. Again it is a clone, because it is intended to be active on both nodes. The additional constraints ensure that it is only started on nodes with active gfs-control and drbd instances.

primitive WebFS ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="gfs2"
clone WebFSClone WebFS
colocation WebFS-with-gfs-control inf: WebFSClone gfs-clone
colocation fs_on_drbd inf: WebFSClone WebDataClone:Master
order WebFS-after-WebData inf: WebDataClone:promote WebFSClone:start
order start-WebFS-after-gfs-control inf: gfs-clone WebFSClone

A.4.8. Apache

Lastly we have the actual service, Apache. We need only tell the cluster where to find its main configuration file, and restrict it to running on nodes that have the required filesystem mounted and the IP address active.

primitive WebSite ocf:heartbeat:apache \
        params configfile="/etc/httpd/conf/httpd.conf" \
        op monitor interval="1min"
clone WebSiteClone WebSite
colocation WebSite-with-WebFS inf: WebSiteClone WebFSClone
colocation website-with-ip inf: WebSiteClone WebIP
order apache-after-ip inf: WebIP WebSiteClone
order WebSite-after-WebFS inf: WebFSClone WebSiteClone

Appendix B. Sample Corosync Configuration

Example B.1. Sample corosync.conf for a two-node cluster

 

# Please read the Corosync.conf.5 manual page
compatibility: whitetank

totem {
        version: 2

        # How long before declaring a token lost (ms)
        token:          5000

        # How many token retransmits before forming a new configuration
        token_retransmits_before_loss_const: 10

        # How long to wait for join messages in the membership protocol (ms)
        join:           1000

        # How long to wait for consensus to be achieved before starting a new
        # round of membership configuration (ms)
        consensus:      6000

        # Turn off the virtual synchrony filter
        vsftype:        none

        # Number of messages that may be sent by one processor on receipt of the token
        max_messages:   20

        # Stagger sending the node join messages by 1..send_join ms
        send_join: 45

        # Limit generated nodeids to 31-bits (positive signed integers)
        clear_node_high_bit: yes

        # Disable encryption
        secauth:        off

        # How many threads to use for encryption/decryption
        threads:        0

        # Optionally assign a fixed node id (integer)
        # nodeid:       1234

        interface {
                ringnumber: 0

                # The following values need to be set based on your environment
                bindnetaddr: 192.168.122.0
                mcastaddr: 226.94.1.1
                mcastport: 4000
        }
}

logging {
        debug: off
        fileline: off
        to_syslog: yes
        to_stderr: off
        syslog_facility: daemon
        timestamp: on
}

amf {
        mode: disabled
}

Appendix C. Using CMAN for Cluster Membership and Quorum

Table of Contents

C.1. Background
C.2. Adding CMAN Support
C.2.1. Adding CMAN Support - cluster.conf
C.2.2. Adding CMAN Support - corosync.conf

C.1. Background

CMAN v3 is a Corosync plugin that monitors the names and number of active cluster nodes in order to deliver membership and quorum information to clients (such as the Pacemaker daemons). It is described in the Red Hat Cluster Suite Overview: http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html-single/Cluster_Suite_Overview/index.html#s2-clumembership-overview-CSO

In a traditional Corosync-Pacemaker cluster, a Pacemaker plugin is loaded to provide membership and quorum information. The motivation for wanting to use CMAN for this instead is to ensure that all elements of the cluster stack are making decisions based on the same membership and quorum data. (A failure to do this can lead to what is called internal split-brain - a situation where different parts of the stack disagree about whether some nodes are alive or dead - which quickly leads to unnecessary down-time and/or data corruption.)

CMAN has been around longer than Pacemaker and is part of the Red Hat cluster stack, so it is available and supported by many distributions and other pieces of software (such as OCFS2 and GFS2). For this reason it makes sense to support it.

C.2. Adding CMAN Support

Warning

Be sure to disable the Pacemaker plugin before continuing with this section. In most cases, this can be achieved by removing /etc/corosync/service.d/pcmk and stopping Corosync.

C.2.1. Adding CMAN Support - cluster.conf

The preferred approach for enabling CMAN is to configure cluster.conf and use the /etc/init.d/cman script to start Corosync. It's far easier to maintain and automatically starts the necessary pieces for using GFS2.

You can find some documentation on Installing CMAN and Creating a Basic Cluster Configuration File at the Red Hat website: http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/s1-creating-cluster-cli-CA.html

However, please ignore the parts about Fencing, Failover Domains, or HA Services and anything to do with rgmanager and fenced. All of these continue to be handled by Pacemaker in the normal manner.

Example C.1. Sample cluster.conf for a two-node cluster

 

<?xml version="1.0"?>

<cluster config_version="1" name="beekhof">

  <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>

  <clusternodes>

    <clusternode name="pcmk-1" nodeid="1">

      <fence/>

    </clusternode>

    <clusternode name="pcmk-2" nodeid="2">

      <fence/>

    </clusternode>

  </clusternodes>

  <cman/>

  <fencedevices/>

  <rm/>

</cluster>

 

C.2.2. Adding CMAN Support - corosync.conf

The alternative is to add the necessary cman configuration elements to corosync.conf. We recommend you place these directives in /etc/corosync/service.d/cman, as they will differ between machines.

If you choose this approach, you would continue to start and stop Corosync with its init script as previously described in this document.

Example C.2. Sample corosync.conf extensions for a two-node cluster

[root@pcmk-1 ~]# cat <<-END >>/etc/corosync/service.d/cman

cluster {

    name: beekhof

    clusternodes {

            clusternode {

                    votes: 1

                    nodeid: 1

                    name: pcmk-1

            }

            clusternode {

                    votes: 1

                    nodeid: 2

                    name: pcmk-2

            }

    }

    cman {

            expected_votes: 2

            cluster_id: 123

            nodename: `uname -n`

            two_node: 1

            max_queued: 10

    }

}

service {

    name: corosync_cman

    ver: 0

}

quorum {

    provider: quorum_cman

}

END

Warning

Verify that nodename was set appropriately on each host.
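One quick way to check (plain shell; note that the heredoc above expands `uname -n` when the file is written, so the stored value should already match each host's name):

[root@pcmk-1 ~]# uname -n                                     # the node's actual name
[root@pcmk-1 ~]# grep nodename /etc/corosync/service.d/cman   # should report the same name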

Appendix D. Further Reading

Project Website
http://www.clusterlabs.org/

Cluster Commands
A comprehensive guide to cluster commands has been written by Novell and can be found at:
http://www.novell.com/documentation/sles11/book_sleha/index.html?page=/documentation/sles11/book_sleha/data/book_sleha.html

Corosync
http://www.corosync.org/

Appendix E. Revision History

Revision 1    Mon May 17 2010    Andrew Beekhof [email protected]
Import from Pages.app

