DRBD with heartbeat

 

Prerequisites

- Set up a minimal CentOS 5 install on both nodes

- Make sure that both nodes can resolve each other's names correctly (either through DNS or /etc/hosts)

- yum update (as usual ... ;-) )

- yum install heartbeat drbd kmod-drbd (available in the extras repository)

Current situation :

  • node1.yourdomain.org 172.29.156.20/24 , source disc /dev/sdb that will be replicated
  • node2.yourdomain.org 172.29.156.21/24 , target disc /dev/sdb
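
The name-resolution prerequisite can be checked with a few lines of shell. This is only a sketch: the helper name check_resolves is made up for this example, and it assumes getent is available (it follows /etc/nsswitch.conf, so it covers both /etc/hosts and DNS).

```shell
# Hypothetical helper: report any host name the local resolver cannot resolve.
# getent follows /etc/nsswitch.conf, so it covers both /etc/hosts and DNS.
check_resolves() {
    rc=0
    for name in "$@"; do
        if getent hosts "$name" >/dev/null 2>&1; then
            echo "ok: $name"
        else
            echo "UNRESOLVED: $name"
            rc=1
        fi
    done
    return $rc
}

# Example (run on both nodes):
#   check_resolves node1.yourdomain.org node2.yourdomain.org
```

The function returns a non-zero status as soon as one name fails, so it can be used as a guard at the top of a setup script.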

 

DRBD Configuration

We'll configure DRBD so that /dev/sdb is replicated from one node to the other (the roles can be swapped at any time, though). The DRBD resource will be named "repdata" (you can of course use any name you want). Here is the content of the /etc/drbd.conf file :

 

#
# please have a look at the example configuration file in
# /usr/share/doc/drbd/drbd.conf
#
global { usage-count no; }
resource repdata {
  protocol C;
  startup { wfc-timeout 0; degr-wfc-timeout 120; }
  disk { on-io-error detach; } # or panic, ...
  net { cram-hmac-alg "sha1"; shared-secret "Cent0Sru!3z"; } # don't forget to choose a secret for auth !
  syncer { rate 10M; }
  on node1.yourdomain.org {
    device /dev/drbd0;
    disk /dev/sdb;
    address 172.29.156.20:7788;
    meta-disk internal;
  }
  on node2.yourdomain.org {
    device /dev/drbd0;
    disk /dev/sdb;
    address 172.29.156.21:7788;
    meta-disk internal;
  }
}

- replicate this config file (/etc/drbd.conf) to the second node

 

scp /etc/drbd.conf root@node2:/etc/
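
DRBD selects the "on" section whose name matches the local host name as printed by uname -n, so after copying the file it is worth checking that each node finds itself in it. The helpers below are a sketch with invented names (drbd_conf_hosts, host_in_drbd_conf); they assume the simple "on <host> {" layout used in the config above and are not a full drbd.conf parser.

```shell
# Hypothetical helpers (names invented for this sketch).
# List the host names used in "on <host> { ... }" sections of a drbd.conf.
drbd_conf_hosts() {
    awk '$1 == "on" { print $2 }' "$1"
}

# Succeed only if the given host name appears in one of those sections.
host_in_drbd_conf() {
    drbd_conf_hosts "$1" | grep -qxF "$2"
}

# Example (run on each node):
#   host_in_drbd_conf /etc/drbd.conf "$(uname -n)" \
#       || echo "no 'on' section for $(uname -n)"
```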

- Initialize the meta-data area on disk before starting drbd (! on both nodes!)

 

[root@node1 etc]# drbdadm create-md repdata
v08 Magic number not found
v07 Magic number not found
About to create a new drbd meta data block on /dev/sdb.
 ==> This might destroy existing data! <==
Do you want to proceed? [need to type 'yes' to confirm] yes
Creating meta data...
initialising activity log
NOT initialized bitmap (256 KB)
New drbd meta data block sucessfully created.

- start drbd on both nodes (service drbd start)

 

[root@node1 etc]# service drbd start
Starting DRBD resources: [ d0 n0 ]. ......
[root@node1 etc]# cat /proc/drbd
version: 8.0.4 (api:86/proto:86)
SVN Revision: 2947 build by buildsvn@c5-i386-build, 2007-07-31 19:17:18
 0: cs:Connected st:Secondary/Secondary ds:Inconsistent/Inconsistent C r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
        act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
[root@node1 etc]# ssh root@node2 cat /proc/drbd
version: 8.0.4 (api:86/proto:86)
SVN Revision: 2947 build by buildsvn@c5-i386-build, 2007-07-31 19:17:18
 0: cs:Connected st:Secondary/Secondary ds:Inconsistent/Inconsistent C r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
        act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0

As you can see, both nodes are secondary, which is normal. We now need to decide which node will act as primary (node1 here); promoting it will initiate the first 'full sync' between the two nodes :

 

[root@node1 etc]# drbdadm -- --overwrite-data-of-peer primary repdata
[root@node1 etc]# watch -n 1 cat /proc/drbd
version: 8.0.4 (api:86/proto:86)
SVN Revision: 2947 build by buildsvn@c5-i386-build, 2007-07-31 19:17:18
 0: cs:SyncTarget st:Primary/Secondary ds:Inconsistent/Inconsistent C r---
    ns:0 nr:68608 dw:68608 dr:0 al:0 bm:4 lo:0 pe:0 ua:0 ap:0
        [>...................] sync'ed: 0.9% (8124/8191)M
        finish: 0:12:05 speed: 11,432 (11,432) K/sec
        resync: used:0/31 hits:4283 misses:5 starving:0 dirty:0 changed:5
        act_log: used:0/257 hits:0 misses:0 starving:0 dirty:0 changed:0
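
Rather than watching the output by hand, the status fields can be pulled out of /proc/drbd with a small filter. This is a minimal sketch: drbd_field is a made-up name, and it assumes the DRBD 8.0 line layout shown above (cs = connection state, st = roles, ds = disk states), with a fully synced pair expected to report ds:UpToDate/UpToDate.

```shell
# Hypothetical helper: print the value of one status field (cs, st or ds)
# from the first device line of /proc/drbd-style text read on stdin.
drbd_field() {
    sed -n "s/^.* $1:\([^ ]*\).*$/\1/p" | head -n 1
}

# Example: block until both disks are reported as up to date.
#   until [ "$(drbd_field ds < /proc/drbd)" = "UpToDate/UpToDate" ]; do
#       sleep 5
#   done
```

Only the first matching line is used, so with several DRBD devices the filter would need to be extended to select a device number.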

- we can now format /dev/drbd0 and mount it on node1 : mkfs.ext3 /dev/drbd0 ; mkdir /repdata ; mount /dev/drbd0 /repdata 

- create some fake data on node 1 :

  •  [root@node1 etc]# for i in {1..5};do dd if=/dev/zero of=/repdata/file$i bs=1M count=100;done 

- now switch manually to the second node :

 

[root@node1 /]# umount /repdata ; drbdadm secondary repdata
[root@node2 /]# mkdir /repdata ; drbdadm primary repdata ; mount /dev/drbd0 /repdata
[root@node2 /]# ls /repdata/
file1 file2 file3 file4 file5 lost+found

Great, the data was replicated. Now let's delete a file and add another one :

[root@node2 /]# rm /repdata/file2 ; dd if=/dev/zero of=/repdata/file6 bs=100M count=2

- Now switch back to the first node :

 

[root@node2 /]# umount /repdata/ ; drbdadm secondary repdata
[root@node1 /]# drbdadm primary repdata ; mount /dev/drbd0 /repdata
[root@node1 /]# ls /repdata/
file1 file3 file4 file5 file6 lost+found
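
The manual switch always follows the same order: unmount and demote on the current primary, then promote and mount on the other node. The sketch below wraps those two halves in functions. The names run, drbd_release and drbd_takeover are invented for this example, and the DRY_RUN variable is an assumption of the sketch (not a DRBD feature) that only prints the commands.

```shell
# Hypothetical helpers (names invented here). With DRY_RUN=1 the commands are
# only printed, which makes it easy to review them before running for real.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run on the node that currently holds the data: release the resource.
drbd_release() {   # usage: drbd_release <resource> <mountpoint>
    run umount "$2" && run drbdadm secondary "$1"
}

# Run on the other node afterwards: take the resource over.
drbd_takeover() {  # usage: drbd_takeover <resource> <device> <mountpoint>
    run drbdadm primary "$1" && run mount "$2" "$3"
}
```

For the setup in this document that would be drbd_release repdata /repdata on the old primary, followed by drbd_takeover repdata /dev/drbd0 /repdata on the new one.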

OK, DRBD is working. Let's make sure that it is always started at boot : chkconfig drbd on

 

Heartbeat V2 Configuration

Let's configure a simple /etc/ha.d/ha.cf file :

 

keepalive 2
deadtime 30
warntime 10
initdead 120
bcast eth0
node node1.yourdomain.org
node node2.yourdomain.org
crm yes

Also create the /etc/ha.d/authkeys file (with permissions 600 !!!) :

 

auth 1
1 sha1 MySecret
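
Instead of typing a secret by hand, the file can be generated with a random key and the right permissions in one step. This is a sketch under assumptions: make_authkeys is a made-up name, the key is simply the sha1 hex digest of 64 random bytes, and the target path is a parameter so the result can be inspected before it goes to /etc/ha.d/authkeys.

```shell
# Hypothetical helper: write a heartbeat authkeys file with a random sha1 key.
# The target path is a parameter so the result can be inspected before it is
# copied to /etc/ha.d/authkeys.
make_authkeys() {
    (
        umask 077   # create the file without group/other access
        key=$(dd if=/dev/urandom bs=64 count=1 2>/dev/null | sha1sum | cut -d' ' -f1)
        printf 'auth 1\n1 sha1 %s\n' "$key" > "$1"
    )
    chmod 600 "$1"  # also covers the case where the file already existed
}

# Example:
#   make_authkeys /etc/ha.d/authkeys
```

The same file has to be present on both nodes, so generate it once and copy it rather than running the helper on each node.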

Start the heartbeat service on node1 :

 

[root@node1 ha.d]# service heartbeat start
Starting High-Availability services: [OK]

Check the cluster status :

[root@node1 ha.d]# crm_mon 

Now replicate ha.cf and authkeys to node2 and start heartbeat there :

 

[root@node1 ha.d]# scp /etc/ha.d/ha.cf /etc/ha.d/authkeys root@node2:/etc/ha.d/
[root@node2 ha.d]# service heartbeat start

Verify cluster with crm_mon :

 

=====
Last updated: Wed Sep 12 16:20:39 2007
Current DC: node1.yourdomain.org (6cb712e4-4e4f-49bf-8200-4f15d6bd7385)
2 Nodes configured.
0 Resources configured.
=====
Node: node1.yourdomain.org (6cb712e4-4e4f-49bf-8200-4f15d6bd7385): online
Node: node2.yourdomain.org (f6112aae-8e2b-403f-ae93-e5fd4ac4d27e): online

Note about the GUI : you can install heartbeat-gui (yum install heartbeat-gui) on an X workstation and connect to the cluster, but you'll first need to change the password of the hacluster user account on both nodes ! (Alternatively, you can use another account, provided you put it in the haclient group.)

We'll now create a resource group that contains an IP address (172.29.156.200), the DRBD device (resource repdata) and the filesystem mount operation (mount /dev/drbd0 /repdata). Note : using a group is easier than using single resources : it starts all the resources of the group in order (ordered=true) and on the same node (collocated=true).

Here is the content of /var/lib/heartbeat/crm/cib.xml :

 

 <cib generated="false" admin_epoch="0" epoch="25" num_updates="1" have_quorum="true" ignore_dtd="false" num_peers="0" cib-last-written="Sun Sep 16 19:47:18 2007" cib_feature_revision="1.3" ccm_transition="1">
  <configuration>
    <crm_config/>
    <nodes>
      <node id="6cb712e4-4e4f-49bf-8200-4f15d6bd7385" uname="node1.yourdomain.org" type="normal"/>
      <node id="f6112aae-8e2b-403f-ae93-e5fd4ac4d27e" uname="node2.yourdomain.org" type="normal"/>
    </nodes>
    <resources>
      <group id="My-DRBD-group" ordered="true" collocated="true">
        <primitive id="IP-Addr" class="ocf" type="IPaddr2" provider="heartbeat">
          <instance_attributes id="IP-Addr_instance_attrs">
            <attributes>
              <nvpair id="IP-Addr_target_role" name="target_role" value="started"/>
              <nvpair id="2e967596-73fe-444e-82ea-18f61f3848d7" name="ip" value="172.29.156.200"/>
            </attributes>
          </instance_attributes>
        </primitive>
        <instance_attributes id="My-DRBD-group_instance_attrs">
          <attributes>
            <nvpair id="My-DRBD-group_target_role" name="target_role" value="started"/>
          </attributes>
        </instance_attributes>
        <primitive id="DRBD_data" class="heartbeat" type="drbddisk" provider="heartbeat">
          <instance_attributes id="DRBD_data_instance_attrs">
            <attributes>
              <nvpair id="DRBD_data_target_role" name="target_role" value="started"/>
              <nvpair id="93d753a8-e69a-4ea5-a73d-ab0d0367f001" name="1" value="repdata"/>
            </attributes>
          </instance_attributes>
        </primitive>
        <primitive id="FS_repdata" class="ocf" type="Filesystem" provider="heartbeat">
          <instance_attributes id="FS_repdata_instance_attrs">
            <attributes>
              <nvpair id="FS_repdata_target_role" name="target_role" value="started"/>
              <nvpair id="96d659dd-0881-46df-86af-d2ec3854a73f" name="fstype" value="ext3"/>
              <nvpair id="8a150609-e5cb-4a75-99af-059ddbfbc635" name="device" value="/dev/drbd0"/>
              <nvpair id="de9706e8-7dfb-4505-b623-5f316b1920a3" name="directory" value="/repdata"/>
            </attributes>
          </instance_attributes>
        </primitive>
      </group>
    </resources>
    <constraints>
      <rsc_location id="runs_on_pref_node" rsc="My-DRBD-group">
        <rule id="prefered_runs_on_pref_node" score="100">
          <expression attribute="#uname" id="786ef2b1-4289-4570-8923-4c926025e8fd" operation="eq" value="node1.yourdomain.org"/>
        </rule>
      </rsc_location>
    </constraints>
  </configuration>
</cib>

As you can see, we've created a rsc_location constraint so that the cluster resources will start on the preferred node (node1.yourdomain.org, with a score of 100).

You can now move resources through the CLI (crm_resource) or by using the GUI (change the value of the location constraint rule, for example switching from node1.yourdomain.org to node2.yourdomain.org, and click on Apply). You'll be able to see all resources switching from one node to the other (IP address, DRBD and filesystem mount).
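
For the CLI route, a move of the whole group could look like the sketch below. The wrapper names (run, rsc_move, rsc_unmove) and the DRY_RUN variable are invented for this example. The crm_resource options used are -M (migrate), -U (un-migrate), -r (resource id) and -H (host uname) as listed in the Heartbeat 2 help output; please check crm_resource --help on your version before relying on them.

```shell
# Hypothetical wrappers around crm_resource (Heartbeat 2 CRM).
# With DRY_RUN=1 the command line is printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Ask the cluster to move a resource (or group) to the given node.
rsc_move() {     # usage: rsc_move <resource-id> <node-uname>
    run crm_resource -M -r "$1" -H "$2"
}

# Undo a previous move so that the normal placement rules apply again.
rsc_unmove() {   # usage: rsc_unmove <resource-id>
    run crm_resource -U -r "$1"
}

# Example:
#   rsc_move My-DRBD-group node2.yourdomain.org
#   rsc_unmove My-DRBD-group
```

Running crm_mon in another terminal while doing this shows the IP address, DRBD and filesystem resources changing node.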

 

Firewall considerations

You will need to make sure that the nodes can talk on ports:

DRBD: 7788 (TCP, as set in the address lines of /etc/drbd.conf)
HEARTBEAT: 694 (UDP)
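
As an illustration, the rules for one peer could be generated like this. It is a sketch under assumptions: peer_fw_rules is a made-up name, DRBD replication is taken to be TCP and heartbeat traffic UDP, and the rules are inserted into the plain INPUT chain. The function only prints the commands so they can be reviewed before being run.

```shell
# Hypothetical helper: print (not apply) iptables rules that accept DRBD and
# heartbeat traffic from one peer address.
peer_fw_rules() {   # usage: peer_fw_rules <peer-ip>
    echo "iptables -I INPUT -s $1 -p tcp --dport 7788 -j ACCEPT"
    echo "iptables -I INPUT -s $1 -p udp --dport 694 -j ACCEPT"
}

# Example, on node1 (the peer is node2):
#   peer_fw_rules 172.29.156.21
```

On node2 the same call is made with node1's address (172.29.156.20). Rules added this way are lost at reboot unless they are saved, for example with service iptables save.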

 
