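Deploying heartbeat on RHEL to build a simple failover cluster

Lab environment: RHEL 5.5 64-bit

Lab requirements: VMware virtual machines, heartbeat installation packages

Lab goal: automatic switchover between two Samba servers, plus a shared storage disk, to achieve simple failover.

Lab plan:

HOSTA:

hostname: sev1.example.com (sev1)  eth0: 192.168.138.10  eth1: 192.168.1.10 (heartbeat interface)  GW: 192.168.138.2  primary node

HOSTB:

hostname: sev2.example.com (sev2)  eth0: 192.168.138.20  eth1: 192.168.1.20 (heartbeat interface)  GW: 192.168.138.2  standby node

Steps: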

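1. Open VMware and install two virtual hosts, both running RHEL 5.5 64-bit. Make sure the Samba service is installed during OS installation. (Installing Samba after the system is up means sorting out a messy dependency chain, which is inconvenient with plain rpm!)

2. On the HOSTA virtual machine, edit the VM settings and manually add a disk to be shared, named share for now. So that both machines can mount the shared storage, the disk's parameters have to be changed. In the VM's directory, locate the newly created shared disk, edit the share.vmx file, and add the following parameters: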

disk.locking = "FALSE"

diskLib.dataCacheMaxSize=0

diskLib.dataCacheMaxReadAheadSize=0
diskLib.dataCacheMinReadAheadSize=0
diskLib.dataCachePageSize=4096
diskLib.maxUnsyncedWrites=0

scsi0:1.sharedBus = "virtual" (scsi0:1 is the virtual device node; adjust it to your actual setup)
scsi0:1.shared = "true"

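3. Boot HOSTA and log in as root (convenient for the later steps). Open a terminal, use fdisk -l to find the disk, then format it. I wanted to use the whole disk, so I did not partition it and formatted it directly as ext3. The commands:

fdisk -l    find the disk's device name: /dev/sdb

fdisk /dev/sdb, then m    (only if you want partitions; m lists the available commands, look up the rest yourself). Reboot afterwards.

mkdir -p /home/share    create the mount point

mkfs -t ext3 -c /dev/sdb    format as ext3 (-c checks for bad blocks first)

Tip: mount it manually with mount /dev/sdb /home/share to test. It worked! (Remember to umount afterwards.)

4. HOSTB does not need a new disk. When adding a hard disk, choose to use an existing disk and point it at the share disk. After creating the mount point, remember to test it; a successful mount is all you need.

5. Configure the Samba server, either (a) from the terminal, by running vi /etc/samba/smb.conf (the main configuration file), or (b) through the GUI, under Administration --> Servers --> Samba. Samba configuration is simple, so I won't go into it; the key is understanding the permissions. (I'm still a bit fuzzy on that myself!)

6. Install the heartbeat software on HOSTA

I installed from rpm: copy the packages into the VM. heartbeat-2.1.3-3 needs three packages, installed in this order: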

heartbeat-pils-2.1.3-3.el5.centos.i386.rpm

heartbeat-stonith-2.1.3-3.el5.centos.i386.rpm

heartbeat-2.1.3-3.el5.centos.i386.rpm

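How to install: cd into that directory, ls to see the files, then run rpm -ivh heartbeat-pils-2.1.3-3.el5.centos.i386.rpm (use Tab completion) and follow the prompts. Once all three packages are installed, it's worth running rpm -q heartbeat -d, which lists the documentation files the package installed (rpm -ql heartbeat lists every file). It's a good habit.

7. After heartbeat is installed, find these three files under /usr/share/doc/heartbeat-2.1.3: authkeys, haresources and ha.cf. Copy all three to /etc/ha.d. The configuration is as follows:

a. ha.cf: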

# There are lots of options in this file. All you have to have is a set
# of nodes listed {"node ...} one of {serial, bcast, mcast, or ucast},
# and a value for "auto_failback".
# ATTENTION: As the configuration file is read line by line,
# THE ORDER OF DIRECTIVE MATTERS!
# In particular, make sure that the udpport, serial baud rate
# etc. are set before the heartbeat media are defined!
# debug and log file directives go into effect when they
# are encountered.
# All will be fine if you keep them ordered as in this example.
# Note on logging:
# If any of debugfile, logfile and logfacility are defined then they
# will be used. If debugfile and/or logfile are not defined and
# logfacility is defined then the respective logging and debug
# messages will be logged to syslog. If logfacility is not defined
# then debugfile and logfile will be used to log messages. If
# logfacility is not defined and debugfile and/or logfile are not
# defined then defaults will be used for debugfile and logfile as
# required and messages will be sent there.
# File to write debug messages to
#debugfile /var/log/ha-debug
# File to write other messages to
logfile /var/log/ha-log
# Facility to use for syslog()/logger
logfacility local0
# A note on specifying "how long" times below...
# The default time unit is seconds
# 10 means ten seconds
# You can also specify them in milliseconds
# 1500ms means 1.5 seconds
# keepalive: how long between heartbeats?
keepalive 2
# deadtime: how long-to-declare-host-dead?
# If you set this too low you will get the problematic
# split-brain (or cluster partition) problem.
# See the FAQ for how to use warntime to tune deadtime.
deadtime 60
# warntime: how long before issuing "late heartbeat" warning?
# See the FAQ for how to use warntime to tune deadtime.
warntime 10
# Very first dead time (initdead)
# On some machines/OSes, etc. the network takes a while to come up
# and start working right after you've been rebooted. As a result
# we have a separate dead time for when things first come up.
# It should be at least twice the normal dead time.
initdead 120
# What UDP port to use for bcast/ucast communication?
#
udpport 694
# Baud rate for serial ports...
#baud 19200
# serial serialportname...
#serial /dev/ttyS0 # Linux
#serial /dev/cuaa0 # FreeBSD
#serial /dev/cuad0 # FreeBSD 6.x
#serial /dev/cua/a # Solaris
# What interfaces to broadcast heartbeats over?
bcast eth1 # Linux
#bcast eth1 eth2 # Linux
#bcast le0 # Solaris
#bcast le1 le2 # Solaris
# Set up a multicast heartbeat medium
# mcast [dev] [mcast group] [port] [ttl] [loop]
# [dev] device to send/rcv heartbeats on
# [mcast group] multicast group to join (class D multicast address
# 224.0.0.0 - 239.255.255.255)
# [port] udp port to sendto/rcvfrom (set this value to the
# same value as "udpport" above)
# [ttl] the ttl value for outbound heartbeats. this effects
# how far the multicast packet will propagate. (0-255)
# Must be greater than zero.
# [loop] toggles loopback for outbound multicast heartbeats.
# if enabled, an outbound packet will be looped back and
# received by the interface it was sent on. (0 or 1)
# Set this value to zero.
#mcast eth0 225.0.0.1 694 1 0
# Set up a unicast / udp heartbeat medium
# ucast [dev] [peer-ip-addr]
# [dev] device to send/rcv heartbeats on
# [peer-ip-addr] IP address of peer to send packets to
ucast eth1 192.168.1.20
# About boolean values...
# Any of the following case-insensitive values will work for true:
# true, on, yes, y, 1
# Any of the following case-insensitive values will work for false:
# false, off, no, n, 0
# auto_failback: determines whether a resource will
# automatically fail back to its "primary" node, or remain
# on whatever node is serving it until that node fails, or
# an administrator intervenes.
# The possible values for auto_failback are:
# on - enable automatic failbacks
# off - disable automatic failbacks
# legacy - enable automatic failbacks in systems
# where all nodes do not yet support
# the auto_failback option.
# auto_failback "on" and "off" are backwards compatible with the old
# "nice_failback on" setting.
# See the FAQ for information on how to convert
# from "legacy" to "on" without a flash cut.
# (i.e., using a "rolling upgrade" process)
# The default value for auto_failback is "legacy", which
# will issue a warning at startup. So, make sure you put
# an auto_failback directive in your ha.cf file.
# (note: auto_failback can be any boolean or "legacy")
#
auto_failback on
# Basic STONITH support
# Using this directive assumes that there is one stonith
# device in the cluster. Parameters to this device are
# read from a configuration file. The format of this line is:
# stonith <stonith_type> <configfile>
# NOTE: it is up to you to maintain this file on each node in the
# cluster!
#stonith baytech /etc/ha.d/conf/stonith.baytech
# STONITH support
# You can configure multiple stonith devices using this directive.
# The format of the line is:
# stonith_host <hostfrom> <stonith_type> <params...>
# <hostfrom> is the machine the stonith device is attached
# to or * to mean it is accessible from any host.
# <stonith_type> is the type of stonith device (a list of
# supported drives is in /usr/lib/stonith.)
# <params...> are driver specific parameters. To see the
# format for a particular device, run:
# stonith -l -t <stonith_type>
# Note that if you put your stonith device access information in
# here, and you make this file publically readable, you're asking
# for a denial of service attack ;-)
# To get a list of supported stonith devices, run
# stonith -L
# For detailed information on which stonith devices are supported
# and their detailed configuration options, run this command:
# stonith -h
#stonith_host * baytech 10.0.0.3 mylogin mysecretpassword
#stonith_host ken3 rps10 /dev/ttyS1 kathy 0
#stonith_host kathy rps10 /dev/ttyS1 ken3 0
# Watchdog is the watchdog timer. If our own heart doesn't beat for
# a minute, then our machine will reboot.
# NOTE: If you are using the software watchdog, you very likely
# wish to load the module with the parameter "nowayout=0" or
# compile it without CONFIG_WATCHDOG_NOWAYOUT set. Otherwise even
# an orderly shutdown of heartbeat will trigger a reboot, which is
# very likely NOT what you want.
#watchdog /dev/watchdog
# Tell what machines are in the cluster
# node nodename... -- must match uname -n
node sev1.example.com
node sev2.example.com
# Less common options...
# Treats 10.10.10.254 as a pseudo-cluster-member
# Used together with ipfail below...
# note: don't use a cluster node as ping node
ping 192.168.138.2
# Treats 10.10.10.254 and 10.10.10.253 as a pseudo-cluster-member
# called group1. If either 10.10.10.254 or 10.10.10.253 are up
# then group1 is up
# Used together with ipfail below...
#ping_group group1 10.0.0.1 10.0.0.2
# HBA ping directive for Fiber Channel
# Treats fc-card-name as pseudo-cluster-member
# used with ipfail below ...
#
# You can obtain HBAAPI from http://hbaapi.sourceforge.net. You need
# to get the library specific to your HBA directly from the vendor
# To install HBAAPI stuff, all You need to do is to compile the common
# part you obtained from the sourceforge. This will produce libHBAAPI.so
# which you need to copy to /usr/lib. You need also copy hbaapi.h to
# /usr/include.
# The fc-card-name is the name obtained from the hbaapitest program
# that is part of the hbaapi package. Running hbaapitest will produce
# a verbose output. One of the first line is similar to:
# Adapter number 0 is named: qlogic-qla2200-0
# Here fc-card-name is qlogic-qla2200-0.
#hbaping fc-card-name
# Processes started and stopped with heartbeat. Restarted unless
# they exit with rc=100
#respawn userid /path/name/to/run
#respawn root /usr/lib/heartbeat/ipfail
# Access control for client api
# default is no access
#apiauth client-name gid=gidlist uid=uidlist
#apiauth ipfail gid=root uid=root
###########################
# Unusual options.
###########################
# hopfudge maximum hop count minus number of nodes in config
#hopfudge 1
# deadping - dead time for ping nodes
#deadping 30
# hbgenmethod - Heartbeat generation number creation method
# Normally these are stored on disk and incremented as needed.
#hbgenmethod time
# realtime - enable/disable realtime execution (high priority, etc.)
# defaults to on
#realtime off
# debug - set debug level
# defaults to zero
#debug 1
# API Authentication - replaces the fifo-permissions-based system of the past
# You can put a uid list and/or a gid list.
# If you put both, then a process is authorized if it qualifies under either
# the uid list, or under the gid list.
# The groupname "default" has special meaning. If it is specified, then
# this will be used for authorizing groupless clients, and any client groups
# not otherwise specified.
# There is a subtle exception to this. "default" will never be used in the
# following cases (actual default auth directives noted in brackets)
# ipfail (uid=HA_CCMUSER)
# ccm (uid=HA_CCMUSER)
# ping (gid=HA_APIGROUP)
# cl_status (gid=HA_APIGROUP)
# This is done to avoid creating a gaping security hole and matches the most
# likely desired configuration.
#apiauth ipfail uid=hacluster
#apiauth ccm uid=hacluster
#apiauth cms uid=hacluster
#apiauth ping gid=haclient uid=alanr,root
#apiauth default gid=haclient
# message format in the wire, it can be classic or netstring,
# default: classic
#msgfmt classic/netstring
# Do we use logging daemon?
# If logging daemon is used, logfile/debugfile/logfacility in this file
# are not meaningful any longer. You should check the config file for logging
# daemon (the default is /etc/logd.cf)
# more information can be found in http://www.linux-ha.org/ha_2ecf_2fUseLogdDirective
# Setting use_logd to "yes" is recommended
use_logd yes
# the interval we reconnect to logging daemon if the previous connection failed
# default: 60 seconds
#conn_logd_time 60
# Configure compression module
# It could be zlib or bz2, depending on whether u have the corresponding
# library in the system.
#compression bz2
# Configure compression threshold
# This value determines the threshold to compress a message,
# e.g. if the threshold is 1, then any message with size greater than 1 KB
# will be compressed, the default is 2 (KB)
# compression_threshold 2
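
With this many comments it is hard to see which directives are actually switched on. The following is a small sketch for listing only the active lines of the file; active_directives is a hypothetical helper name of my own, not something heartbeat ships, and it assumes a grep that supports -E.

```shell
#!/bin/sh
# Print only the active directives of a heartbeat-style config file:
# drop blank lines and lines whose first non-blank character is '#'.
# (Trailing comments such as "bcast eth1 # Linux" are left as they are.)
active_directives() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Example, using the path from step 7:
#   active_directives /etc/ha.d/ha.cf
```

Run against the ha.cf above, this should leave just the settings that matter for this lab, such as keepalive, deadtime, bcast, ucast, auto_failback, the two node lines and ping.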

b. authkeys:

# Authentication file. Must be mode 600
# Must have exactly one auth directive at the front.
# auth send authentication using this method-id
# Then, list the method and key that go with that method-id
# Available methods: crc sha1, md5. Crc doesn't need/want a key.
# You normally only have one authentication method-id listed in this file
# Put more than one to make a smooth transition when changing auth
# methods and/or keys.
# sha1 is believed to be the "best", md5 next best.
# crc adds no security, except from packet corruption.
# Use only on physically secure networks.
auth 1
1 crc
#2 sha1 HI!
#3 md5 Hello!

Important: after editing, change the permissions of the authkeys file with chmod 600 authkeys (this step is mandatory).
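
The file above uses crc, which its own comments say adds no security beyond detecting packet corruption. As an alternative (not what I used in this lab), here is a minimal sketch that writes an authkeys file using sha1 with a random key and sets mode 600 in one go. make_authkeys is a hypothetical helper name, and the sketch assumes od and /dev/urandom are available.

```shell
#!/bin/sh
# Write a heartbeat authkeys file that uses sha1 with a random key,
# then make sure it is mode 600, as the file's comments require.
# Usage: make_authkeys /etc/ha.d/authkeys
make_authkeys() {
    target=$1
    # 16 random bytes, hex-encoded by od, gives a 32-character key
    key=$(od -An -N16 -tx1 /dev/urandom | tr -d ' \n')
    # Subshell so the restrictive umask does not leak into the caller
    ( umask 077; printf 'auth 1\n1 sha1 %s\n' "$key" > "$target" )
    chmod 600 "$target"
}
```

Generate the file once and copy that same file to the other node, since both nodes need the same key.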

c. haresources:

# This is a list of resources that move from machine to machine as
# nodes go down and come up in the cluster. Do not include
# "administrative" or fixed IP addresses in this file.
#
# The haresources files MUST BE IDENTICAL on all nodes of the cluster.
# The node names listed in front of the resource group information
# is the name of the preferred node to run the service. It is
# not necessarily the name of the current machine. If you are running
# auto_failback ON (or legacy), then these services will be started
# up on the preferred nodes - any time they're up.
# If you are running with auto_failback OFF, then the node information
# will be used in the case of a simultaneous start-up, or when using
# the hb_standby {foreign,local} command.
# BUT FOR ALL OF THESE CASES, the haresources files MUST BE IDENTICAL.
# If your files are different then almost certainly something
# won't work right.
#
# We refer to this file when we're coming up, and when a machine is being
# taken over after going down.
# You need to make this right for your installation, then install it in
# /etc/ha.d
# Each logical line in the file constitutes a "resource group".
# A resource group is a list of resources which move together from
# one node to another - in the order listed. It is assumed that there
# is no relationship between different resource groups. These
# resource in a resource group are started left-to-right, and stopped
# right-to-left. Long lists of resources can be continued from line
# to line by ending the lines with backslashes ("\").
# These resources in this file are either IP addresses, or the name
# of scripts to run to "start" or "stop" the given resource.
# The format is like this:
#node-name resource1 resource2 ... resourceN
sev1.example.com 192.168.138.23 httpd
sev1.example.com 192.168.138.24 Filesystem::/dev/sdb::/home/share::ext3 smb
# If the resource name contains an :: in the middle of it, the
# part after the :: is passed to the resource script as an argument.
# Multiple arguments are separated by the :: delimiter
# In the case of IP addresses, the resource script name IPaddr is
# implied.
# For example, the IP address 135.9.8.7 could also be represented
# as IPaddr::135.9.8.7
# THIS IS IMPORTANT!! vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
# The given IP address is directed to an interface which has a route
# to the given address. This means you have to have a net route
# set up outside of the High-Availability structure. We don't set it
# up here -- we key off of it.
# The broadcast address for the IP alias that is created to support
# an IP address defaults to the highest address on the subnet.
# The netmask for the IP alias that is created defaults to the same
# netmask as the route that it selected in the step above.
# The base interface for the IPalias that is created defaults to the
# same netmask as the route that it selected in the step above.
# If you want to specify that this IP address is to be brought up
# on a subnet with a netmask of 255.255.255.0, you would specify
# this as IPaddr::135.9.8.7/24 .
# If you wished to tell it that the broadcast address for this subnet
# was 135.9.8.210, then you would specify that this way:
# IPaddr::135.9.8.7/24/135.9.8.210
# If you wished to tell it that the interface to add the address to
# is eth0, then you would need to specify it this way:
# IPaddr::135.9.8.7/24/eth0
# And this way to specify both the broadcast address and the
# interface:
# IPaddr::135.9.8.7/24/eth0/135.9.8.210
# The IP addresses you list in this file are called "service" addresses,
# since they're the publicly advertised addresses that clients
# use to get at highly available services.
# For a hot/standby (non load-sharing) 2-node system with only
# a single service address,
# you will probably only put one system name and one IP address in here.
# The name you give the address to is the name of the default "hot"
# system.
# Where the nodename is the name of the node which "normally" owns the
# resource. If this machine is up, it will always have the resource
# it is shown as owning.
# The string you put in for nodename must match the uname -n name
# of your machine. Depending on how you have it administered, it could
# be a short name or a FQDN.
#
#-------------------------------------------------------------------
# Simple case: One service address, default subnet and netmask
# No servers that go up and down with the IP address
#just.linux-ha.org 135.9.216.110
#-------------------------------------------------------------------
# Assuming the administrative addresses are on the same subnet...
# A little more complex case: One service address, default subnet
# and netmask, and you want to start and stop http when you get
# the IP address...
#just.linux-ha.org 135.9.216.110 http
#-------------------------------------------------------------------
# A little more complex case: Three service addresses, default subnet
# and netmask, and you want to start and stop http when you get
# the IP address...
#just.linux-ha.org 135.9.216.110 135.9.215.111 135.9.216.112 httpd
#-------------------------------------------------------------------
# One service address, with the subnet, interface and bcast addr
# explicitly defined.
#just.linux-ha.org 135.9.216.3/28/eth0/135.9.216.12 httpd
#-------------------------------------------------------------------
# An example where a shared filesystem is to be used.
# Note that multiple arguments are passed to this script using
# the delimiter '::' to separate each argument.
#node1 10.0.0.170 Filesystem::/dev/sda1::/data1::ext2
# Regarding the node-names in this file:
# They must match the names of the nodes listed in ha.cf, which in turn
# must match the `uname -n` of some node in the cluster. So they aren't
# virtual in any sense of the word.
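
The comments above say that everything after :: in a resource name is handed to the resource script as arguments. To make that concrete, here is a small sketch that splits one entry the same way; split_resource is a hypothetical helper for illustration only and is not part of heartbeat.

```shell
#!/bin/sh
# Split a haresources resource entry on "::" and print the script name
# followed by each argument, one per line.
split_resource() {
    printf '%s\n' "$1" | awk -F'::' '{ for (i = 1; i <= NF; i++) print $i }'
}

# Example with the entry used in this lab:
#   split_resource 'Filesystem::/dev/sdb::/home/share::ext3'
```

For the entry in this lab, the first line printed is the script name (Filesystem) and the remaining lines are its arguments: the device /dev/sdb, the mount point /home/share and the filesystem type ext3. As far as I understand it, heartbeat then calls that script with those arguments plus start or stop.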

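8. Configure heartbeat on HOSTB

I took a lazy shortcut here. The configuration is the same as on HOSTA; the only change is in ha.cf, on the line ucast eth1 192.168.1.20, where the address becomes 192.168.1.10. So I simply logged in to HOSTA over ftp, fetched the three configuration files with GET, and changed that one line.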
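
The one-line change can also be scripted. This is a minimal sketch, assuming the copied ha.cf contains an uncommented ucast line for the given interface; set_ucast_peer is a hypothetical helper name, and the device name is used as-is inside a sed pattern, so it should be a plain name like eth1.

```shell
#!/bin/sh
# Point the "ucast <dev> <ip>" line of an ha.cf copy at a new peer address.
# Commented-out lines and other directives are left untouched.
# Usage: set_ucast_peer FILE DEVICE NEW_PEER_IP
set_ucast_peer() {
    file=$1
    dev=$2
    peer=$3
    # Write to a temporary file first, then replace the original
    sed "s/^ucast[[:space:]][[:space:]]*${dev}[[:space:]].*/ucast ${dev} ${peer}/" \
        "$file" > "$file.new" && mv "$file.new" "$file"
}

# Example on HOSTB, after copying the files over from HOSTA:
#   set_ucast_peer /etc/ha.d/ha.cf eth1 192.168.1.10
```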

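9. Start heartbeat

HOSTA: in a terminal, run: service heartbeat start    (reports OK)

HOSTB: in a terminal, run: service heartbeat start    (reports OK)

If the configuration is correct and the network is reachable, an alias interface such as eth0:0 appears automatically; it carries the virtual IP that heartbeat brings up on the active node. Remember to check that heartbeat is running with ps -ef!

Typing is tiring and screenshots are a pain to upload; I wrote all this down mainly so I can look it up when I forget. I tested it on virtual machines: switchover is automatic and the smb service is started. The httpd service was only there for early testing. Disk mounting works too. Never auto-mount the shared disk in fstab; heartbeat has to do the mounting, otherwise it won't work!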
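
A quick way to double-check the fstab warning on both nodes is sketched below. in_fstab is a hypothetical helper name; it only looks at the first field of uncommented lines, so an entry written by LABEL= or UUID= would not be caught.

```shell
#!/bin/sh
# Succeed (exit status 0) if an uncommented fstab-style line mounts DEVICE.
# Usage: in_fstab DEVICE [FSTAB_FILE]   (FSTAB_FILE defaults to /etc/fstab)
in_fstab() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/etc/fstab}"
}

# Example:
#   if in_fstab /dev/sdb; then
#       echo "remove /dev/sdb from /etc/fstab, heartbeat should mount it"
#   fi
```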

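Summary: setting up a failover cluster with heartbeat takes only simple configuration, but pay attention to the following points:

1. Set the hostname, IP addresses and so on before installing heartbeat. Check /etc/hosts, /etc/sysconfig/network and the other network configuration files, and install only once they are right.

2. heartbeat configuration is mostly ha.cf. The main things are adding the nodes, choosing the heartbeat interface, and pinging an outside address to check connectivity. authkeys only sets the authentication method; pick one. In haresources you only need to add one line for the resources to run! (That line is the essence. It took me a week, and only afterwards did I notice that the comments explain it all. Poor English really hurts...)

3. The comments in Linux configuration files matter. Read them when you have time; they help a lot when configuring!

4. Clusters fall roughly into three types: high availability (which is where failover belongs), load balancing, and high-performance computing. For large server deployments these are all essential, and I need to study them more. I don't know whether I'll get the chance to learn Veritas and Oracle!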
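
Point 1 ties in with the ha.cf comment that node names must match uname -n. The sketch below checks that a given name appears on a node line; node_listed is a hypothetical helper name, and the default path is the one used in step 7.

```shell
#!/bin/sh
# Succeed (exit status 0) if NAME appears on an uncommented "node" line
# of an ha.cf-style file.
# Usage: node_listed NAME [HA_CF]   (HA_CF defaults to /etc/ha.d/ha.cf)
node_listed() {
    awk -v name="$1" '
        $1 == "node" { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }
    ' "${2:-/etc/ha.d/ha.cf}"
}

# Example, run on each node before starting heartbeat:
#   node_listed "$(uname -n)" || echo "uname -n is not listed as a node in ha.cf"
```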
