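Implementing LVS High Availability with heartbeat v1 + ldirectord

Introduction to High-Availability Clusters

A high-availability (HA) cluster automates server fault detection and resource failover so that the service interruption caused by a failed server is as short as possible. A running server can stop serving requests because of a hardware or software fault. With HA cluster software, the standby node detects the failure of the primary node automatically, takes over the primary node's resources, and starts serving immediately, so the service switches over without manual intervention.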

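The HA Framework

The lowest layer is the Messaging Layer. It carries cluster transaction messages and the heartbeat of each node, as well as everything the CRM layer needs to send. Services at this layer listen on a broadcast or multicast address once started. Implementations include heartbeat, corosync and cman (openais), and the chosen one must run on every node.

Membership manages the topology of the cluster and shares it with the CRM above so that the CRM can make decisions. This layer mainly tracks the current members and their roles, including deciding which node is the DC (Designated Coordinator; a cluster has exactly one DC).

Resource Allocation manages resources: defining resources, grouping them, setting constraints, and monitoring how a resource is running on a given node. The Policy Engine works out the cluster transition policy and runs only on the DC; the Transition Engine carries out the Policy Engine's decisions. The cluster resource manager (CRM) at this layer operates on resources through the LRM (Local Resource Manager), which does the actual work mainly by running the scripts under /etc/init.d. Implementations of this layer include haresources (heartbeat v1), crm (heartbeat v2), pacemaker (heartbeat v3) and rgmanager.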

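Points to note when configuring an HA cluster:

1. The clocks of all cluster nodes must be kept in sync.

2. Nodes communicate with each other by name (configured in /etc/hosts).

3. SSH key authentication is set up so that nodes can reach each other without a password.

4. A quorum device (ping node) is provided.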

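An LVS High-Availability Cluster with heartbeat v1 + haresources

The LVS setup in this experiment uses the NAT model. heartbeat v1 with haresources makes the Director highly available, and ldirectord monitors the health of the Real Servers behind it.

Test environment:

Time server: 192.168.1.118

Two Directors (node1, node2): VIP: 192.168.1.200, DIP: 192.168.2.200

Real Server1: 192.168.2.12

Real Server2: 192.168.2.6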

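Configuring the LVS Environment

Set the default gateway on each Real Server to the DIP

[root@node1 ~]# ip route add default via 192.168.2.200

Enable IP forwarding on the Directors (both primary and standby)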

[root@vm1 ~]# echo 1 > /proc/sys/net/ipv4/ip_forward
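
The echo above only lasts until the next reboot. A small sketch for checking the current setting on each Director follows. The function name ip_forward_enabled is made up for this example, and its optional argument exists only so the check can be pointed at a different file.

```shell
# Hypothetical helper: succeed if IPv4 forwarding is switched on.
# The optional argument overrides the file to read
# (default: the kernel's procfs entry).
ip_forward_enabled() {
    [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}" 2>/dev/null)" = "1" ]
}

if ip_forward_enabled; then
    echo "ip_forward is on"
else
    echo "ip_forward is off"
fi
```

To keep the setting across reboots, net.ipv4.ip_forward = 1 can be set in /etc/sysctl.conf and loaded with sysctl -p.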

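Time Synchronization

On the time server:

[root@vm1 ~]# vim /etc/ntp.conf 
restrict 192.168.0.0 mask 255.255.0.0 nomodify notrap   # only allow hosts on the local network to synchronize
......
server cn.pool.ntp.org                  # upstream time servers
server 0.cn.pool.ntp.org
server 127.127.1.0                      # if the servers above are unreachable, fall back to the local
fudge 127.127.1.0 stratum 10            # system clock as the time source for clients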

[root@vm1 ~]# ntpstat
synchronised to NTP server (202.118.1.81) at stratum 3 
   time correct to within 84 ms
   polling server every 512 s

On each cluster node:

[root@node1 ~]# vim /etc/ntp.conf 
......
server 192.168.2.8

[root@node1 ha.d]# ntpstat
synchronised to NTP server (192.168.2.8) at stratum 4           # time is synchronized

Configure /etc/hosts so that the nodes can reach each other by host name

[root@node2 ~]# vim /etc/hosts
192.168.1.116   node1
192.168.1.117   node2

Set up SSH key authentication so that the two nodes can log in to each other without a password

[root@node1 ~]# ssh-keygen -t rsa -P ''
[root@node1 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node2

[root@node2 ~]# ssh-keygen -t rsa -P ''
[root@node2 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node1

With passwordless SSH in place, check that the clocks on both nodes agree

[root@node1 ha.d]# ssh node2 'date'; date
Wed Aug  5 23:49:43 CST 2015
Wed Aug  5 23:49:43 CST 2015
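
Comparing two date strings by eye works for a one-off test. A rough sketch of the same comparison in numbers is given below, assuming both nodes can print epoch seconds with date +%s; the function name clock_skew is hypothetical.

```shell
# Hypothetical helper: print the absolute difference (in seconds)
# between two epoch timestamps.
clock_skew() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    echo "$diff"
}

# Example with fixed numbers; on a real node the arguments would come from
#   clock_skew "$(date +%s)" "$(ssh node2 date +%s)"
clock_skew 1438789783 1438789780   # prints 3
```

The ssh round trip itself takes some time, so a difference of a second or so measured this way does not necessarily mean the clocks disagree.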

Installing the Packages and Editing the Configuration Files

Install these packages on every node of the cluster.

[root@node1 heartbeat]# yum install perl-TimeDate PyXML libnet net-snmp-libs heartbeat-ldirectord-2.1.4-12.el6.x86_64.rpm 
[root@node2 heartbeat]# rpm -ivh heartbeat-2.1.4-12.el6.x86_64.rpm heartbeat-pils-2.1.4-12.el6.x86_64.rpm heartbeat-stonith-2.1.4-12.el6.x86_64.rpm

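The heartbeat-ldirectord package provides the ldirectord program. On startup ldirectord builds the IPVS table automatically, then monitors the health of the Real Servers; when it finds a failed server it removes it from the IPVS table (or, with quiescent=yes, sets its weight to 0).

Main configuration file: /etc/ha.d/ha.cf

Authentication key: /etc/ha.d/authkeys

Resource definition file: /etc/ha.d/haresources

ldirectord configuration file: /etc/ha.d/ldirectord.cf

By default none of these files exist in /etc/ha.d. Copy the sample files into that directory, then set the permissions of /etc/ha.d/authkeys to 400 or 600; if the permissions are any looser, heartbeat will refuse to start.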

[root@node1 ha.d]# cp /usr/share/doc/heartbeat-2.1.4/{ha.cf,haresources,authkeys} ./ -p
[root@node1 ha.d]# chmod 600 authkeys
[root@node1 ha.d]# cp /usr/share/doc/heartbeat-ldirectord-2.1.4/ldirectord.cf /etc/ha.d
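
Since a wrong mode on authkeys keeps heartbeat from starting, it can be worth checking the mode before the first start. The following is a minimal sketch, assuming GNU stat is available; the function name authkeys_mode_ok is invented for illustration, and the demonstration runs on a scratch file rather than the real /etc/ha.d/authkeys.

```shell
# Hypothetical helper: succeed only if the file mode is exactly 400 or 600.
# Uses GNU stat (-c '%a' prints the octal permission bits).
authkeys_mode_ok() {
    mode=$(stat -c '%a' "$1" 2>/dev/null) || return 2
    case "$mode" in
        400|600) return 0 ;;
        *)       return 1 ;;
    esac
}

# Demonstration on a scratch file.
scratch=$(mktemp)
chmod 644 "$scratch"
authkeys_mode_ok "$scratch" || echo "too permissive"
chmod 600 "$scratch"
authkeys_mode_ok "$scratch" && echo "ok"
rm -f "$scratch"
```

On a node the call would be authkeys_mode_ok /etc/ha.d/authkeys.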

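Edit the main configuration file /etc/ha.d/ha.cf (everything else keeps its default)

logfile /var/log/ha-log                   # log file
keepalive 2                               # send a heartbeat every 2 seconds
udpport 694                               # port used for communication between nodes
mcast eth0 230.0.120.1 694 1 0            # multicast: interface, group address, port, TTL, loop
auto_failback on                          # fail back automatically when the primary returns
node    node1                             # cluster nodes
node    node2 
ping 192.168.1.1                          # ping node, used as the quorum device
compression     bz2                       # compress messages
compression_threshold 2                   # only compress messages larger than 2 KB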

Edit authkeys, which the nodes use to authenticate the messages they exchange

[root@node2 heartbeat]# openssl rand -hex 8        # generate a random key
4ece364b077efd89
[root@node1 ha.d]# vim authkeys 
#auth 1
#1 crc
#2 sha1 HI!
#3 md5 Hello!
auth 1                        # index of the key entry to use
1 sha1 4ece364b077efd89       # index    algorithm    random key
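
The two active lines of authkeys can also be produced by a script instead of an editor. A small sketch under that assumption follows; make_authkeys is a hypothetical name, and the layout it prints is simply the "auth 1" / "1 sha1 key" pair shown above.

```shell
# Hypothetical helper: print authkeys content for a single sha1 key at index 1.
make_authkeys() {
    printf 'auth 1\n1 sha1 %s\n' "$1"
}

# Example with the key generated above; on a node the output would be
# redirected to the real file under a restrictive umask, e.g.
#   (umask 077; make_authkeys "$(openssl rand -hex 8)" > /etc/ha.d/authkeys)
make_authkeys 4ece364b077efd89
```

Writing the file under umask 077 gives it mode 600 from the start, so no separate chmod is needed.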

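Configure ldirectord (/etc/ha.d/ldirectord.cf). The parameters under "#http virtual service" define one cluster service: the VIP, the RIPs, the LVS model and the scheduling algorithm. When defining resources only the ldirectord service needs to be listed (ipvsadm does not), because ldirectord calls ipvsadm on startup to add the cluster service.

# Global Directives               # global settings, applied to every virtual service
checktimeout=3                    # timeout for probing a real server
checkinterval=1                   # interval between probes
#fallback=127.0.0.1:80
autoreload=yes                      # reload automatically when the configuration file changes
#logfile="/var/log/ldirectord.log"  # write to a log file of its own
#logfile="local0"                   # log through syslog
#emailalert="[email protected]"
#emailalertfreq=3600                # interval between alert e-mails
#emailalertstatus=all
quiescent=yes                       # quiescent mode: a failed real server gets weight 0 instead of being removed

#http virtual service
virtual=192.168.1.200:80                 # VIP
        real=192.168.2.12:80 masq        # real servers; gate means the DR model, masq the NAT model
        real=192.168.2.6:80 masq
        fallback=127.0.0.1:80 masq
        service=http                     # protocol used for the health check
        request="test.html"              # page requested by the check
        receive="ok"                     # string expected in the page
        scheduler=rr                     # scheduler
        #persistent=600                  # persistent connections; 600 is the timeout in seconds
        #netmask=255.255.255.255
        protocol=tcp                      # the virtual service is a TCP service
        checktype=negotiate               # check method
        checkport=80                      # port to probe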
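
With checktype=negotiate, the check for this virtual service amounts roughly to fetching the request page from each real server and looking for the receive pattern in the response. The sketch below imitates that by hand so a real server can be tested before ldirectord is started. The function names body_matches and check_real_server are hypothetical, curl is assumed to be installed, and this is not how ldirectord itself is implemented.

```shell
# Hypothetical helper: succeed if the response body ($1) matches
# the receive pattern ($2).
body_matches() {
    printf '%s\n' "$1" | grep -q -e "$2"
}

# Hypothetical helper: fetch the request page from one real server
# (given as ADDR:PORT) and test it.
# Needs curl; the 3-second limit mirrors checktimeout=3.
check_real_server() {
    body=$(curl -s --max-time 3 "http://$1/test.html") || return 1
    body_matches "$body" "ok"
}

# Offline demonstration with literal bodies.
body_matches "ok" "ok" && echo "would stay in rotation"
body_matches "Service Unavailable" "ok" || echo "would be marked down"
```

From a Director, check_real_server 192.168.2.12:80 would run the same test against Real Server1.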

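Once this is configured, remember to add the test page under httpd's DocumentRoot on every Real Server. The page name must match request, and the page must contain the string given in receive. If fallback is configured, httpd must also be running on each Director with an index.html page, so that users get a notice when all the Real Servers behind are down.

On each Real Server
[root@node1 ~]# vim /httpd_dir/test.html 
ok

On each cluster node
[root@node1 ha.d]# vim /var/www/html/index.html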
<h1>Sorry</h1>

Define the resources in /etc/ha.d/haresources

Format:

primary-node IP/mask/iface resource # everything after the primary node is a resource; resources are separated by spaces or tabs.

[root@node2 ~]# vim /etc/ha.d/haresources
node1   192.168.1.200/24/eth0   192.168.2.200/24/eth1   ldirectord::/etc/ha.d/ldirectord.cf

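ldirectord needs its configuration file passed in when it starts. Each line defines one group of resources; when the primary node fails, the whole group moves to the standby node. Here 192.168.1.200 is the VIP and 192.168.2.200 is the DIP.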
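
The line format is simple enough to generate. A minimal sketch follows, assuming tabs as the separator; haresources_line is a hypothetical helper, not part of heartbeat.

```shell
# Hypothetical helper: join a primary node name and its resources with tabs.
haresources_line() {
    node=$1
    shift
    printf '%s' "$node"
    for res in "$@"; do
        printf '\t%s' "$res"
    done
    printf '\n'
}

# Rebuilds the line used in this setup: node, VIP, DIP, then ldirectord
# with its configuration file as the argument after "::".
haresources_line node1 192.168.1.200/24/eth0 192.168.2.200/24/eth1 \
    ldirectord::/etc/ha.d/ldirectord.cf
```

Generating the line on one node and copying the file helps keep haresources identical on both nodes.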

Copy the four configuration files to the other node (-p preserves permissions)

[root@node1 ha.d]# scp -p haresources ha.cf authkeys ldirectord.cf root@node2:/etc/ha.d/

Make sure ldirectord is stopped on every node and does not start at boot; heartbeat starts and stops it as a resource.

[root@node1 ha.d]# chkconfig ldirectord off ; ssh node2 'chkconfig ldirectord off'

Start the services and test

[root@node1 ha.d]# service heartbeat start ; ssh node2 'service heartbeat start'

On the primary node:

The VIP, the DIP and the cluster service are all up.

Take the primary node out of service (simulating a server failure)

[root@node1 ha.d]# /usr/lib64/heartbeat/hb_standby 
2015/08/06_01:17:42 Going standby [all].

The resources have moved to node2.

Stop httpd on one of the Real Servers (Real Server1)

[root@node1 ~]# service httpd stop
Stopping httpd:                                            [  OK  ]

The Director in front has marked that Real Server as unavailable (Weight=0), and only the page from Real Server2 can be reached. Stop httpd on Real Server2 as well, and only the fallback page is served.
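
The Weight=0 state can be picked out of the IPVS table with a short filter. The sketch below assumes the usual ipvsadm -L -n layout in which each real-server row starts with "->" followed by address, forwarding method and weight. The function name quiesced_servers is hypothetical, and the sample rows are illustrative text, not output captured from this test cluster.

```shell
# Hypothetical helper: read `ipvsadm -L -n` style output on stdin and print
# the address of every real server whose weight is 0.
# Assumes rows of the form: -> ADDR:PORT Forward Weight ActiveConn InActConn
quiesced_servers() {
    awk '$1 == "->" && $4 == "0" { print $2 }'
}

# Offline demonstration on sample text.
quiesced_servers <<'EOF'
TCP  192.168.1.200:80 rr
  -> 192.168.2.12:80              Masq    0      0          0
  -> 192.168.2.6:80               Masq    1      0          0
EOF
```

On the active Director the live table would be piped in instead: ipvsadm -L -n | quiesced_servers.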

Testing complete. ^_^
