Clearing Up Questions About the Oracle RAC Heartbeat

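1. What the RAC heartbeat is for:
It monitors the health of the network between cluster nodes, and the same private interconnect also carries cache synchronization and global resource management traffic. With Cache Fusion, data blocks are shipped across it as well, so interconnect traffic is fairly heavy. Gigabit Ethernet is the usual choice; 10 Gigabit is better still.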

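2. Can the RAC heartbeat use a directly connected (crossover) cable?
A direct cable limits RAC to two nodes and is also unreliable. Oracle does not provide technical support for the bugs and technical problems that result from it.
For Oracle's official explanation, see this entry in RAC: Frequently Asked Questions [ID 220970.1]: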

Is crossover cable supported as an interconnect with RAC on any platform ?

NO. CROSS OVER CABLES ARE NOT SUPPORTED. The requirement is to use a switch:

 Detailed Reasons:

 1) cross-cabling limits the expansion of RAC to two nodes

 2) cross-cabling is unstable:

 a) Some NIC cards do not work properly with it. They are not able to negotiate the DTE/DCE clocking, and will thus not function. These NICS were made cheaper by assuming that the switch was going to have the clock. Unfortunately there is no way to know which NICs do not have that clock.

 b) Media sense behaviour on various OS's (most notably Windows) will bring a NIC down when a cable is disconnected. Either of these issues can lead to cluster instability and lead to ORA-29740 errors (node evictions).

 Due to the benefits and stability provided by a switch, and their affordability ($200 for a simple 16 port GigE switch), and the expense and time related to dealing with issues when one does not exist, this is the only supported configuration.

 From a purely technology point of view Oracle does not care if the customer uses cross over cable or router or switches to deliver a message. However, we know from experience that a lot of adapters misbehave when used in a crossover configuration and cause a lot of problems for RAC. Hence we have stated on certify that we do not support crossover cables to avoid false bugs and finger pointing amongst the various parties: Oracle, Hardware vendors, Os vendors etc...

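3. High availability for the RAC heartbeat
The heartbeat can be made highly available by bonding two network ports at the operating-system level. The two common bonding styles are load balancing and active-backup. Load balancing nominally doubles the bandwidth (in practice it falls short of that, though throughput does improve somewhat), but from a reliability standpoint active-backup is the recommended mode. In active-backup mode, when one network interface fails (for example, because the primary switch loses power), the network is not interrupted: the system keeps working in the NIC order specified in /etc/rc.d/rc.local and the machine continues to serve traffic, which provides the failover protection.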
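
On Linux, one way to confirm that an active-backup bond has failed over is to read the state the bonding driver exposes under /proc/net/bonding/. The following is a minimal sketch, assuming the usual "Key: value" layout of that file; the bond name bond0 and the helper names parse_bond_status and failed_slaves are illustrative assumptions, not part of any Oracle or kernel tooling.

```python
from pathlib import Path


def parse_bond_status(text):
    """Parse the text of /proc/net/bonding/<bond> into a small dict.

    Assumes the 'Key: value' layout normally written by the Linux
    bonding driver; unknown keys are ignored.
    """
    status = {"mode": None, "active_slave": None, "slaves": {}}
    current_slave = None
    for raw in text.splitlines():
        key, sep, value = raw.partition(":")
        if not sep:
            continue  # blank line or a line without a key
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            status["mode"] = value
        elif key == "Currently Active Slave":
            status["active_slave"] = value
        elif key == "Slave Interface":
            current_slave = value
            status["slaves"][current_slave] = None
        elif key == "MII Status" and current_slave is not None:
            # Only record link state once we are inside a slave section.
            status["slaves"][current_slave] = value
    return status


def failed_slaves(status):
    """Return the names of slave interfaces whose MII status is not 'up'."""
    return sorted(name for name, mii in status["slaves"].items() if mii != "up")


if __name__ == "__main__":
    path = Path("/proc/net/bonding/bond0")  # hypothetical bond name
    if path.exists():
        info = parse_bond_status(path.read_text())
        print("mode:", info["mode"])
        print("active slave:", info["active_slave"])
        print("failed slaves:", failed_slaves(info))
    else:
        print(f"{path} not found; is the bond configured on this host?")
```

After a switch or cable failure, the active slave reported here should change to the surviving interface while the failed one appears in the failed list.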


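Supplementary material:

Linux bond mode parameter (mode=4 is recommended when the switch supports LACP, since it offers better performance and stability):

0 - Round-robin mode (balance-rr): traffic is spread evenly across the bonded NICs in round-robin order.
1 - High-availability mode (active-backup): only one NIC is in use at any time and the others stand by as backups. Recommended when the load does not exceed the bandwidth of a single NIC.
2 - Hash-based load-balancing mode (balance-xor): the outgoing NIC is chosen by a hash over the protocol layers selected by xmit_hash_policy (MAC, IP, or TCP/UDP port fields), so that traffic from the same source stays on the same NIC as far as possible.
3 - Broadcast mode: every bonded NIC sends the same data. Generally used only for very specific network requirements, such as sending identical data to two switches that are not connected to each other.
4 - 802.3ad load-balancing mode (LACP): the switch must also support 802.3ad. In theory, when both server and switch support it, two bonded NICs can double the available bandwidth (for example from 1 Gbps to 2 Gbps).
5 - Adapter transmit load balancing (balance-tlb): outgoing data is spread across all bonded NICs, while incoming data is received on one selected NIC. If the receiving NIC fails, another NIC takes over. The NICs and their drivers must be able to report speed information through ethtool.
6 - Adapter transmit/receive load balancing (balance-alb): builds on mode 5 by also load balancing received traffic. Besides speed information being available through ethtool, the NICs must support changing their MAC address dynamically.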
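
As a quick reference, the mode numbers above map onto the names the Linux bonding driver accepts. The sketch below is only illustrative: the helper names bond_mode_name and bonding_opts are hypothetical, and the miimon value of 100 ms is an assumed link-monitoring interval, not a figure from this article.

```python
# Mode numbers and the names the Linux bonding driver uses for them.
BOND_MODES = {
    0: "balance-rr",
    1: "active-backup",
    2: "balance-xor",
    3: "broadcast",
    4: "802.3ad",
    5: "balance-tlb",
    6: "balance-alb",
}


def bond_mode_name(mode):
    """Return the driver's name for a numeric bond mode (int or digit string)."""
    try:
        return BOND_MODES[int(mode)]
    except (KeyError, ValueError, TypeError):
        raise ValueError(f"unknown bond mode: {mode!r}") from None


def bonding_opts(mode, miimon_ms=100):
    """Build a 'mode=... miimon=...' option string for the bonding driver.

    miimon_ms=100 is an illustrative link-monitoring interval (assumption).
    """
    return f"mode={bond_mode_name(mode)} miimon={miimon_ms}"


if __name__ == "__main__":
    # Active-backup is the mode this article recommends for the heartbeat.
    print(bonding_opts(1))
```

Using the named form rather than the bare number makes a bonding configuration easier to review.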

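4. Feasibility of a dual heartbeat for RAC
With two bonded ports, the RAC heartbeat has a single private address in a single VLAN and runs in active-backup mode, with the two cables going to two different switches. All of this can be done at the operating-system level. If the heartbeat instead uses two private VLANs, it has two private addresses, and load balancing or failover between the two heartbeat addresses is handled by the Oracle software itself (no bonding is done at the OS layer). Oracle supports this starting with 11g R2 release 11.2.0.2. Because this new feature, HAIP, had bugs when it was first released, 11.2.0.4 is recommended as the more stable version. Oracle's own example targets multiple database instances with high interconnect bandwidth requirements.
For the official description, see http://docs.oracle.com/database/121/RACAD/admin.htm#RACAD7295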

Document ID 1210883.1 covers HAIP in detail. Its description of HAIP reads as follows:
Redundant Interconnect without any 3rd-party IP failover technology (bond, IPMP or similar) is supported natively by Grid Infrastructure starting from 11.2.0.2.  Multiple private network adapters can be defined either during the installation phase or afterward using the oifcfg.  Oracle Database, CSS, OCR, CRS, CTSS, and EVM components in 11.2.0.2 employ it automatically.

Grid Infrastructure can activate a maximum of four private network adapters at a time even if more are defined. The ora.cluster_interconnect.haip resource will start one to four link local  HAIP on private network adapters for interconnect communication for Oracle RAC, Oracle ASM, and Oracle ACFS etc.

Grid automatically picks free link local addresses from reserved 169.254.*.* subnet for HAIP. According to RFC-3927, link local subnet 169.254.*.* should not be used for any other purpose. With HAIP, by default, interconnect traffic will be load balanced across all active interconnect interfaces, and corresponding HAIP address will be failed over transparently to other adapters if one fails or becomes non-communicative.

The number of HAIP addresses is decided by how many private network adapters are active when Grid comes up on the first node in the cluster. If there's only one active private network, Grid will create one; if two, Grid will create two; and if more than two, Grid will create four HAIPs. The number of HAIPs won't change even if more private network adapters are activated later, a restart of clusterware on all nodes is required for the number to change, however, the newly activated adapters can be used for fail over purpose.
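
The counting rule quoted above (one adapter gives one HAIP, two give two, more than two give four) and the reserved 169.254.*.* range can be written down as a small sketch. The function names haip_count and in_haip_subnet are hypothetical helpers for illustration only; they do not query Grid Infrastructure.

```python
import ipaddress

# Link-local range that, per the quoted note, Grid reserves for HAIP.
LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")


def haip_count(active_private_adapters):
    """Number of HAIP addresses Grid creates, following the rule quoted above.

    The argument is the number of private adapters active when Grid comes
    up on the first node of the cluster.
    """
    if active_private_adapters < 1:
        raise ValueError("at least one active private adapter is required")
    if active_private_adapters <= 2:
        return active_private_adapters
    return 4


def in_haip_subnet(address):
    """True if the address falls inside 169.254.0.0/16."""
    return ipaddress.ip_address(address) in LINK_LOCAL


if __name__ == "__main__":
    for adapters in (1, 2, 3, 4):
        print(adapters, "adapter(s) ->", haip_count(adapters), "HAIP address(es)")
    print(in_haip_subnet("169.254.10.20"))
```

A check such as in_haip_subnet can help spot hosts that already use the link-local range for something else before HAIP is enabled.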

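5. Does the RAC heartbeat of each business system's database need its own VLAN?
Oracle gives no explicit official guidance. Where specific security requirements call for it, you can isolate the heartbeats in separate VLANs yourself, but a large number of small VLANs adds some management and configuration cost.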

