Windows x64 platform with more than 32 CPU cores: blue screen when installing 11.2.0.3 GI

Yesterday a colleague hit a blue screen while installing 11g RAC. A search on MetaLink turned up the following note:

Oracle Clusterware installation (10.2 - 11.1 CRS or 11.2 Grid Infrastructure) fails with a blue screen. WinDbg shows BugCheck D1 (DRIVER_IRQL_NOT_LESS_OR_EQUAL) against orafencedrv.sys, orafenceservice.sys or ntkrnlmp.exe on an x64 Windows cluster with more than 32 Logical Processor(s).

OR

Oracle Clusterware fails to start with the same symptoms after the number of Logical Processor(s) is increased to above 32.


To find out the number of Logical Processor(s):

~~~~
Run "msinfo32" from the command line or via the "Run" option.
There will be a section like this.
~~~~~
...
System Type x64-based PC
Processor Intel(R) Xeon(R) CPU           E7540  @ 2.00GHz, 2000 Mhz, 6 Core(s), 12 Logical Processor(s)
Processor Intel(R) Xeon(R) CPU           E7540  @ 2.00GHz, 2000 Mhz, 6 Core(s), 12 Logical Processor(s)
Processor Intel(R) Xeon(R) CPU           E7540  @ 2.00GHz, 2000 Mhz, 6 Core(s), 12 Logical Processor(s)
Processor Intel(R) Xeon(R) CPU           E7540  @ 2.00GHz, 2000 Mhz, 6 Core(s), 12 Logical Processor(s)
....

This shows 4 physical CPUs, each with 12 Logical Processors. As far as orafenceservice.sys is concerned, that is 48 (4 x 12) processors, which triggers the problem described in this note.
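The arithmetic above can be sketched as a small script. This is only an illustration: the function name `count_logical_processors` and the constant `FENCE_DRIVER_LIMIT` are made up for this example, and the parsing assumes the msinfo32 text looks like the sample shown in the note.

```python
import re

# Limit stated in the note for the x64 fencing code (name is hypothetical).
FENCE_DRIVER_LIMIT = 32


def count_logical_processors(msinfo_text: str) -> int:
    """Sum the 'N Logical Processor(s)' figures over all 'Processor' lines."""
    pattern = re.compile(r"(\d+)\s+Logical Processor\(s\)")
    total = 0
    for line in msinfo_text.splitlines():
        # Only lines describing a physical CPU package are of interest.
        if line.lstrip().startswith("Processor"):
            match = pattern.search(line)
            if match:
                total += int(match.group(1))
    return total


# Sample copied from the msinfo32 excerpt in the note (4 identical CPUs).
cpu_line = ("Processor Intel(R) Xeon(R) CPU           E7540  @ 2.00GHz, "
            "2000 Mhz, 6 Core(s), 12 Logical Processor(s)")
sample = "System Type x64-based PC\n" + "\n".join([cpu_line] * 4)

total = count_logical_processors(sample)
over_limit = total > FENCE_DRIVER_LIMIT
print(total, over_limit)  # 48 True
```

Any total above 32 means the server falls into the range the note describes.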

 

 

The issue is caused by Bug 10637621. Relevant information can be found in the following bugs:

Bug:9901433 BLUE SCREEN OCCURED DURING INSTALLING CLUSTERWARE for 11.1.0.7.
Bug:10027132 DURING INSTALLATION GI WIN X64 11.2 SERVER BUGCHECKS 0XD1 IN ORAFENCESERVIC.SYS (determined to be a duplicate of Bug:9901433)
Bug:10637621 11.2.0.2 WIN GI INSTALL @ BUGCHECKS WITH 0XD1 IN ORAFENCESERVICE.SYS

For 11.2.0.3, the issue is reported in Bug 14276345 - WINDOWS: Installing 11.2.0.3 GI on server with more than 32 CPU may crash (Doc ID 14276345.8)

The x64 fencing code currently has a limit of 32 processors.

SOLUTION

Bug 10637621 is fixed in:

   10.2.0.4 Patch 43 patch 11731126
   10.2.0.5 Patch 9 patch 12332704
   11.2.0.2 Patch 3 patch 11731184
 
For 11.2.0.3, Bug 14276345 is fixed in 11.2.0.3 Patch 11 onwards.
    
It is recommended to apply the latest patches to fix the issue. If a patch is unavailable, please engage Oracle Support to request one.
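The fix levels listed in this section can be kept in a small lookup table when checking several servers. The sketch below is only an illustration: the names `FIXED_IN_BUNDLE` and `has_fence_fix` are hypothetical, and it assumes that bundle patches later than the listed one also contain the fix, which the note states explicitly only for 11.2.0.3.

```python
# Minimum bundle patch per release, copied from the note.
# Assumption: later bundle patches of the same release also carry the fix.
FIXED_IN_BUNDLE = {
    "10.2.0.4": 43,  # patch 11731126
    "10.2.0.5": 9,   # patch 12332704
    "11.2.0.2": 3,   # patch 11731184
    "11.2.0.3": 11,  # Bug 14276345
}


def has_fence_fix(version, bundle_patch):
    """Return True or False for releases listed in the note, None otherwise."""
    minimum = FIXED_IN_BUNDLE.get(version)
    if minimum is None:
        # Release not covered by the note: no statement can be made.
        return None
    return bundle_patch >= minimum


print(has_fence_fix("11.2.0.3", 10))  # False
print(has_fence_fix("11.2.0.3", 11))  # True
print(has_fence_fix("11.1.0.7", 1))   # None
```

Returning `None` for unlisted releases keeps the helper from guessing about versions the note does not cover.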


As the issue affects the initial clusterware configuration, here are a few workarounds:

1. Reduce the number of CPUs to below 32, install clusterware, apply the necessary patches, and then restore the original number of CPUs.

2. Or install on a node with fewer than 32 CPUs, apply the necessary patches, and clone it to the target cluster.

3. Or, for 11.2.0.2 and above, use the new Software Update Option feature to apply the patch before clusterware is configured. Refer to screenshots for details.

 

 

In the end we simply pulled out the extra CPUs... and put them back once the installation was done.
