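The OCR (Oracle Cluster Registry) stores the configuration information for the whole cluster. It lives on shared storage, and if it is damaged, CRS stops working properly.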
[root@racr1 ~]# su - oracle
racr1->
racr1-> ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 2
Total space (kbytes) : 524024
Used space (kbytes) : 3832
Available space (kbytes) : 520192
ID : 801841162
Device/File Name : /dev/raw/raw1
Device/File integrity check succeeded
Device/File not configured
Cluster registry integrity check succeeded
racr1->
Under normal conditions, the ocrcheck command shows where the OCR is stored and what state it is in.
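To use that output in a script rather than reading it by eye, something like the following could work. It is only a sketch: the function names ocr_devices and ocr_is_healthy are made up for this example, and the parsing assumes the 10.2 output layout shown above.

```shell
# Read ocrcheck output on stdin and print every configured OCR device/file.
# Lines such as "Device/File not configured" do not match and are skipped.
ocr_devices() {
    sed -n 's|^[[:space:]]*Device/File Name[[:space:]]*:[[:space:]]*||p'
}

# Read ocrcheck output on stdin; succeed only if the overall
# cluster registry integrity check is reported as passed.
ocr_is_healthy() {
    grep -q 'Cluster registry integrity check succeeded'
}
```

Typical use would be `ocrcheck | ocr_devices`, or `ocrcheck | ocr_is_healthy || echo "OCR problem"`.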
There is a second command, ocrconfig, which is used to maintain the OCR device:
racr1-> ocrconfig
Name:
ocrconfig - Configuration tool for Oracle Cluster Registry.
Synopsis:
ocrconfig [option]
option:
-export <filename> [-s online]
- Export cluster register contents to a file
-import <filename> - Import cluster registry contents from a file
-upgrade [<user> [<group>]]
- Upgrade cluster registry from previous version
-downgrade [-version <version string>]
- Downgrade cluster registry to the specified version
-backuploc <dirname> - Configure periodic backup location
-showbackup - Show backup information
-restore <filename> - Restore from physical backup
-replace ocr|ocrmirror [<filename>] - Add/replace/remove a OCR device/file
-overwrite - Overwrite OCR configuration on disk
-repair ocr|ocrmirror <filename> - Repair local OCR configuration
-help - Print out this help information
Note:
A log file will be created in
$ORACLE_HOME/log/<hostname>/client/ocrconfig_<pid>.log. Please ensure
you have file creation privileges in the above directory before
running this tool.
racr1->
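As the help text shows, the command has export, import, and restore options, so this is the tool for backing up and restoring the OCR.
The following demonstrates how to recover the OCR after it has been corrupted.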
[root@racr1 ~]# cd /u01/app/oracle/product/10.2.0/crs_1/bin/
[root@racr1 bin]#
[root@racr1 bin]# ./ocrconfig -export /home/oracle/ocrexp.exp -s online
[root@racr1 bin]#
[root@racr1 bin]# cd /home/oracle/
[root@racr1 oracle]# ls
ocrexp.exp OracleHome.tar
[root@racr1 oracle]#
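First, take a backup of the OCR with -export. By default the OCR is also backed up automatically every 4 hours; ocrconfig -showbackup lists those automatic backups.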
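If exports are taken regularly, a dated file name avoids overwriting the previous one. A small sketch, assuming the same directories as in the transcript; the helper names and the "ocrexp_" prefix are illustrative and not part of any Oracle tool. The second helper only prints the command so it can be reviewed before being run as root.

```shell
# Build a dated export file name, e.g. /home/oracle/ocrexp_20240101_120000.exp.
# $1 = target directory, $2 = timestamp such as "$(date +%Y%m%d_%H%M%S)".
ocr_export_name() {
    dir=$1
    stamp=$2
    printf '%s/ocrexp_%s.exp\n' "${dir%/}" "$stamp"
}

# Print (do not run) the export command used in the transcript.
# $1 = CRS bin directory, $2 = export file.
ocr_export_cmd() {
    crs_bin=$1
    target=$2
    printf '%s/ocrconfig -export %s -s online\n' "${crs_bin%/}" "$target"
}
```

For example, `ocr_export_cmd /u01/app/oracle/product/10.2.0/crs_1/bin "$(ocr_export_name /home/oracle "$(date +%Y%m%d_%H%M%S)")"` prints the full command line.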
With the backup in place, it is time to break things.
[root@racr1 bin]# dd if=/dev/zero of=/dev/raw/raw1 bs=1024 count=102400
102400+0 records in
102400+0 records out
104857600 bytes (105 MB) copied, 20.4541 seconds, 5.1 MB/s
[root@racr1 bin]#
[root@racr1 bin]#
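This overwrites the first 100 MB of the device with zeros and destroys the OCR the cluster is currently using. Run ocrcheck again to see the result: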
[root@racr1 oracle]# su - oracle
racr1->
racr1-> ocrcheck
PROT-601: Failed to initialize ocrcheck
The command can no longer run, so the OCR must be corrupted. The cluvfy tool confirms it:
racr1-> /opt/clusterware/cluvfy/runcluvfy.sh comp ocr -n all
Verifying OCR integrity
Unable to retrieve nodelist from Oracle clusterware.
Verification cannot proceed.
racr1->
Since it is broken, restore it.
[root@racr1 bin]# ./ocrconfig -import /opt/ocrexp.exp
PROT-19: Cannot proceed while clusterware is running. Shutdown clusterware first
[root@racr1 bin]#
[root@racr1 bin]# ./crsctl stop crs
OCR initialization failed with invalid format: PROC-22: The OCR backend has an invalid format
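The restore hit a snag: the import requires the clusterware to be shut down, but shutting it down fails because the OCR is invalid. The only option left is to kill a process. But which one?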
[root@racr1 bin]# ps -ef | grep crs
root 17039 1841 0 23:51 pts/4 00:00:00 grep crs
root 17715 14950 0 22:53 ? 00:00:00 /bin/su -l oracle -c sh -c 'ulimit -c unlimited; cd /u01/app/oracle/product/10.2.0/crs_1/log/racr1/evmd; exec /u01/app/oracle/product/10.2.0/crs_1/bin/evmd '
oracle 17719 17715 0 22:53 ? 00:00:00 /u01/app/oracle/product/10.2.0/crs_1/bin/evmd.bin
root 18171 17860 0 22:53 ? 00:00:00 /u01/app/oracle/product/10.2.0/crs_1/bin/oprocd.bin run -t 1000 -m 500 -f
root 18218 17867 0 22:53 ? 00:00:00 /sbin/runuser -l oracle -c /bin/sh -c 'cd /u01/app/oracle/product/10.2.0/crs_1/log/racr1/cssd/oclsomon; ulimit -c unlimited; /u01/app/oracle/product/10.2.0/crs_1/bin/oclsomon || exit $?'
oracle 18219 18218 0 22:53 ? 00:00:00 /bin/sh -c cd /u01/app/oracle/product/10.2.0/crs_1/log/racr1/cssd/oclsomon; ulimit -c unlimited; /u01/app/oracle/product/10.2.0/crs_1/bin/oclsomon || exit $?
oracle 18243 18219 0 22:53 ? 00:00:00 /u01/app/oracle/product/10.2.0/crs_1/bin/oclsomon.bin
oracle 18344 17920 0 22:53 ? 00:00:03 /u01/app/oracle/product/10.2.0/crs_1/bin/ocssd.bin
oracle 18495 17719 0 22:53 ? 00:00:00 /u01/app/oracle/product/10.2.0/crs_1/bin/evmlogger.bin -o /u01/app/oracle/product/10.2.0/crs_1/evm/log/evmlogger.info -l /u01/app/oracle/product/10.2.0/crs_1/evm/log/evmlogger.log
oracle 18781 1 0 22:53 ? 00:00:00 /u01/app/oracle/product/10.2.0/crs_1/opmn/bin/ons -d
oracle 18782 18781 0 22:53 ? 00:00:00 /u01/app/oracle/product/10.2.0/crs_1/opmn/bin/ons -d
root 23225 1 0 23:03 ? 00:00:00 /bin/sh /etc/init.d/init.crsd run
root 24573 23225 0 23:06 ? 00:00:06 /u01/app/oracle/product/10.2.0/crs_1/bin/crsd.bin restart
[root@racr1 bin]#
[root@racr1 bin]# kill -9 24573
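Picking the PID out of a long ps listing by eye is easy to get wrong. A sketch of how it could be filtered instead, assuming the `ps -ef` column layout shown above (PID in column 2, command in column 8); the function name crsd_pids is made up for this example.

```shell
# Read `ps -ef` output on stdin and print the PID of every process whose
# command path ends in /crsd.bin. The init.crsd wrapper (/bin/sh ...) and
# the grep process itself do not match.
crsd_pids() {
    awk '$8 ~ /\/crsd\.bin$/ { print $2 }'
}
```

Used as `ps -ef | crsd_pids`, this would have printed 24573 for the listing above.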
As the listing shows, the process to kill is crsd.bin (PID 24573), the child of init.crsd. After that, the OCR backup can be imported:
[root@racr1 bin]# ./ocrconfig -import /opt/ocrexp.exp
[root@racr1 bin]# ./crsctl start crs
Attempting to start CRS stack
The CRS stack will be started shortly
[root@racr1 bin]#
[root@racr1 bin]# ./crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
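The order of the steps matters here, so it may help to write it down. The sketch below only prints the commands in the order this transcript ran them; it executes nothing, and the function name ocr_restore_plan and its three arguments are assumptions made for illustration.

```shell
# Print the recovery steps in the order used above.
# $1 = CRS bin directory, $2 = export file to import, $3 = PID of crsd.bin.
# Nothing is executed; review the output and run it by hand as root.
ocr_restore_plan() {
    crs_bin=${1%/}
    export_file=$2
    crsd_pid=$3
    printf 'kill -9 %s\n' "$crsd_pid"
    printf '%s/ocrconfig -import %s\n' "$crs_bin" "$export_file"
    printf '%s/crsctl start crs\n' "$crs_bin"
    printf '%s/crsctl check crs\n' "$crs_bin"
}
```

The kill step is only there because `crsctl stop crs` failed in this case; when the stack can be stopped cleanly, that is the step to use instead.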
The cluster stack is now up, but the resources are not yet usable:
[root@racr1 bin]# ./crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....D1.inst application ONLINE OFFLINE
ora....D2.inst application ONLINE OFFLINE
ora.PROD.db application ONLINE OFFLINE
ora....SM1.asm application ONLINE OFFLINE
ora....R1.lsnr application ONLINE OFFLINE
ora.racr1.gsd application ONLINE OFFLINE
ora.racr1.ons application ONLINE OFFLINE
ora.racr1.vip application ONLINE OFFLINE
ora....SM2.asm application ONLINE OFFLINE
ora....R2.lsnr application ONLINE OFFLINE
ora.racr2.gsd application ONLINE OFFLINE
ora.racr2.ons application ONLINE OFFLINE
ora.racr2.vip application ONLINE OFFLINE
[root@racr1 bin]#
[root@racr1 bin]# crs_start -all
-bash: crs_start: command not found
[root@racr1 bin]# ./crs_start -all
Attempting to start `ora.racr1.ASM1.asm` on member `racr1`
Attempting to start `ora.racr1.vip` on member `racr1`
Attempting to start `ora.racr2.vip` on member `racr2`
Attempting to start `ora.racr2.ASM2.asm` on member `racr2`
Start of `ora.racr1.vip` on member `racr1` succeeded.
Attempting to start `ora.racr1.LISTENER_RACR1.lsnr` on member `racr1`
Start of `ora.racr2.vip` on member `racr2` succeeded.
Attempting to start `ora.racr2.LISTENER_RACR2.lsnr` on member `racr2`
Start of `ora.racr1.LISTENER_RACR1.lsnr` on member `racr1` succeeded.
Start of `ora.racr2.LISTENER_RACR2.lsnr` on member `racr2` succeeded.
Start of `ora.racr1.ASM1.asm` on member `racr1` succeeded.
Attempting to start `ora.PROD.PROD1.inst` on member `racr1`
Start of `ora.racr2.ASM2.asm` on member `racr2` succeeded.
Attempting to start `ora.PROD.PROD2.inst` on member `racr2`
Start of `ora.PROD.PROD1.inst` on member `racr1` succeeded.
Start of `ora.PROD.PROD2.inst` on member `racr2` succeeded.
CRS-1002: Resource 'ora.racr1.ons' is already running on member 'racr1'
CRS-1002: Resource 'ora.racr2.ons' is already running on member 'racr2'
CRS-1002: Resource 'ora.PROD.db' is already running on member 'racr2'
Attempting to start `ora.racr1.gsd` on member `racr1`
Attempting to start `ora.racr2.gsd` on member `racr2`
Start of `ora.racr1.gsd` on member `racr1` succeeded.
Start of `ora.racr2.gsd` on member `racr2` succeeded.
CRS-0223: Resource 'ora.PROD.db' has placement error.
CRS-0223: Resource 'ora.racr1.ons' has placement error.
CRS-0223: Resource 'ora.racr2.ons' has placement error.
[root@racr1 bin]# ./crs_s
-bash: ./crs_s: No such file or directory
[root@racr1 bin]# ./crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....D1.inst application ONLINE ONLINE racr1
ora....D2.inst application ONLINE ONLINE racr2
ora.PROD.db application ONLINE ONLINE racr2
ora....SM1.asm application ONLINE ONLINE racr1
ora....R1.lsnr application ONLINE ONLINE racr1
ora.racr1.gsd application ONLINE ONLINE racr1
ora.racr1.ons application ONLINE ONLINE racr1
ora.racr1.vip application ONLINE ONLINE racr1
ora....SM2.asm application ONLINE ONLINE racr2
ora....R2.lsnr application ONLINE ONLINE racr2
ora.racr2.gsd application ONLINE ONLINE racr2
ora.racr2.ons application ONLINE ONLINE racr2
ora.racr2.vip application ONLINE ONLINE racr2
[root@racr1 bin]#
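Comparing two crs_stat listings line by line is tedious. A sketch of a filter that prints only the resources whose State column is not ONLINE, assuming the `crs_stat -t` layout shown above (State in column 4); the function name offline_resources is made up for this example.

```shell
# Read `crs_stat -t` output on stdin and print the name of every resource
# whose State (column 4) is not ONLINE. The header and the dashed separator
# are skipped because they do not start with "ora.".
offline_resources() {
    awk '$1 ~ /^ora\./ && $4 != "ONLINE" { print $1 }'
}
```

`./crs_stat -t | offline_resources` would print nothing for the listing above and all 13 names for the first one. Note that -t abbreviates long resource names, so the names printed are the shortened ones.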
Finally, verify the OCR once more:
[root@racr1 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 2
Total space (kbytes) : 524024
Used space (kbytes) : 3832
Available space (kbytes) : 520192
ID : 673724156
Device/File Name : /dev/raw/raw1
Device/File integrity check succeeded
Device/File not configured
Cluster registry integrity check succeeded
[root@racr1 bin]#
Good, ocrcheck works again. By default, every run of ocrcheck creates a log file named ocrcheck_<pid>.log in the $CRS_HOME/log/<NODENAME>/client directory.