Oracle 12c RAC installation hits CLSRSC-507: The root script cannot proceed on this node

Published by: 风哥 Category: ITPUX技术网 Updated: 2022-02-12 Views: 1334

During Oracle RAC 12c installation, the error CLSRSC-507: The root script cannot proceed on this node is encountered (Doc ID 1919825.1)

APPLIES TO: Oracle Database - Enterprise Edition - Version 12.1.0.2 and later
Information in this document applies to any platform.

PURPOSE

The note lists known issues regarding the following error:

CLSRSC-507: The root script cannot proceed on this node because either the first-node operations have not completed on node or there was an error in obtaining the status of the first-node operations.

DETAILS

Case 1: root script didn't succeed on first node

The Grid Infrastructure root script (root.sh or rootupgrade.sh) must complete successfully on node1, the first node, before it can be run on the other nodes; the first node is the one on which runInstaller/config.sh was run. This behavior is new in 12.1.0.2. If this is the case, complete the root script on node1 before running it on the other nodes.
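As a quick check (a minimal sketch; <GRID_HOME> and the exact log file name are placeholders for your environment), you can confirm on node1 that the root script reported success before moving on:

# On node1, as root: confirm the root script completed successfully
# (paths are illustrative; substitute your actual Grid home and log name)
grep "CLSRSC-325" <GRID_HOME>/cfgtoollogs/crsconfig/rootcrs_*.log
# Expected on success:
# CLSRSC-325: Configure Oracle Grid Infrastructure for a Cluster ... succeeded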
Case 2: root script completed on first node but other nodes fail to obtain the status due to an ocrdump issue

In this case, it's confirmed that the root script finished on node1 (/cfgtoollogs/crsconfig/rootcrs__.log):

2014-08-22 10:23:10: Invoking "/opt/ogrid/12.1.0.2/bin/cluutil -exec -ocrsetval -key SYSTEM.rootcrs.checkpoints.firstnode -value SUCCESS"
2014-08-22 10:23:10: trace file=/opt/oracle/crsdata/inari/crsconfig/cluutil0.log
2014-08-22 10:23:10: Executing cmd: /opt/ogrid/12.1.0.2/bin/cluutil -exec -ocrsetval -key SYSTEM.rootcrs.checkpoints.firstnode -value SUCCESS
2014-08-22 10:23:10: Succeeded in writing the key pair (SYSTEM.rootcrs.checkpoints.firstnode:SUCCESS) to OCR
2014-08-22 10:23:10: Executing cmd: /opt/ogrid/12.1.0.2/bin/clsecho -p has -f clsrsc -m 325
2014-08-22 10:23:10: Command output:
> CLSRSC-325: Configure Oracle Grid Infrastructure for a Cluster ... succeeded
>End Command output
2014-08-22 10:23:10: CLSRSC-325: Configure Oracle Grid Infrastructure for a Cluster ... succeeded
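The same checkpoint can also be read back directly with ocrdump (a sketch; run as root on node1, with <GRID_HOME> standing in for your Grid home):

# Dump only the first-node checkpoint key to stdout
<GRID_HOME>/bin/ocrdump -stdout -keyname SYSTEM.rootcrs.checkpoints.firstnode
# After a successful first-node run, the key's value should be SUCCESS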
And the root script fails on the other nodes because ocrdump failed (/cfgtoollogs/crsconfig/rootcrs__.log):

2014-09-04 13:45:34: ASM_DISKS=ORCL:OCR01,ORCL:OCR02,ORCL:OCR03
....
2014-09-04 13:46:04: Check the existence of global ckpt 'checkpoints.firstnode'
2014-09-04 13:46:04: setting ORAASM_UPGRADE to 1
2014-09-04 13:46:04: Invoking "/product/app/12.1.0.2/grid/bin/cluutil -exec -keyexists -key checkpoints.firstnode"
2014-09-04 13:46:04: trace file=/product/app/grid/crsdata/sipr0-db04/crsconfig/cluutil8.log
2014-09-04 13:46:04: Running as user grid: /product/app/12.1.0.2/grid/bin/cluutil -exec -keyexists -key checkpoints.firstnode
2014-09-04 13:46:04: s_run_as_user2: Running /bin/su grid -c ' echo CLSRSC_START; /product/app/12.1.0.2/grid/bin/cluutil -exec -keyexists -key checkpoints.firstnode '
2014-09-04 13:46:05: Removing file /tmp/fileRiu5NI
2014-09-04 13:46:05: Successfully removed file: /tmp/fileRiu5NI
2014-09-04 13:46:05: pipe exit code: 256
2014-09-04 13:46:05: /bin/su exited with rc=1
2014-09-04 13:46:05: oracle.ops.mgmt.rawdevice.OCRException: PROC-32: Cluster Ready Services on the local node is not running Messaging error [gipcretConnectionRefused] [29]
2014-09-04 13:46:05: Cannot get OCR key with CLUUTIL, try using OCRDUMP.
2014-09-04 13:46:05: Check OCR key using ocrdump
2014-09-04 13:46:22: ocrdump output: PROT-302: Failed to initialize ocrdump
2014-09-04 13:46:22: The key pair with keyname: SYSTEM.rootcrs.checkpoints.firstnode does not exist in OCR.
2014-09-04 13:46:22: Checking a remote host sipr0-db03 for reachability...

Case 2.1: ocrdump fails with errors AMDU-00201 and AMDU-00200 (/crs//crs/trace/ocrdump_.trc):

2014-09-04 13:46:14.044274 : OCRASM: proprasmo: ASM instance is down. Proceed to open the file in dirty mode.
CLWAL: clsw_Initialize: Error [32] from procr_init_ext
CLWAL: clsw_Initialize: Error [PROCL-32: Oracle High Availability Services on the local node is not running Messaging error [gipcretConnectionRefused] [29]] from procr_init_ext
2014-09-04 13:46:14.050831 : GPNP: clsgpnpkww_initclswcx: [at clsgpnpkww.c:351] Result: (56) CLSGPNP_OCR_INIT. (:GPNP01201:)Failed to init CLSW-OLR context. CLSW Error (3): CLSW-3: Error in the cluster registry (OCR) layer. [32] [PROCL-32: Oracle High Availability Services on the local node is not running Messaging error [gipcretConnectionRefused] [29]]
2014-09-04 13:46:14.093544 : OCRASM: proprasmo: Error [13] in opening the GPNP profile. Try to get offline profile
2014-09-04 13:46:16.210708 : OCRRAW: kgfo_kge2slos error stack at kgfolclcpi1: AMDU-00200: Unable to read [32768] bytes from Disk N0050 at offset [140737488355328]
AMDU-00201: Disk N0050: '/dev/sdg'
AMDU-00200: Unable to read [32768] bytes from Disk N0049 at offset [140737488355328]
AMDU-00201: Disk N0049: '/dev/sdf'
AMDU-00200: Unable to read [32768] bytes from Disk N0048 at offset [140737488355328]
AMDU-00201: Disk N0048: '/dev/sde'
AMDU-00200: Unable to read [32768] bytes from Disk N0035 at offset [140737488355328]
AMDU-00201: Disk N0035: '/dev/sdaw'
AMDU-00200: Unable to read [32768] bytes from Disk N0024 at offset [140737488355328]
AMDU-00201: Disk N0024: '/dev/sdaq'
....
2014-09-04 13:46:16.212934 : OCRASM: proprasmo: Failed to open file in dirty mode
2014-09-04 13:46:16.212964 : OCRASM: proprasmo: dgname is [OCRVOTE] : discoverystring []
2014-09-04 13:46:16.212990 : OCRASM: proprasmo: Error in open/create file in dg [OCRVOTE]
OCRASM: SLOS : SLOS: cat=8, opn=kgfolclcpi1, dep=200, loc=kgfokge
2014-09-04 13:46:16.213075 : OCRASM: ASM Error Stack :
....
2014-09-04 13:46:22.690905 : OCRASM: proprasmo: kgfoCheckMount returned [7]
2014-09-04 13:46:22.690933 : OCRASM: proprasmo: The ASM instance is down
2014-09-04 13:46:22.692150 : OCRRAW: proprioo: Failed to open [+OCRVOTE/sipr0-dbhv1/OCRFILE/registry.255.857389203]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.
2014-09-04 13:46:22.692204 : OCRRAW: proprioo: No OCR/OLR devices are usable
2014-09-04 13:46:22.692239 : OCRRAW: proprinit: Could not open raw device
2014-09-04 13:46:22.692561 : default: a_init:7!: Backend init unsuccessful : [26]
2014-09-04 13:46:22.692777 : OCRDUMP: Failed to initailized OCR context. Error [PROC-26: Error while accessing the physical storage
] [26].
2014-09-04 13:46:22.692822 : OCRDUMP: Failed to initialize ocrdump stage 2
2014-09-04 13:46:22.692864 : OCRDUMP: Exiting [status=failed]...

Solution: Apply patch 18456643 (https://support.oracle.com/epmos/faces/ui/patch/PatchDetail.jspx?parent=DOCUMENT&sourceId=1919825.1&patchId=18456643), then re-run the root script.
This case is the one most commonly encountered. The fix is to completely remove and deinstall Oracle, then reinstall as follows:
1. Install the Grid clusterware normally.
2. When prompted to run the root scripts, do not run them on either node.
3. Install patch 18456643 on both nodes (see the opatch sketch after this list).
4. Run root.sh on both nodes in order; the 12c RAC clusterware installation then completes successfully.
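A minimal sketch of step 3 (the zip file name and staging directory are illustrative, and the patch README remains the authoritative source; OPatch must meet the patch's minimum version):

# On each node, as the Grid software owner, before running root.sh
unzip -q p18456643_121020_<platform>.zip -d /tmp   # illustrative file name
cd /tmp/18456643
<GRID_HOME>/OPatch/opatch apply                    # follow the patch README if it differs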

Case 2.2: ocrdump fails with AMDU-00210, AMDU-00205, AMDU-00201, AMDU-00407 asmlib errors asm_close/asm_open (/crs//crs/trace/ocrdump_.trc):

OCRASM: proprasmo: ASM instance is down. Proceed to open the file in dirty mode.
2014-09-09 13:52:04.131609 : OCRRAW: kgfo_kge2slos error stack at kgfolclcpi1: AMDU-00210: No disks found in diskgroup CRSGRP
AMDU-00210: No disks found in diskgroup CRSGRP
AMDU-00205: Disk N0033 open failed during deep discovery.
AMDU-00201: Disk N0033: 'ORCL:REDOA'
AMDU-00407: asmlib error!! function = [asm_close], error = [0], mesg = [Invalid argument]
AMDU-00407: asmlib error!! function = [asm_open], error = [0], mesg = [Operation not permitted]
....
2014-09-09 13:52:04.131691 : OCRRAW: kgfoOpenDirty: dg=CRSGRP diskstring= filename=/opt/oracle/crsdata/drcsvr713/output/tmp_amdu_ocr_CRSGRP_09_09_2014_13_52_04
....
2014-09-09 13:52:04.131756 : OCRRAW: Category: 8
2014-09-09 13:52:04.131767 : OCRRAW: DepInfo: 210
....
OCRRAW: proprioo: No OCR/OLR devices are usable
OCRRAW: proprinit: Could not open raw device
default: a_init:7!: Backend init unsuccessful : [26]
OCRDUMP: Failed to initailized OCR context. Error [PROC-26: Error while accessing the physical storage] [26].
OCRDUMP: Failed to initialize ocrdump stage 2
OCRDUMP: Exiting [status=failed]...

Solution: The cause is that ASMLIB is used but not properly configured, as confirmed by the output of the following commands on all nodes:

/etc/init.d/oracleasm listdisks
/etc/init.d/oracleasm scandisks
/etc/init.d/oracleasm listdisks
/etc/init.d/oracleasm listdisks | xargs /etc/init.d/oracleasm querydisk -d
/etc/init.d/oracleasm status
/usr/sbin/oracleasm configure
ls -l /dev/oracleasm/disks/*
rpm -qa | grep oracleasm
uname -a

It's recommended to use AFD (ASM Filter Driver) instead of ASMLIB, but if ASMLIB must be used, fix the misconfiguration, then re-run the root script.
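If ASMLIB must stay, a typical reconfiguration sequence looks like this (a sketch only; the answers to the configure prompts depend on your owner/group setup):

# On each affected node, as root
/usr/sbin/oracleasm configure -i    # re-answer the owner/group/boot-scan prompts
/usr/sbin/oracleasm init            # load the kernel driver and mount /dev/oracleasm
/usr/sbin/oracleasm scandisks       # re-scan for labeled disks
/usr/sbin/oracleasm listdisks       # the OCR/voting disks should now all be listed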

Case 2.3: ocrdump fails as amdu core dumped (/crs//crs/trace/ocrdump_.trc):

2014-08-27 14:34:33.077433 : OCRRAW: kgfo_kge2slos error stack at kgfolclcpi1: AMDU-00210: No disks found in diskgroup QUORUM
AMDU-00210: No disks found in diskgroup QUORUM
....
2014-08-27 14:34:39.262032 : OCRASM: proprasmo: kgfoCheckMount returned [7]
2014-08-27 14:34:39.262041 : OCRASM: proprasmo: The ASM instance is down
2014-08-27 14:34:39.262521 : OCRRAW: proprioo: Failed to open [+QUORUM/wrac-cl-tor/OCRFILE/registry.255.856261165]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.
2014-08-27 14:34:39.262540 : OCRRAW: proprioo: No OCR/OLR devices are usable
2014-08-27 14:34:39.262552 : OCRRAW: proprinit: Could not open raw device
2014-08-27 14:34:39.262668 : default: a_init:7!: Backend init unsuccessful : [26]
2014-08-27 14:34:39.262743 : OCRDUMP: Failed to initailized OCR context. Error [PROC-26: Error while accessing the physical storage
] [26].
2014-08-27 14:34:39.262760 : OCRDUMP: Failed to initialize ocrdump stage 2

The amdu command core dumps:

$ amdu -diskstring 'ORCL:*'
amdu_2014_09_09_14_35_43/
amdu: ossdebug.c:1136: ossdebug_init_diag: Assertion `0' failed.
Aborted (core dumped)

Solution: At the time of this writing, the issue is still being worked in bug 19592048 (https://support.oracle.com/epmos/faces/BugDisplay?parent=DOCUMENT&sourceId=1919825.1&id=19592048); engage Oracle Support for further help.

Case 2.4: same disk name points to different storage on different nodes (/crs//crs/trace/ocrdump_.trc):

2014-09-10 13:12:53.429460 : OCRASM: proprasmo: Error [13] in opening the GPNP profile. Try to get offline profile
2014-09-10 13:12:53.435300 : OCRRAW: kgfo_kge2slos error stack at kgfolclcpi1: AMDU-00210: No disks found in diskgroup DATA01
AMDU-00210: No disks found in diskgroup DATA01

amdu command output on node1:

Disk Path: /dev/asm-data001
Unique Disk ID:
Disk Label:
Physical Sector Size: 512 bytes
Disk Size: 409600 megabytes
Group Name: DATA01
Disk Name: DATA01_0000
Failure Group Name: DATA01_0000

amdu command output on node2:

Disk Path: /dev/asm-data001
Unique Disk ID:
Disk Label:
Physical Sector Size: 512 bytes
Disk Size: 409600 megabytes
** NOT A VALID ASM DISK HEADER. BAD VALUE IN FIELD blksize_kfdhdb **

Solution: Engage the system administrator to fix the disk setup issue.
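One way to confirm the mismatch (a sketch; kfed ships with the Grid home, and /dev/asm-data001 is the device from the traces above) is to read the ASM disk header on every node and compare:

# Run on each node and compare the output: the same device path must
# show the same diskgroup/disk name cluster-wide
<GRID_HOME>/bin/kfed read /dev/asm-data001 | grep -E 'kfdhdb.grpname|kfdhdb.dskname'
# A healthy header shows grpname DATA01 / dskname DATA01_0000 on every node;
# the bad node reports an invalid header, matching the amdu output above.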

Case 2.5: the same storage sub-system is shared by different clusters and the same diskgroup name exists in more than one cluster (/crs//crs/trace/ocrdump_.trc):

2015-07-17 16:57:00.532160 : OCRRAW: AMDU-00211: Inconsistent disks in diskgroup OCR

Solution: The issue was investigated in bug 21469989 (https://support.oracle.com/epmos/faces/BugDisplay?parent=DOCUMENT&sourceId=1919825.1&id=21469989). The cause is that multiple clusters have the same diskgroup name and see the same shared disks; the workaround is to change the diskgroup name for the new cluster. For example, cluster1 and cluster2 both see the same physical disks /dev/mapper/disk1-10; disk1-5 are allocated to cluster1 and disk6-10 to cluster2, yet both clusters try to use the same diskgroup name dgsys. Ref: BUG 21469989 - CLSRSC-507 ROOT.SH FAILING ON NODE 2 WHEN CHECKING GLOBAL CHECKPOINT
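To see which disks claim membership in the colliding diskgroup name, amdu can be pointed at the shared disks (a sketch; the disk string is illustrative):

# As root on the failing node: report every disk whose header says it
# belongs to diskgroup OCR, across all paths visible to root
<GRID_HOME>/bin/amdu -diskstring '/dev/mapper/*' -dump OCR -noimage
# Disks from two different clusters appearing under one diskgroup name
# is what triggers AMDU-00211.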
Case 2.6: root user sees the same physical disk multiple times because of different paths (/crs//crs/trace/ocrdump_.trc):

2015-07-17 16:57:00.532160 : OCRRAW: AMDU-00211: Inconsistent disks in diskgroup OCR

Solution: Ensure the disk discovery string is set correctly and the root user sees each physical disk only once. Ref: BUG 21164225 - OCRDUMP FAILS WITH AMDU-211 ONLY ON NORMAL REDUNDANCY
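A quick way to spot one physical disk appearing under several paths (a sketch; the scsi_id location varies by distribution, e.g. /usr/lib/udev/scsi_id on some releases) is to print each candidate device's unique SCSI ID and look for duplicates:

# As root: duplicate IDs in the output mean one physical disk is visible twice
for d in /dev/sd[a-z]; do
  printf '%s %s\n' "$d" "$(/lib/udev/scsi_id -g -u -d "$d")"
done | sort -k2 | uniq -f1 -D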

Case 3: root script completed on first node but other nodes fail to obtain the status as ocrdump wasn't executed

In this case, it's confirmed that the root script finished on node1 (/cfgtoollogs/crsconfig/rootcrs__.log):

CLSRSC-325: Configure Oracle Grid Infrastructure for a Cluster ... succeeded
And the root script fails on the other nodes because ocrdump wasn't executed:

2014-08-28 17:53:55: Check the existence of global ckpt 'checkpoints.firstnode'
2014-08-28 17:53:55: setting ORAASM_UPGRADE to 1
2014-08-28 17:53:55: Invoking "/opt/12.1.0.2/grid/bin/cluutil -exec -keyexists -key checkpoints.firstnode"
2014-08-28 17:53:55: trace file=/opt/oracle/crsdata/racnode2/crsconfig/cluutil3.log
2014-08-28 17:53:55: Running as user oracle: /opt/12.1.0.2/grid/bin/cluutil -exec -keyexists -key checkpoints.firstnode
2014-08-28 17:53:55: s_run_as_user2: Running /bin/su oracle -c ' echo CLSRSC_START; /opt/12.1.0.2/grid/bin/cluutil -exec -keyexists -key checkpoints.firstnode '
2014-08-28 17:53:56: Removing file /tmp/fileZCubj2
2014-08-28 17:53:56: Successfully removed file: /tmp/fileZCubj2
2014-08-28 17:53:56: pipe exit code: 0 ====>>>> cluutil failed with PROC-32 but exit code 0
2014-08-28 17:53:56: /bin/su successfully executed
2014-08-28 17:53:56: oracle.ops.mgmt.rawdevice.OCRException: PROC-32: Cluster Ready Services on the local node is not running Messaging error [gipcretConnectionRefused] [29]
2014-08-28 17:53:56: Checking a remote host dblab01 for reachability...
....
2014-08-28 17:53:57: CLSRSC-507: The root script cannot proceed on this node dblab02 because either the first-node operations have not completed on node dblab01 or there was an error in obtaining the status of the first-node operations.

The cluutil trace /crsdata/racnode2/crsconfig/cluutil3.log confirms it failed:

[main] [ 2014-08-29 17:40:46.750 EDT ] [OCR.:278] ocr Error code = 32
[main] [ 2014-08-29 17:40:46.750 EDT ] [ClusterExecUtil.executeCmd:168] Exception caught: PROC-32: Cluster Ready Services on the local node is not running Messaging error [gipcretConnectionRefused] [29]
[main] [ 2014-08-29 17:40:46.750 EDT ] [ClusterUtil.main:236] ClusterUtil.execute rc: 1

The issue was investigated in bug 19570598 (https://support.oracle.com/epmos/faces/BugDisplay?parent=DOCUMENT&sourceId=1919825.1&id=19570598): BUG 19570598 - ROOT.SH FAILS ON NODE2 WHILE CHECKING GLOBAL FIRST NODE CHECKPOINT
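Because of the misleading exit code, it's worth inspecting the cluutil traces directly on the failing node (a sketch; the crsdata path and trace file numbers vary by install):

# As the Grid owner: find cluutil traces containing the swallowed PROC-32
grep -l "PROC-32" <ORACLE_BASE>/crsdata/*/crsconfig/cluutil*.log
# Any hit means cluutil actually failed even though /bin/su returned 0.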
