RHEL 7.2 with RemoveIPC=yes crashes Oracle ASM and Oracle database instances
Oracle Database - Standard Edition
Oracle Database - Enterprise Edition
Linux x86-64
Linux x86

DESCRIPTION
On Red Hat Enterprise Linux (RHEL) 7.2, the systemd-logind service introduced a new feature that removes all IPC objects when a user fully logs out.
The feature is controlled by the RemoveIPC option in the /etc/systemd/logind.conf configuration file;
see man logind.conf(5) for details.
The default value for RemoveIPC in RHEL 7.2 is yes. As a result, when the last oracle or grid user disconnects, the OS removes the shared memory segments and semaphores belonging to those users.
Because Oracle ASM and database instances keep the SGA in shared memory segments, removing those segments crashes the Oracle ASM and database instances.
Please refer to Red Hat bug 1264533: https://bugzilla.redhat.com/show_bug.cgi?id=1264533
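As a rough illustration of how to see which value a host is using, the sketch below reads the explicit RemoveIPC setting from a logind.conf-style file. The function name remove_ipc_setting is hypothetical, and the sketch only looks at the single file it is given (it assumes no drop-in directories override the value). When it reports "unset", the compiled-in default applies, which this note states is yes on RHEL 7.2.

```shell
#!/bin/sh
# Print the explicit RemoveIPC value from a logind.conf-style file,
# or "unset" when no uncommented RemoveIPC= line is present.
# remove_ipc_setting is a hypothetical helper name, not a systemd tool.
remove_ipc_setting() {
    conf="$1"
    # Keep only uncommented RemoveIPC= lines; the last one wins.
    value=$(sed -n 's/^[[:space:]]*RemoveIPC[[:space:]]*=[[:space:]]*\([^[:space:]#]*\).*$/\1/p' "$conf" 2>/dev/null | tail -n 1)
    if [ -n "$value" ]; then
        printf '%s\n' "$value"
    else
        printf 'unset\n'
    fi
}
```

For example, remove_ipc_setting /etc/systemd/logind.conf prints no on a host where the workaround from this note is already in place.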

OCCURRENCE
The problem affects all applications that use shared memory segments and semaphores, including Oracle Database; thus both Oracle ASM and database instances are affected.
Oracle Linux 7.2 avoids this problem by explicitly setting RemoveIPC to no in the /etc/systemd/logind.conf configuration file.
However, if /etc/systemd/logind.conf was touched or modified before the upgrade started, the yum update writes the new, correct configuration file (with RemoveIPC=no) as logind.conf.rpmnew,
and if the user retains the original configuration file, the failures described in this note will most likely occur.
To avoid this problem, after the upgrade be sure to edit logind.conf and set RemoveIPC=no. This is documented in the Oracle Linux 7.2 release notes.

SYMPTOMS
1) Installing 11.2 and 12c GI/CRS fails, because ASM crashes towards the end of the installation.
2) Upgrading to 11.2 and 12c GI/CRS fails.
3) After Red Hat Linux is upgraded to 7.2, 11.2 and 12c ASM and database instances crash.
systemd-logind may remove the IPC objects at any time, so the failure patterns can vary greatly. Here are some examples of what the failures may look like:
The most common error is the following, found in the ASM or database alert.log:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
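A quick way to check a log for this pattern is sketched below. The function name has_ipc_removal_signature is hypothetical, and the alert log location is site-specific, so the path is passed in as an argument rather than assumed. The check only looks for the two message strings shown above and does not prove that RemoveIPC was the cause.

```shell
#!/bin/sh
# Return 0 if the given alert log contains both the ORA-27157 error and
# the "Identifier removed" OS failure message shown above, else non-zero.
# has_ipc_removal_signature is a hypothetical helper name.
has_ipc_removal_signature() {
    log="$1"
    grep -q 'ORA-27157' "$log" 2>/dev/null &&
        grep -q 'ORA-27301: OS failure message: Identifier removed' "$log" 2>/dev/null
}
```

It can be used as has_ipc_removal_signature "$ALERT_LOG" && echo "possible IPC removal", where ALERT_LOG is a placeholder variable holding the path to the alert.log of the affected instance.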
The second observed error occurs during installation and upgrade, when asmca fails with the following error:
KFOD-00313: No ASM instances available. CSS group services were successfully initilized by kgxgncin
KFOD-00105: Could not open pfile 'init@.ora'
The third observed error occurred during installation and upgrade:
Creation of ASM password file failed. Following error occurred: Error in Process: /u01/app/12.1.0/grid/bin/orapwd
Enter password for SYS:
OPW-00009: Could not establish connection to Automatic Storage Management instance
2015/11/20 21:38:45 CLSRSC-184: Configuration of ASM failed
2015/11/20 21:38:46 CLSRSC-258: Failed to configure and start ASM
The fourth observed error is the following message, found in the /var/log/messages file around the time that the ASM or database instance crashed:
Nov 20 21:38:43 testc201 kernel: traps: oracle[24861] trap divide error ip:3896db8 sp:7ffef1de3c40 error:0 in oracle[400000+ef57000]

WORKAROUND
1) Set RemoveIPC=no in /etc/systemd/logind.conf
2) Reboot the server or restart systemd-logind as follows:
# systemctl daemon-reload
# systemctl restart systemd-logind

PATCHES
Migrating from Red Hat 7.2 to Oracle Linux 7.2 resolves this problem. If migrating to Oracle Linux 7.2 is not possible, use the workaround above and set RemoveIPC=no in /etc/systemd/logind.conf.

HISTORY
23-Nov-2015 The alert is created
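
Step 1 of the workaround can be scripted; the following is a minimal sketch, not a vendor-supplied procedure. The function name set_remove_ipc_no is hypothetical. It assumes the file is a standard logind.conf whose only section is [Login] and that the file ends with a newline, so an appended line lands in the right section. It saves a copy with a .bak suffix before changing anything.

```shell
#!/bin/sh
# Set RemoveIPC=no in a logind.conf-style file: rewrite any existing
# (possibly commented-out) RemoveIPC line, or append one if none exists.
# set_remove_ipc_no is a hypothetical helper name.
set_remove_ipc_no() {
    conf="$1"
    # Keep a backup copy next to the original file.
    cp -p "$conf" "$conf.bak" || return 1
    if grep -q '^[[:space:]]*#\{0,1\}[[:space:]]*RemoveIPC[[:space:]]*=' "$conf"; then
        # Replace every RemoveIPC line, commented or not, with RemoveIPC=no.
        sed 's/^[[:space:]]*#\{0,1\}[[:space:]]*RemoveIPC[[:space:]]*=.*$/RemoveIPC=no/' "$conf.bak" > "$conf"
    else
        # No RemoveIPC line at all: append one (assumes [Login] is the last section).
        printf 'RemoveIPC=no\n' >> "$conf"
    fi
}
```

Run as root, set_remove_ipc_no /etc/systemd/logind.conf would cover step 1; step 2 (reboot, or the two systemctl commands in the WORKAROUND section) is still required for the change to take effect.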