Wednesday 4 March 2015

EDIT GPNP PROFILE 11G R2 RAC

In 11g R2 RAC, the voting disk can be stored in an ASM diskgroup, but CSSD needs the voting files before ASM is online. At startup, CSSD therefore scans the headers of all devices matched by the "DiscoveryString" attribute in the gpnp profile XML file, which holds the same value as the asm_diskstring parameter of the ASM instance.

If it finds more than half of the total number of voting files, CSSD proceeds with startup; otherwise it cycles, writing error messages to $GRID_HOME/log/<hostname>/cssd/ocssd.log on each loop.
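The majority rule described above can be sketched in a few lines of Python. This is only an illustration of "more than half"; the function name has_voting_quorum is hypothetical and is not part of any Oracle tool.

```python
def has_voting_quorum(found: int, total: int) -> bool:
    """Return True when strictly more than half of the voting files were found.

    Hypothetical helper: it only models the majority rule, not CSSD itself.
    """
    if total <= 0 or found < 0 or found > total:
        raise ValueError("expected 0 <= found <= total and total > 0")
    # Integer comparison avoids rounding problems with found / total > 0.5.
    return 2 * found > total


if __name__ == "__main__":
    for total, found in [(1, 0), (1, 1), (3, 1), (3, 2), (5, 2), (5, 3)]:
        print(f"{found} of {total} voting files found ->", has_voting_quorum(found, total))
```

With a single voting file, as in the external redundancy diskgroup used later in this post, finding zero files means no majority, which is the situation reproduced below.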

To verify this, I set ASM_DISKSTRING to a value that excluded the ASM disk holding the voting disk and tried to restart crs on a node. The CSSD logfile showed that CSSD had trouble identifying its voting files.

Now I had to change the ASM_DISKSTRING value back without ASM running, and ASM cannot start without CSSD. So how do we tell CSSD, which starts well before ASM, to scan the right devices? I used gpnptool to edit the gpnp profile and restore the discovery string to the appropriate value. After that, I could get crs running on my machine.

Overview:
– Move voting disk to a new diskgroup VOTE
– Set ASM_DISKSTRING, using SQL, to all the disks which are members of the other diskgroups (DATA/FRA)
– Stop and restart crs on the node
– Check that HAS services have started but the rest of the services are not up
– Check the ocssd.log: CSSD scans all the disks which are part of the discovery string but does not find the voting disk
– Edit the gpnp profile to modify the discovery string for asm
– Try to restart crs
– Check that all the daemons have started and cluster services are up

Implementation:
– Create a diskgroup VOTE with external redundancy to be used to store the voting disk

– Move voting disk to the diskgroup vote – fails because
    – diskgroup is not mounted on all the nodes – mount it
    – diskgroup compatibility < 11.2.0.0 – modify it
[root@host01 ~]# crsctl replace votedisk +vote
Failed to create voting files on disk group vote.
Change to configuration failed, but was successfully rolled back.
CRS-4000: Command Replace failed, or completed with errors.
– Mount the vote diskgroup on all the nodes
– Modify diskgroup compatibility to 11.2.0.0
– Move voting disk to the diskgroup vote – succeeds

[root@host01 ~]# crsctl replace votedisk +vote
Successful addition of voting disk 443f1c60e16f4fa5bfbfeaae6b2f919d.
Successful deletion of voting disk 3d9da0d16baa4f10bf4e4b9b4aa688c6.
Successful deletion of voting disk 369621097c034f6dbf29a8cc97dc4bbc.
Successful deletion of voting disk 407b5d6588ea4f6fbf1503f7f6cc2951.
Successfully replaced voting disk group with +vote.
CRS-4266: Voting file(s) successfully replaced

– Check that the voting disk has been moved to the vote diskgroup
[root@host01 ~]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   443f1c60e16f4fa5bfbfeaae6b2f919d (ORCL:ASMDISK014) [VOTE]
Located 1 voting disk(s).

– Find out the disks in the other diskgroups – DATA and FRA
[grid@host01 ~]$ asmcmd lsdsk -G data
Path
ORCL:ASMDISK01
ORCL:ASMDISK010
ORCL:ASMDISK011
ORCL:ASMDISK012

[grid@host01 ~]$ asmcmd lsdsk -G fra
Path
ORCL:ASMDISK02
ORCL:ASMDISK03

– Set ASM_DISKSTRING to all the disks which are members of DATA/FRA diskgroups
SQL> sho parameter asm_diskstring
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskstring                       string
SQL> alter system set asm_diskstring='ORCL:ASMDISK01','ORCL:ASMDISK010','ORCL:ASMDISK011','ORCL:ASMDISK012','ORCL:ASMDISK02','ORCL:ASMDISK03';

– Check that the discovery string in the gpnp profile points to the disks as specified

[root@host01 ~]# cd /u01/app/11.2.0/grid/gpnp/host01/profiles/peer/
[root@host01 peer]# vi profile.xml

DiscoveryString="ORCL:ASMDISK01,ORCL:ASMDISK010,ORCL:ASMDISK011,ORCL:ASMDISK012,ORCL:ASMDISK02,ORCL:ASMDISK03"
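Instead of opening the profile in vi, the attribute can also be pulled out with a small script. The sketch below is a minimal example under stated assumptions: SAMPLE_PROFILE is a simplified, hypothetical fragment (a real profile.xml contains many more elements and a signature), and read_discovery_string is a name made up for illustration.

```python
import re

# Hypothetical, heavily simplified fragment; only the attribute matters here.
SAMPLE_PROFILE = (
    '<orcl:ASM-Profile id="asm" '
    'DiscoveryString="ORCL:ASMDISK01,ORCL:ASMDISK010,ORCL:ASMDISK011,'
    'ORCL:ASMDISK012,ORCL:ASMDISK02,ORCL:ASMDISK03" '
    'SPFile=""/>'
)


def read_discovery_string(profile_text):
    """Return the DiscoveryString patterns as a list, or None if the attribute is absent."""
    match = re.search(r'DiscoveryString="([^"]*)"', profile_text)
    if match is None:
        return None
    # The attribute is a comma-separated list; drop empty entries.
    return [part for part in match.group(1).split(",") if part]


if __name__ == "__main__":
    for pattern in read_discovery_string(SAMPLE_PROFILE):
        print(pattern)
```

To run it against the real file, read profile.xml into a string and pass that string in place of SAMPLE_PROFILE. A plain regular expression is used here so the sketch does not depend on how the XML namespaces are declared.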

– Stop and restart crs on the node
[root@host01 ~]# crsctl stop crs
[root@host01 ~]# crsctl start crs

– Check that HAS has started but the rest of the services are not up
[root@host01 ~]# crsctl check has
CRS-4638: Oracle High Availability Services is online
[root@host01 ~]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager

[root@host01 ~]# crsctl check css
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon

[root@host01 ~]# crsctl stat res -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.

– Check the ocssd.log: CSSD scans all the disks which are part of the discovery string but does not find the voting disk
[root@host01 ~]# vi $ORACLE_HOME/log/host01/cssd/ocssd.log
2013-01-21 12:55:14.759: [    CLSF][2985733008]Read ASM header off dev:ORCL:ASMDISK01:0:0
2013-01-21 12:55:14.759: [   SKGFD][2985733008]Lib :ASM:/opt/oracle/extapi/32/asm/orcl/1/libasm.so: closing handle 0x91d0410 for disk :ORCL:ASMDISK01:
2013-01-21 12:55:14.759: [    CLSF][2985733008]Read ASM header off dev:ORCL:ASMDISK010:0:0
2013-01-21 12:55:14.759: [   SKGFD][2985733008]Lib :ASM:/opt/oracle/extapi/32/asm/orcl/1/libasm.so: closing handle 0x91d0c10 for disk :ORCL:ASMDISK010:
2013-01-21 12:55:14.759: [    CLSF][2985733008]Read ASM header off dev:ORCL:ASMDISK011:0:0
2013-01-21 12:55:14.759: [   SKGFD][2985733008]Lib :ASM:/opt/oracle/extapi/32/asm/orcl/1/libasm.so: closing handle 0x91d1508 for disk :ORCL:ASMDISK011:
2013-01-21 12:55:14.759: [    CLSF][2985733008]Read ASM header off dev:ORCL:ASMDISK012:0:0
2013-01-21 12:55:14.759: [   SKGFD][2985733008]Lib :ASM:/opt/oracle/extapi/32/asm/orcl/1/libasm.so: closing handle 0x91d1e00 for disk :ORCL:ASMDISK012:
2013-01-21 12:55:14.760: [    CLSF][2985733008]Read ASM header off dev:ORCL:ASMDISK02:0:0
2013-01-21 12:55:14.760: [   SKGFD][2985733008]Lib :ASM:/opt/oracle/extapi/32/asm/orcl/1/libasm.so: closing handle 0x91d26f8 for disk :ORCL:ASMDISK02:
2013-01-21 12:55:14.760: [    CLSF][2985733008]Read ASM header off dev:ORCL:ASMDISK03:0:0
2013-01-21 12:55:14.760: [   SKGFD][2985733008]Lib :ASM:/opt/oracle/extapi/32/asm/orcl/1/libasm.so: closing handle 0x91d2ff0 for disk :ORCL:ASMDISK03:
2013-01-21 12:55:14.760: [    CSSD][2985733008]clssnmvDiskVerify: file is not a voting file, cannot recognize on-disk signature for a voting
.
.
.
2013-01-21 12:55:14.760: [    CSSD][2985733008]clssnmvDiskVerify: Successful discovery of 0 disks
2013-01-21 12:55:14.760: [    CSSD][2985733008]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2013-01-21 12:55:14.760: [    CSSD][2985733008]clssnmvFindInitialConfigs: No voting files found
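The two lines that matter in this log are the discovery count and the "No voting files found" message. As a rough sketch, they can be picked out of a long ocssd.log with a short script. The message texts are taken from the excerpt above; the function name summarize_discovery is hypothetical, and other Clusterware versions may word these messages differently.

```python
import re

# Message text as it appears in the log excerpt above.
DISCOVERY_RE = re.compile(r"clssnmvDiskVerify: Successful discovery of (\d+) disks")
NO_VOTING_FILES = "clssnmvFindInitialConfigs: No voting files found"


def summarize_discovery(lines):
    """Return (last reported disk count or None, whether 'No voting files found' was logged)."""
    discovered = None
    no_voting_files = False
    for line in lines:
        match = DISCOVERY_RE.search(line)
        if match:
            discovered = int(match.group(1))
        if NO_VOTING_FILES in line:
            no_voting_files = True
    return discovered, no_voting_files


SAMPLE_LOG = [
    "2013-01-21 12:55:14.760: [    CSSD][2985733008]clssnmvDiskVerify: Successful discovery of 0 disks",
    "2013-01-21 12:55:14.760: [    CSSD][2985733008]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery",
    "2013-01-21 12:55:14.760: [    CSSD][2985733008]clssnmvFindInitialConfigs: No voting files found",
]

if __name__ == "__main__":
    print(summarize_discovery(SAMPLE_LOG))
```

For a real log, pass an open file object instead of SAMPLE_LOG, since the function only iterates over lines.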

– Copy the existing gpnp profile to profile.bak  and edit the copy to modify discovery string  for asm
[root@host01 ~]# cd /u01/app/11.2.0/grid/gpnp/host01/profiles/peer/
[root@host01 peer]# cp profile.xml profile.bak

– remove the oracle signature from the file –
[root@host01 peer]# gpnptool unsign -p=profile.bak

Warning: some command line parameters were defaulted. Resulting command line:
         /u01/app/11.2.0/grid/bin/gpnptool.bin unsign -p=profile.bak -o-

– Change the DiscoveryString itself
[root@host01 peer]# gpnptool edit -asm_dis='ORCL:*' -p=profile.bak -o=profile.bak -ovr
Resulting profile written to "profile.bak".
Success.

– Sign the profile xml file with the wallet (note: the path is only the directory containing the wallet, NOT the wallet file itself)
[root@host01 peer]# gpnptool sign -p=profile.bak -w=file:/u01/app/11.2.0/grid/gpnp/host01/wallets/peer/ -o=profile.new
Resulting profile written to "profile.new".
Success.

– Replace the original profile.xml with the newly signed profile
[root@host01 peer]# mv profile.new profile.xml

– Check that the discovery string has been modified
[root@host01 peer]# vi profile.xml
DiscoveryString="ORCL:*"
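Why the new value works can be illustrated by matching disk names against both discovery strings. The sketch below approximates discovery with Python's shell-style wildcard matching (fnmatch); that is an assumption made for illustration, since the real matching is done by CSSD and ASMLib, and the helper name is_discoverable is hypothetical.

```python
from fnmatch import fnmatchcase


def is_discoverable(disk, discovery_string):
    """Return True if the disk name matches any comma-separated pattern (shell-style wildcards)."""
    patterns = [p.strip() for p in discovery_string.split(",") if p.strip()]
    return any(fnmatchcase(disk, pattern) for pattern in patterns)


# Values taken from this post: the voting disk and the two discovery strings.
VOTING_DISK = "ORCL:ASMDISK014"
OLD_STRING = ("ORCL:ASMDISK01,ORCL:ASMDISK010,ORCL:ASMDISK011,"
              "ORCL:ASMDISK012,ORCL:ASMDISK02,ORCL:ASMDISK03")
NEW_STRING = "ORCL:*"

if __name__ == "__main__":
    print("old string finds voting disk:", is_discoverable(VOTING_DISK, OLD_STRING))
    print("new string finds voting disk:", is_discoverable(VOTING_DISK, NEW_STRING))
```

The old string lists six disks by exact name, none of which is ORCL:ASMDISK014, while ORCL:* covers every ASMLib disk including the one in the VOTE diskgroup.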

– Kill any clusterware daemons that are still running, then try to restart crs
[root@host01 peer]# ps -ef |grep d.bin
(kill all the processes listed)
[root@host01 peer]# crsctl start crs

– Check that all the daemons have started and cluster services are up
[root@host01 peer]# ps -ef |grep d.bin
[root@host01 peer]# crsctl stat res -t

Conclusion:
To access the voting disk, CSSD reads the gpnp profile to find where the voting disk is located: in asm.
The asm disks matching the discovery string in the gpnp profile are then scanned to search for the voting disk.
– Note: the GPNP profile on the other nodes also contains the erroneous discovery string. Hence copy the profile from the current node to the other nodes of the cluster.
