Tuesday 27 October 2015

How to clean up a huge crf / CHM repository file in the Grid Infrastructure home (11.2.0.2, 11.2.0.3, 11.2.0.4)

We got an alert that the RDBMS home file system was more than 90% full, and investigation revealed that the crf directory under the Grid Infrastructure home had grown to more than 26G.

Once you know it is $ORACLE_HOME/crf, you can easily guess that these are the repository files of the Oracle Cluster Health Monitor (CHM).

This issue was observed in our two-node cluster running Oracle Grid Infrastructure 11g 11.2.0.4.0 on Linux.


---  To confirm / cross-check the repository path.

$ORACLE_HOME/bin/oclumon manage -get reppath

CHM Repository Path = /cluster/app/11.2.0/grid/crf/db/cmsdb02

Done
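
If you want the path in a variable for later steps, the line above can be picked out of the oclumon output. This is only a minimal sketch: the function name parse_reppath is made up for illustration, and it assumes the output line looks exactly like the one shown in this post.

```shell
# Sketch: extract the repository path from `oclumon manage -get reppath`.
# parse_reppath is a hypothetical helper name, not an Oracle tool.
# It reads the command output on stdin and prints only the path.
parse_reppath() {
    # Keep only the line that starts with the label, and strip the label
    sed -n 's/^CHM Repository Path = //p'
}

# Typical use (assumes oclumon is in $ORACLE_HOME/bin):
#   REPO_DIR=$($ORACLE_HOME/bin/oclumon manage -get reppath | parse_reppath)
```

If your version prints the label differently, adjust the pattern accordingly.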


[root@cmsdb02 cmsdb02]# cd /cluster/app/11.2.0/grid/crf/db/cmsdb02
[root@cmsdb02 cmsdb02]# ls -lrt
total 25830096
-rw-r----- 1 root root        8192 Sep  2 07:07 repdhosts.bdb
-rw-r----- 1 root root       24576 Sep  2 07:07 __db.001
-rw-r----- 1 root root        8192 Sep  2 07:10 crfconn.bdb
-rw-r--r-- 1 root root   120000000 Sep  2 07:11 cmsdb02.ldb
-rw-r----- 1 root root    16777216 Oct 27 13:11 log.0000030408
-rw-r----- 1 root root    16777216 Oct 27 13:42 log.0000030409
-rw-r----- 1 root root      401408 Oct 27 13:42 __db.002
-rw-r----- 1 root root   329068544 Oct 27 13:42 crfts.bdb
-rw-r----- 1 root root   508063744 Oct 27 13:42 crfloclts.bdb
-rw-r----- 1 root root   399507456 Oct 27 13:42 crfhosts.bdb
-rw-r----- 1 root root   404766720 Oct 27 13:42 crfcpu.bdb
-rw-r----- 1 root root 24340660224 Oct 27 13:42 crfclust.bdb
-rw-r----- 1 root root   407449600 Oct 27 13:42 crfalert.bdb
-rw-r----- 1 root root       57344 Oct 27 13:43 __db.006
-rw-r----- 1 root root     1187840 Oct 27 13:43 __db.005
-rw-r----- 1 root root     2162688 Oct 27 13:43 __db.004
-rw-r----- 1 root root     2629632 Oct 27 13:43 __db.003
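
In the listing above crfclust.bdb is clearly the culprit. To see the same thing at a glance, something like the following could be used. It is a rough sketch, chm_repo_usage is a name invented here, and it only looks at the *.bdb files in the directory you pass in.

```shell
# Sketch: list the *.bdb files of a CHM repository directory, largest first.
# chm_repo_usage is a hypothetical helper; pass the repository path as $1.
# Output format: "<size in bytes> <file name>", one file per line.
chm_repo_usage() {
    repo_dir="$1"
    for f in "$repo_dir"/*.bdb; do
        # Skip the unexpanded pattern when no .bdb file exists
        [ -f "$f" ] || continue
        printf '%s %s\n' "$(wc -c < "$f" | tr -d ' ')" "${f##*/}"
    done | sort -rn
}

# Example (run as root, since the files are readable by root only):
#   chm_repo_usage /cluster/app/11.2.0/grid/crf/db/cmsdb02 | head -3
```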



Bug/Fix :

Checking on Metalink confirmed it's a bug.

Bug 12711827 : HUGE CRFCLUST.BDB WHICH CAN NOT BE REDUCED IN FILESYSTEM  11.2.0.2
Bug 14479330 : HUGE SIZE OF CRFCLUST.BDB - ORACLE 11.2.0.3

It looks like the base bug was fixed in 11.2.0.3 by the patch below, which controls the repository size, but no fix is mentioned specifically for either of the bugs above.

Patch 10165314: CHM/CRF/IPDOS REPOSITORY EXCEEDS 1GB AFTER ADD/REMOVE NODE OR FRESH INSTALL



Workaround/Solution.

A temporary workaround is to delete the CHM repository files. ora.crf must be down while you clean up.


For details, check Metalink note 1343105.1: Oracle Cluster Health Monitor (CHM) using large amount of space (more than default)


Set the Oracle ASM home.

$ . oraenv
+ASM1

$ crsctl stat res ora.crf -init
NAME=ora.crf
TYPE=ora.crf.type
TARGET=ONLINE
STATE=ONLINE on cmsdb02


$ oclumon manage -get reppath

CHM Repository Path = /cluster/app/11.2.0/grid/crf/db/cmsdb02
Done

# Stop CHM (Cluster health monitor) process ora.crf

$ crsctl stop res ora.crf -init
CRS-2673: Attempting to stop 'ora.crf' on 'cmsdb02'
CRS-2677: Stop of 'ora.crf' on 'cmsdb02' succeeded


# Delete the relevant files. Remember that you need root access to delete them, as they are owned by root.

$ cd /cluster/app/11.2.0/grid/crf/db/cmsdb02
$ rm *.bdb


# Start CHM (Cluster health monitor) process ora.crf

$ crsctl start res ora.crf -init


# Check process is back up and running


$ crsctl stat res ora.crf -init
NAME=ora.crf
TYPE=ora.crf.type
TARGET=ONLINE
STATE=ONLINE on cmsdb02
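
The stop / delete / start / check steps above can also be strung together. The following is just a sketch of that sequence, not an Oracle-supplied script: cleanup_chm_repo and the CRSCTL override variable are names I made up, and it assumes you run it as root with crsctl on the PATH. It refuses to delete anything if ora.crf could not be stopped.

```shell
#!/bin/sh
# Sketch: the manual cleanup steps from this post as one function.
# cleanup_chm_repo and CRSCTL are hypothetical names for illustration.
# CRSCTL lets you point at a specific crsctl binary (default: crsctl on PATH).
cleanup_chm_repo() {
    repo_dir="$1"
    crsctl_bin="${CRSCTL:-crsctl}"

    # Refuse to run against a path that does not exist
    [ -d "$repo_dir" ] || { echo "no such directory: $repo_dir" >&2; return 1; }

    # Stop CHM first; do not delete anything if the stop fails
    "$crsctl_bin" stop res ora.crf -init || return 1

    # Remove only the *.bdb repository files, as in the manual step
    rm -f "$repo_dir"/*.bdb

    # Start CHM again and show its state
    "$crsctl_bin" start res ora.crf -init || return 1
    "$crsctl_bin" stat res ora.crf -init
}

# Example:
#   cleanup_chm_repo /cluster/app/11.2.0/grid/crf/db/cmsdb02
```

Try it on a test node first, and repeat it per node, because each node has its own repository directory.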
