We received an alert that the file system holding the RDBMS home was more than 90% full, and investigation revealed that the crf directory under the Grid Infrastructure home had grown to more than 26G.
Once you know the directory is $ORACLE_HOME/crf, you can easily guess that these files are the repository of the Oracle Cluster Health Monitor (CHM).
This issue was observed in our two-node cluster running Oracle Grid Infrastructure 11g 11.2.0.4.0 on Linux.
--- To confirm / cross-check.
$ORACLE_HOME/bin/oclumon manage -get reppath
CHM Repository Path = /cluster/app/11.2.0/grid/crf/db/cmsdb02
Done
[root@cmsdb02 cmsdb02]# cd /cluster/app/11.2.0/grid/crf/db/cmsdb02
[root@cmsdb02 cmsdb02]# ls -lrt
total 25830096
-rw-r----- 1 root root 8192 Sep 2 07:07 repdhosts.bdb
-rw-r----- 1 root root 24576 Sep 2 07:07 __db.001
-rw-r----- 1 root root 8192 Sep 2 07:10 crfconn.bdb
-rw-r--r-- 1 root root 120000000 Sep 2 07:11 cmsdb02.ldb
-rw-r----- 1 root root 16777216 Oct 27 13:11 log.0000030408
-rw-r----- 1 root root 16777216 Oct 27 13:42 log.0000030409
-rw-r----- 1 root root 401408 Oct 27 13:42 __db.002
-rw-r----- 1 root root 329068544 Oct 27 13:42 crfts.bdb
-rw-r----- 1 root root 508063744 Oct 27 13:42 crfloclts.bdb
-rw-r----- 1 root root 399507456 Oct 27 13:42 crfhosts.bdb
-rw-r----- 1 root root 404766720 Oct 27 13:42 crfcpu.bdb
-rw-r----- 1 root root 24340660224 Oct 27 13:42 crfclust.bdb
-rw-r----- 1 root root 407449600 Oct 27 13:42 crfalert.bdb
-rw-r----- 1 root root 57344 Oct 27 13:43 __db.006
-rw-r----- 1 root root 1187840 Oct 27 13:43 __db.005
-rw-r----- 1 root root 2162688 Oct 27 13:43 __db.004
-rw-r----- 1 root root 2629632 Oct 27 13:43 __db.003
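The check above can be scripted so the largest repository files stand out at a glance. This is a minimal sketch, not part of any Oracle tooling: the function names chm_reppath and chm_top_files are hypothetical, and it assumes the oclumon output format shown above. Run it as root, since the repository files are owned by root.

```shell
#!/bin/sh
# Hypothetical helpers (names invented for this example).

# Read the output of "oclumon manage -get reppath" on stdin and
# print only the path from the "CHM Repository Path = ..." line.
chm_reppath() {
    sed -n 's/^CHM Repository Path = //p'
}

# Print "<size in bytes> <file>" for the regular files in a directory,
# largest first. Usage: chm_top_files <directory> [count]
chm_top_files() {
    dir=${1:?usage: chm_top_files <directory> [count]}
    count=${2:-5}
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # wc -c reports the file size in bytes, as ls -l does
        printf '%s %s\n' "$(wc -c < "$f" | tr -d ' ')" "$f"
    done | sort -rn | head -n "$count"
}
```

A possible invocation on the affected node would be: chm_top_files "$($ORACLE_HOME/bin/oclumon manage -get reppath | chm_reppath)" 3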
Bug/Fix :
Checking on Metalink confirmed it is a bug:
Bug 12711827 : HUGE CRFCLUST.BDB WHICH CAN NOT BE REDUCED IN FILESYSTEM 11.2.0.2
Bug 14479330 : HUGE SIZE OF CRFCLUST.BDB - ORACLE 11.2.0.3
It looks like the base bug was fixed in 11.2.0.3 by the patch below, which controls the repository size, but no fix is listed specifically for either of the bugs above.
Patch 10165314: CHM/CRF/IPDOS REPOSITORY EXCEEDS 1GB AFTER ADD/REMOVE NODE OR FRESH INSTALL
Workaround/Solution.
A temporary workaround is to delete the CHM repository files. The ora.crf resource must be down while you clean up.
For details, check Metalink note 1343105.1: Oracle Cluster Health Monitor (CHM) using large amount of space (more than default)
Set the environment to the Oracle ASM home.
$ . oraenv
+ASM1
$ crsctl stat res ora.crf -init
NAME=ora.crf
TYPE=ora.crf.type
TARGET=ONLINE
STATE=ONLINE on cmsdb02
$ oclumon manage -get reppath
CHM Repository Path = /cluster/app/11.2.0/grid/crf/db/cmsdb02
Done
# Stop the CHM (Cluster Health Monitor) resource ora.crf
$ crsctl stop res ora.crf -init
CRS-2673: Attempting to stop 'ora.crf' on 'cmsdb02'
CRS-2677: Stop of 'ora.crf' on 'cmsdb02' succeeded
# Delete the relevant files. Remember that you need root access to delete them, as they are owned by root.
$ cd /cluster/app/11.2.0/grid/crf/db/cmsdb02
$ rm *.bdb
# Start the CHM (Cluster Health Monitor) resource ora.crf
$ crsctl start res ora.crf -init
# Check that the resource is back up and running
$ crsctl stat res ora.crf -init
NAME=ora.crf
TYPE=ora.crf.type
TARGET=ONLINE
STATE=ONLINE on cmsdb02
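The steps above can be collected into one small script. This is only a sketch under stated assumptions: the function names run and clean_chm_repo and the DRY_RUN variable are hypothetical, crsctl is assumed to be on the PATH after sourcing oraenv, and crsctl is assumed to return a non-zero exit status when the stop fails. By default it only prints what it would do; set DRY_RUN=0 and run as root to actually execute the commands.

```shell
#!/bin/sh
# Hypothetical wrapper around the manual workaround steps.
# DRY_RUN=1 (the default) prints the commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Usage: clean_chm_repo <repository directory>
# The directory is the path reported by "oclumon manage -get reppath".
clean_chm_repo() {
    repo=${1:?usage: clean_chm_repo <repository directory>}
    [ -d "$repo" ] || { echo "not a directory: $repo" >&2; return 1; }

    # ora.crf must be down before the files are removed
    run crsctl stop res ora.crf -init || return 1

    # Remove only the *.bdb files, as in the manual steps
    for f in "$repo"/*.bdb; do
        [ -e "$f" ] || continue
        run rm -f "$f"
    done

    run crsctl start res ora.crf -init || return 1
    run crsctl stat res ora.crf -init
}
```

Reviewing the dry-run output before setting DRY_RUN=0 helps confirm that only files under the intended repository path would be deleted.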