in RAC by ACE (20,920 points)

1 Answer

by ACE (20,920 points)

Frequently, when a node reboots, the log of the CSS daemon (ocssd.log) indicates that the network heartbeat from one or more remote nodes was not received, and that the node was subsequently rebooted. For example, the following message appears in ocssd.log: "CRS-1610:Network communication with node xxxxxx (3) missing for 90% of timeout interval. Removal of this node from cluster in 2.656 seconds".
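To check whether a node shows this pattern, one option is to search its ocssd.log for the CRS-1610 message. The following is only a minimal sketch: the function name find_missed_heartbeats is hypothetical, and the location of ocssd.log depends on the installation, so the path in the usage comment is a placeholder.

```shell
# Hypothetical helper: print every CRS-1610 line found in a CSS daemon log.
# $1 = path to ocssd.log (the location is installation-specific).
find_missed_heartbeats() {
    grep "CRS-1610" "$1"
}

# Example usage (placeholder path; adjust to your installation):
# find_missed_heartbeats /path/to/ocssd.log
```

Comparing the timestamps of any matching lines with the time of the reboot helps confirm that the missed heartbeat preceded it.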

 

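The following script is an example from a three-node cluster. Every 5 seconds it connects over ssh to each node's private interconnect hostname and appends the remote hostname and date to a daily log file in /tmp: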

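#!/bin/ksh
# Poll each node's private interconnect hostname over ssh every 5 seconds
# and append the remote hostname and date to a per-day log file.

END_DATE=20121231   # stop date; format needs to be YearMonthDate (YYYYMMDD)

TODAY=$(date "+%Y%m%d")
while [ "$TODAY" -lt "$END_DATE" ]
do
    # Refresh the date on every pass so the log file rolls over at midnight
    TODAY=$(date "+%Y%m%d")
    LOGFILE=/tmp/interconnect_test_${TODAY}.log

    for NODE in drrac1-priv drrac2-priv drrac3-priv
    do
        # Capture ssh errors (stderr) in the log as well
        ssh "$NODE" "hostname; date" >> "$LOGFILE" 2>&1
    done

    # Two blank lines separate successive polling rounds
    echo "" >> "$LOGFILE"
    echo "" >> "$LOGFILE"

    sleep 5
done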
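Because the script redirects stderr into the same log, failed connection attempts should show up there alongside the normal hostname and date output. The following is a rough sketch for pulling them out. It assumes the OpenSSH client, whose connection errors normally begin with "ssh:"; other error text (for example authentication failures) would need a different pattern. The function name find_ssh_failures is hypothetical.

```shell
# Hypothetical helper: list ssh connection errors recorded in an
# interconnect test log, prefixed with their line numbers.
# Assumes OpenSSH-style error lines that start with "ssh:".
# $1 = path to a log written by the polling script.
find_ssh_failures() {
    grep -n "^ssh:" "$1"
}

# Example usage (file name follows the script's naming pattern):
# find_ssh_failures /tmp/interconnect_test_20121203.log
```

The date lines just before and after each reported error give an approximate window for when the private network was unreachable.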

 

...