in RAC by ACE (20,920 points)

1 Answer

by ACE (20,920 points)

This is due to a missing network heartbeat or a split-brain condition. In a two-node environment, repeated reboots of node 2 normally mean that node 2 is being evicted because of split brain. The ocssd.log shows a missing network heartbeat or a split-brain message before the node is rebooted.
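To confirm this from the log, one option is to scan ocssd.log for lines that mention a heartbeat or split brain and look at what was logged just before the reboot. The following is only a minimal sketch: the search terms are an assumption, since the exact message wording differs between Clusterware versions, and `find_eviction_hints` is a hypothetical helper name.

```python
import re
from typing import Iterable, List, Tuple

# Assumed search terms: the exact wording of ocssd.log messages varies by
# version, so adjust this pattern to what your log actually contains.
EVICTION_HINTS = re.compile(r"heartbeat|split\s*-?\s*brain", re.IGNORECASE)


def find_eviction_hints(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for each line mentioning a heartbeat or split brain."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if EVICTION_HINTS.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits
```

To use it, open the ocssd.log of the evicted node (its location depends on your installation) and pass the file object to `find_eviction_hints`. The last few hits before the reboot time are the ones worth reading.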

 

Cause: Network communication over the private interconnect between the nodes failed. The failure can be unidirectional or bidirectional.

 

Solution: Fix the network problem. Make sure all network components, such as the switch and the NICs, are working. Make sure ssh works over the private interconnect. Note that the network often works again after the node is rebooted.
Note: If jumbo frames are used, refer to note 341788.1 (Recommendation for the Real Application Cluster Interconnect and Jumbo Frames). The network problem can occur and lead to node evictions or CRS startup problems if the switch is not configured to match the MTU (jumbo frame) setting of the NICs. Sometimes switches and NICs from different vendors do not provide the same support for jumbo frames.
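One way to check whether the switch really passes jumbo frames end to end is to ping the other node's private address with fragmentation disabled and a payload sized to fill the MTU. The sketch below only computes the payload size and builds the command. It assumes IPv4, a Linux `ping` from iputils (where `-M do` forbids fragmentation), and an MTU of 9000 as an example jumbo setting. The host name `node2-priv` is hypothetical.

```python
from typing import List

IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in a single IPv4 packet of size `mtu`."""
    overhead = IPV4_HEADER_BYTES + ICMP_HEADER_BYTES
    if mtu <= overhead:
        raise ValueError(f"MTU must be larger than {overhead} bytes")
    return mtu - overhead


def build_df_ping(host: str, mtu: int, count: int = 3) -> List[str]:
    """Build a Linux iputils ping command that sends full-MTU packets unfragmented."""
    payload = icmp_payload_for_mtu(mtu)
    return ["ping", "-M", "do", "-s", str(payload), "-c", str(count), host]


# Example: node2-priv is a hypothetical private interconnect host name.
print(" ".join(build_df_ping("node2-priv", 9000)))
```

If the resulting command fails while a ping with a small payload succeeds, a device on the path probably does not accept the configured MTU. Run the test in both directions, since the failure can be one-way.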

 

...