If you run an Oracle Clusterware 11g Release 1 cluster and face a cluster outage caused by node evictions, you may well be hitting bug 860459.
To check whether you are hitting this issue, look for the error below in the ocssd.log files in the Clusterware home ($ORACLE_HOME/log/<hostname>/cssd/ocssd*).
Search for "exceeds max value". If you find lines like these, you are hitting the issue.
[ CSSD]2010-01-25 [1115699552] >ERROR: ASSERT clssgm.c 1594
[ CSSD]2010-01-25 [1115699552] >ERROR: clssgmGetGrock: Group ID of 702053 exceeds max value for global groups
[ CSSD]2010-01-25 [1115699552] >TRACE: clssgmDiscOmonReady: omon was posted for member 3
[ CSSD]2010-01-25 [1115699552] >ERROR: ###################################
[ CSSD]2010-01-25 [1115699552] >ERROR: clssscExit: CSSD aborting from thread GM Peer Lsnr
[ CSSD]2010-01-25 [1115699552] >ERROR: ###################################
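The search described above can be scripted when there are many rotated ocssd files to check. The following is a minimal sketch in Python, not an Oracle-supplied tool. The function names `find_matches` and `scan_cssd_logs` are made up for this illustration, and it assumes that ORACLE_HOME points at the Clusterware home and that the log directory is named after the short hostname.

```python
import glob
import os
import socket

# The text to look for, as quoted in the log excerpt above.
PATTERN = "exceeds max value"


def find_matches(lines, pattern=PATTERN):
    """Return (line_number, text) pairs for every line containing pattern."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if pattern in line
    ]


def scan_cssd_logs(log_glob):
    """Scan every file matching log_glob; return {path: matches} for files with hits."""
    hits = {}
    for path in sorted(glob.glob(log_glob)):
        # errors="replace" keeps the scan going if a log contains odd bytes.
        with open(path, errors="replace") as handle:
            matches = find_matches(handle)
        if matches:
            hits[path] = matches
    return hits


if __name__ == "__main__":
    # Assumption: ORACLE_HOME is the Clusterware home and the log directory
    # uses the short hostname. Adjust both for your environment.
    home = os.environ.get("ORACLE_HOME", "")
    host = socket.gethostname().split(".")[0]
    log_glob = os.path.join(home, "log", host, "cssd", "ocssd*")
    for path, matches in scan_cssd_logs(log_glob).items():
        for number, text in matches:
            print(f"{path}:{number}: {text}")
```

If the script prints nothing, none of the scanned files contain the message. Run it on every node, since each node keeps its own ocssd log files.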
Note also that the bug is only triggered when the master node is evicted.
To find the master node, search for CLSS-3001.
[ CSSD]CLSS-3000: reconfiguration successful, incarnation 103410002 with 1 nodes
[ CSSD]CLSS-3001: local node number 1, master node number 1
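The master node lookup can be scripted as well. The following is a rough sketch, assuming the CLSS-3001 message always has the exact wording shown in the excerpt above. The helper name `last_master_node` is hypothetical. It returns the values from the most recent CLSS-3001 line, since the master can change with each reconfiguration.

```python
import re

# Matches the message format shown in the excerpt above (an assumption
# about the wording, based only on that sample line).
CLSS_3001 = re.compile(
    r"CLSS-3001: local node number (\d+), master node number (\d+)"
)


def last_master_node(lines):
    """Return (local_node, master_node) from the last CLSS-3001 line, or None."""
    result = None
    for line in lines:
        match = CLSS_3001.search(line)
        if match:
            # Later lines overwrite earlier ones, so the final value
            # reflects the most recent reconfiguration in the input.
            result = (int(match.group(1)), int(match.group(2)))
    return result


if __name__ == "__main__":
    sample = [
        "[ CSSD]CLSS-3000: reconfiguration successful, incarnation 103410002 with 1 nodes",
        "[ CSSD]CLSS-3001: local node number 1, master node number 1",
    ]
    print(last_master_node(sample))
```

When the two numbers are equal, the node that wrote the log was the master at that point.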