I have set up MySQL NDB Cluster 7.3.5 and the cluster was working fine.
The cluster has 4 nodes:
NodeA : SQLNode1, DataNode1
NodeB : SQLNode2, DataNode2
NodeC : Mgmt Node1
NodeD : Mgmt Node2
To test the server reboot scenario, I rebooted VMware ESXi and restarted all VMs.
Since then, the data nodes have failed to start.
Here are the logs from each server:
/home/mysql/mysqlcluster_data/1/ndb_1_out.log (Data Node 1)
error: [ code: 708 line: 38848236 node: 1 count: 1 status: 32687 key: 445914048 name: 'hhmefep/def/fgvmev0000000000-elog-1398414831' ]
2014-05-13 13:16:40 [ndbd] INFO -- Failed to recreate object 505 during restart, error 708.
2014-05-13 13:16:40 [ndbd] INFO -- DBDICT (Line: 4688) 0x00000000
2014-05-13 13:16:40 [ndbd] INFO -- Error handler restarting system
2014-05-13 13:16:40 [ndbd] INFO -- Error handler shutdown completed - exiting
2014-05-13 13:16:40 [ndbd] ALERT -- Angel detected too many startup failures(3), not restarting again
2014-05-13 13:16:40 [ndbd] ALERT -- Node 1: Forced node shutdown completed. Occured during startphase 4. Caused by error 2355: 'Failure to restore schema(Resource configuration error). Permanent error, external action needed'.
It seems that the nodes are failing to recover this table: hhmefep.fgvmev0000000000-elog-1398414831
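For reference, the table name can be pulled out of the error record mechanically. This is a minimal sketch, assuming the `name` field uses a database/schema/table layout (which matches my reading above); the function name `parse_error_record` is my own and not part of any MySQL tool.

```python
import re

# Error record copied verbatim from ndb_1_out.log above.
LINE = ("error: [ code: 708 line: 38848236 node: 1 count: 1 status: 32687 "
        "key: 445914048 name: 'hhmefep/def/fgvmev0000000000-elog-1398414831' ]")


def parse_error_record(line):
    """Return the key/value fields of an 'error: [ ... ]' record as a dict.

    Hypothetical helper: the field layout is inferred from the one log
    line shown here, not from NDB documentation.
    """
    fields = {}
    # Numeric fields look like "code: 708"; the name field is single-quoted.
    for key, value in re.findall(r"(\w+): (\d+|'[^']*')", line):
        fields[key] = value.strip("'") if value.startswith("'") else int(value)
    return fields


record = parse_error_record(LINE)
# Assumption: the name is laid out as database/schema/table.
database, schema, table = record["name"].split("/", 2)
print(record["code"], database, table)
```

The error code (708) and the object name are the two values that tie this record to the "Failed to recreate object 505" message that follows it in the log.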
/home/mysql/mysqlcluster_data/2/ndb_2_out.log (Data Node 2)
2014-05-13 13:05:48 [ndbd] INFO -- Start phase 1 completed
2014-05-13 13:05:48 [ndbd] INFO -- Start phase 2 completed
2014-05-13 13:05:48 [ndbd] INFO -- Start phase 3 completed
2014-05-13 13:05:51 [ndbd] INFO -- Node 1 disconnected
2014-05-13 13:05:51 [ndbd] INFO -- QMGR (Line: 3308) 0x00000000
2014-05-13 13:05:51 [ndbd] INFO -- Error handler restarting system
2014-05-13 13:05:51 [ndbd] INFO -- Error handler shutdown completed - exiting
2014-05-13 13:05:51 [ndbd] ALERT -- Angel detected too many startup failures(3), not restarting again
2014-05-13 13:05:51 [ndbd] ALERT -- Node 2: Forced node shutdown completed. Occured during startphase 4. Caused by error 2308: 'Another node failed during system restart, please investigate error(s) on other node(s)(Restart error). Temporary error, restart node'.
It seems that data node 2 is trying to sync with data node 1 but has been forcibly shut down by the management node.
(Mgmt Node)
ndb_mgm> Node 1: Forced node shutdown completed, restarting. Occured during startphase 4. Caused by error 2355: 'Failure to restore schema(Resource configuration error). Permanent error, external action needed'.
Node 1: Forced node shutdown completed, restarting. Occured during startphase 4. Caused by error 2355: 'Failure to restore schema(Resource configuration error). Permanent error, external action needed'.
Node 1: Forced node shutdown completed. Occured during startphase 4. Caused by error 2355: 'Failure to restore schema(Resource configuration error). Permanent error, external action needed'.
Node 2: Forced node shutdown completed, restarting. Occured during startphase 4. Caused by error 2308: 'Another node failed during system restart, please investigate error(s) on other node(s)(Restart error). Temporary error, restart node'.
Node 2: Forced node shutdown completed, restarting. Occured during startphase 4. Caused by error 2355: 'Failure to restore schema(Resource configuration error). Permanent error, external action needed'.
ndb_mgm> Node 2: Forced node shutdown completed. Occured during startphase 4. Caused by error 2355: 'Failure to restore schema(Resource configuration error). Permanent error, external action needed'.
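To make the sequence of failures easier to read, the management client output above can be tallied per node and error code. This is only a rough sketch: the regular expression is inferred from the six messages shown here, and `summarize` is a hypothetical helper name.

```python
import re
from collections import Counter

# Management client output copied verbatim from above.
MGM_OUTPUT = """\
ndb_mgm> Node 1: Forced node shutdown completed, restarting. Occured during startphase 4. Caused by error 2355: 'Failure to restore schema(Resource configuration error). Permanent error, external action needed'.
Node 1: Forced node shutdown completed, restarting. Occured during startphase 4. Caused by error 2355: 'Failure to restore schema(Resource configuration error). Permanent error, external action needed'.
Node 1: Forced node shutdown completed. Occured during startphase 4. Caused by error 2355: 'Failure to restore schema(Resource configuration error). Permanent error, external action needed'.
Node 2: Forced node shutdown completed, restarting. Occured during startphase 4. Caused by error 2308: 'Another node failed during system restart, please investigate error(s) on other node(s)(Restart error). Temporary error, restart node'.
Node 2: Forced node shutdown completed, restarting. Occured during startphase 4. Caused by error 2355: 'Failure to restore schema(Resource configuration error). Permanent error, external action needed'.
ndb_mgm> Node 2: Forced node shutdown completed. Occured during startphase 4. Caused by error 2355: 'Failure to restore schema(Resource configuration error). Permanent error, external action needed'.
"""

# Pattern inferred from the messages above (an assumption, not a documented format).
PATTERN = re.compile(
    r"Node (\d+): Forced node shutdown completed.*?"
    r"Caused by error (\d+): '.*?\. (Permanent|Temporary) error"
)


def summarize(text):
    """Count forced-shutdown messages per (node id, error code, error kind)."""
    counts = Counter()
    for line in text.splitlines():
        match = PATTERN.search(line)
        if match:
            node, code, kind = match.groups()
            counts[(int(node), int(code), kind)] += 1
    return counts


summary = summarize(MGM_OUTPUT)
for (node, code, kind), n in sorted(summary.items()):
    print(f"node {node}: error {code} ({kind}) x{n}")
```

The tally shows the permanent error 2355 on both nodes, and the temporary error 2308 only on node 2.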
Please help me with this; it is very frustrating.