Ran out of diskspace and that screwed the elasticsearch shards. Three nodes are now in red, two got recovered and their state is yellow. ES is running 150% on CPU and high on memory, trying to recover them. But looks like there is some version match conflict.
I cleared up the disk space and deleted the translog for a shard to stop loading from translog. But surprisingly the translog gets created again!
Please share how can I stop this attempt to recover from translog and resume normal index operations. I do not want to delete the shard data.
[2014-10-31 03:11:43,742][WARN ][cluster.action.shard ] [Angela Cairn] [western_europe][4] sending failed shard for [western_europe][4], node[x5M73qVXS5eZIBdz40boEg], [P], s[INITIALIZING], indexUUID [wy-tIJqdQiynz5SGQ2IrGA], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[western_europe][4] failed to recover shard]; nested: ElasticsearchException[failed to read [tweet][527924645014818817]]; nested: ElasticsearchIllegalArgumentException[No version type match [101]]; ]]
[2014-10-31 03:11:43,742][WARN ][cluster.action.shard ] [Angela Cairn] [western_europe][4] received shard failed for [western_europe][4], node[x5M73qVXS5eZIBdz40boEg], [P], s[INITIALIZING], indexUUID [wy-tIJqdQiynz5SGQ2IrGA], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[western_europe][4] failed to recover shard]; nested: ElasticsearchException[failed to read [tweet][527924645014818817]]; nested: ElasticsearchIllegalArgumentException[No version type match [101]]; ]]
[2014-10-31 03:11:43,859][WARN ][indices.cluster ] [Angela Cairn] [western_europe][2] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [western_europe][2] failed to recover shard
at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:269)
at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.ElasticsearchException: failed to read [tweet][527936245440065536]
at org.elasticsearch.index.translog.Translog$Index.readFrom(Translog.java:511)
at org.elasticsearch.index.translog.TranslogStreams.readTranslogOperation(TranslogStreams.java:52)
at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:241)
... 4 more
Caused by: org.elasticsearch.ElasticsearchIllegalArgumentException: No version type match [116]
at org.elasticsearch.index.VersionType.fromValue(VersionType.java:307)
at org.elasticsearch.index.translog.Translog$Index.readFrom(Translog.java:508)