I have multiple Elasticsearch clusters; every cluster has the same indices, the same data, and the same number of documents, yet there is a significant difference in index size. I tried the force merge API (see the sketch after the error output below), but it is not helping. The problem is that, because of this, Elasticsearch eventually runs out of space:
{
    "state": "UNASSIGNED",
    "primary": true,
    "node": null,
    "relocating_node": null,
    "shard": 3,
    "index": "local-deals-1624295772015",
    "recovery_source": {
        "type": "EXISTING_STORE"
    },
    "unassigned_info": {
        "reason": "ALLOCATION_FAILED",
        "at": "2021-08-18T19:14:20.472Z",
        "failed_attempts": 20,
        "delayed": false,
        "details": "shard failure, reason [lucene commit failed], failure IOException[No space left on device]",
        "allocation_status": "deciders_no"
    }
}
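For context, this is roughly how I call the force merge API. It is a minimal sketch using the Python elasticsearch client; the endpoint URL is a placeholder for my setup, and only_expunge_deletes asks Lucene to rewrite only the segments that contain deleted documents:

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder endpoint

# Rewrite segments containing deleted docs so their space can be reclaimed.
# only_expunge_deletes avoids merging everything down to a single huge segment.
es.indices.forcemerge(
    index="local-deals-1624295772015",
    only_expunge_deletes=True,
)

Even after this completes, the store size stays roughly the same.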
I have configured the Elasticsearch cluster to allow no more than 2 shards per node, in order to improve query performance.
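For reference, this is how I express that limit; a sketch assuming the index-level setting index.routing.allocation.total_shards_per_node (there is also a cluster-wide variant) and the 8.x Python client (older clients take body= instead of settings=):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder endpoint

# Cap the number of shards of this index that any single node may hold.
es.indices.put_settings(
    index="local-deals-1624295772015",
    settings={"index.routing.allocation.total_shards_per_node": 2},
)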
Given that these two clusters hold the same documents, a 90% difference in index size makes no sense to me. Can someone explain this behavior?
My quick fix is to increase the EBS volume.
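Since the shard above already has 20 failed allocation attempts, I expect that after growing the volume I will also have to retry allocation manually; a sketch under the same client assumptions as above:

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder endpoint

# Re-attempt allocation of shards that exhausted their allocation retries
# (the unassigned shard above shows "failed_attempts": 20).
es.cluster.reroute(retry_failed=True)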
In response to @Val's question: yes, there are many documents marked for deletion:
"5": {
"health": "yellow",
"status": "open",
"index": "local-deals-1624295772015",
"uuid": "s7QDLtuhRN6HM_VwtVTB0Q",
"pri": "6",
"rep": "1",
"docs.count": "8911560",
"docs.deleted": "18826270",
"store.size": "37gb",
"pri.store.size": "19.9gb"
}
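For completeness, this is how I pull those numbers; a sketch that reads docs.count and docs.deleted from the cat indices API and prints the share of deleted documents (same placeholder endpoint as above):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder endpoint

rows = es.cat.indices(
    index="local-deals-1624295772015",
    format="json",
    h="index,docs.count,docs.deleted,store.size,pri.store.size",
)
for row in rows:
    live = int(row["docs.count"])
    deleted = int(row["docs.deleted"])
    # Here that is ~18.8M deleted vs ~8.9M live documents, i.e. roughly
    # two thirds of the docs sitting in the Lucene segments are tombstones.
    print(row["index"], f"deleted: {deleted / (live + deleted):.0%}")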