I have multiple Elasticsearch clusters; every cluster has the same indices with the same data and the same number of documents. However, there is a significant difference in index size between them. I tried the force merge API, but it didn't help. Because of this, Elasticsearch eventually runs out of space:

{
    "state": "UNASSIGNED",
    "primary": true,
    "node": null,
    "relocating_node": null,
    "shard": 3,
    "index": "local-deals-1624295772015",
    "recovery_source":
    {
        "type": "EXISTING_STORE"
    },
    "unassigned_info":
    {
        "reason": "ALLOCATION_FAILED",
        "at": "2021-08-18T19:14:20.472Z",
        "failed_attempts": 20,
        "delayed": false,
        "details": "shard failure, reason [lucene commit failed], failure IOException[No space left on device]",
        "allocation_status": "deciders_no"
    }
}

I have configured the Elasticsearch cluster to have no more than 2 shards per node to improve query performance.

Cluster-1: [screenshot of index sizes]

Cluster-2: [screenshot of index sizes]

Given these two clusters with the same documents, there is a difference of ~90% in the index size, which does not make sense to me. Can someone explain this behavior?

My quick fix is to increase the EBS volume.

In response to @Val's question: yes, there are many documents marked for deletion:

"5": {
    "health": "yellow",
    "status": "open",
    "index": "local-deals-1624295772015",
    "uuid": "s7QDLtuhRN6HM_VwtVTB0Q",
    "pri": "6",
    "rep": "1",
    "docs.count": "8911560",
    "docs.deleted": "18826270",
    "store.size": "37gb",
    "pri.store.size": "19.9gb"
}
Vishrant
  • For one, in the second cluster you have replica shards, which already contribute a factor of 2. Also, can you share the result of `GET _cat/indices?v`? You might have a lot of documents flagged as deleted (e.g. if you update them frequently). – Val Aug 19 '21 at 05:43
  • @Val thanks for the input. There are multiple documents marked for deletion, so it seems like that is the issue? Do you know how to completely clean up the deleted documents? – Vishrant Aug 19 '21 at 20:41
  • https://stackoverflow.com/a/20608904/2704032 will delete the documents and reclaim the space, but the issue with the `_forcemerge?only_expunge_deletes=true` API is that it's a blocking call and will make the Elasticsearch cluster unresponsive to search requests. – Vishrant Aug 19 '21 at 21:25

1 Answer


You can indeed try running _forcemerge. It is not a blocking call: it triggers an asynchronous task that runs in the background until the job is done. You don't need to wait for the call to return in order for the segments to be merged.
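A minimal sketch of the calls involved, using the index name from the question (Kibana Dev Tools syntax). `only_expunge_deletes=true` is optional; without it a full merge is performed. Even if the HTTP connection times out before the merge finishes, the merge keeps running server-side, and you can check on it through the task management API:

```
POST /local-deals-1624295772015/_forcemerge?only_expunge_deletes=true

GET _tasks?actions=*forcemerge*&detailed=true
```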

Also be aware that this will not remove all deleted documents, but a good portion of them, depending on the deleted/docs ratio.
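To gauge that ratio for a single index, the `_cat/indices` output already shown in the question can be narrowed to the relevant columns. Here, `docs.deleted` (18.8M) is more than twice `docs.count` (8.9M), so the bulk of the store size is deleted documents waiting to be merged away:

```
GET _cat/indices/local-deals-1624295772015?v&h=index,docs.count,docs.deleted,store.size
```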

You can find more info on the different merge settings in the MergePolicyConfig.java class.
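As a sketch, two of the dynamic index settings backed by that class can be tuned to make background merges reclaim deleted documents more aggressively. The exact defaults and valid ranges vary by Elasticsearch version, so treat the values below as illustrative, not as recommendations:

```
PUT /local-deals-1624295772015/_settings
{
  "index.merge.policy.deletes_pct_allowed": 20,
  "index.merge.policy.expunge_deletes_allowed": 10
}
```

`deletes_pct_allowed` caps the percentage of deleted documents tolerated in the index as a whole, while `expunge_deletes_allowed` sets the deleted-docs percentage a segment must exceed before an `only_expunge_deletes` merge will consider it.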

Val
  • I was reading this document https://aws.amazon.com/premiumsupport/knowledge-center/es-deleted-documents/ and it says `The force merge operation triggers an I/O intensive process and blocks all new requests to your cluster until the merge is complete` – Vishrant Aug 20 '21 at 14:23
  • That's not correct; you can still perform queries on your cluster while forcemerge is running. It is true, though, that it is a resource-intensive operation, so you should trigger it with care, off-peak. – Val Aug 20 '21 at 14:25
  • Do you know if there is any setting that makes Elasticsearch run the merge operation more frequently? – Vishrant Aug 20 '21 at 14:35
  • It's running all the time, see this video: https://www.youtube.com/watch?v=YW0bOvLp72E – Val Aug 20 '21 at 14:37