
On my Elasticsearch server I had 3 million documents with a total size of 3.6 GB. I then deleted about 2.8 million documents, which left about 0.13 million documents, but the total size is still 3.6 GB.

I have deleted the documents, so how can I reclaim the disk space they occupied?

Michael

5 Answers

Deleting documents only marks them as deleted so that searches skip them; the data stays on disk until the segments that hold it are merged. To reclaim disk space, you have to optimize the index:

curl -XPOST 'http://localhost:9200/_optimize?only_expunge_deletes=true'

documentation: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-optimize.html

The documentation has moved to: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-forcemerge.html

Update

Starting with Elasticsearch 2.1.x, optimize is deprecated in favor of forcemerge. The API is the same; only the endpoint has changed.

curl -XPOST 'http://localhost:9200/_forcemerge?only_expunge_deletes=true'
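To see whether the merge actually reclaimed anything, it can help to compare the deleted-document count and the store size before and after. The following is a minimal sketch, not part of the original answer: it assumes the cluster is at localhost:9200 as above, `indexname` is a placeholder, and the helper names `cat_indices_url` and `expunge_url` are made up for illustration. The `_cat/indices` columns used are `docs.count`, `docs.deleted`, and `store.size`.

```shell
#!/bin/sh
# Base URL of the cluster; localhost:9200 matches the answer above (assumption).
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the _cat/indices URL that lists live docs, deleted docs, and on-disk size
# for one index (hypothetical helper).
cat_indices_url() {
  printf '%s/_cat/indices/%s?v&h=index,docs.count,docs.deleted,store.size\n' "$ES_URL" "$1"
}

# Build the force-merge URL that only rewrites segments containing deletes
# (hypothetical helper).
expunge_url() {
  printf '%s/%s/_forcemerge?only_expunge_deletes=true\n' "$ES_URL" "$1"
}

# Usage against a live cluster (uncomment to run; indexname is a placeholder):
# curl -s "$(cat_indices_url indexname)"      # note docs.deleted and store.size
# curl -s -XPOST "$(expunge_url indexname)"   # returns when the merge completes
# curl -s "$(cat_indices_url indexname)"      # compare with the first reading
```

If `docs.deleted` is unchanged after the merge, the segments may not have contained enough deletes to be picked up by `only_expunge_deletes`.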
GMc
knutwalker
  • Knut's answer is correct. Here's an article that goes into more detail about why it works that way: https://www.found.no/foundation/elasticsearch-from-the-bottom-up/ – Alex Brasetvik Dec 16 '13 at 11:37
  • You were both helpful. Thanks so much! – Michael Dec 17 '13 at 02:16
  • I agree that it's a good answer, but how much space do you need in order to expunge the deleted documents? I'm running the command above, and so far (it is still running) the index has grown by 10 GB. – Danielson Oct 26 '15 at 15:15
  • The index is split into several segments. An optimize merges those segments and drops the deleted documents in the process. The index data is copied during the merge, which is where the increase comes from. In the worst case a merge requires 2x the current index size, and a merge that large can easily take hours. – knutwalker Oct 26 '15 at 16:16
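As a rough pre-flight check for that worst case, free disk space can be compared with the index size before starting the merge. This is only a sketch under assumptions: the function name `has_merge_headroom` is made up, and the byte counts are example figures. On a real cluster they could be read from `_cat/indices?h=index,store.size&bytes=b` and `_cat/allocation?h=node,disk.avail&bytes=b`.

```shell
#!/bin/sh
# Succeed (exit status 0) when free space covers one extra copy of the index,
# i.e. the worst case of 2x the index size during the merge.
# Both arguments are byte counts (hypothetical helper).
has_merge_headroom() {
  index_bytes=$1
  free_bytes=$2
  [ "$free_bytes" -ge "$index_bytes" ]
}

# Example figures only: a 3.6 GB index taken as 3600000000 bytes, 10 GB free.
if has_merge_headroom 3600000000 10000000000; then
  echo "enough headroom"
else
  echo "not enough headroom"
fi
```

The check is deliberately conservative: it treats the whole index as if it were rewritten at once, which is the upper bound rather than the typical case.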

In the current Elasticsearch version (7.5):

  1. To optimize all indices:

    POST /_forcemerge?only_expunge_deletes=true

  2. To optimize a single index (here, twitter):

    POST /twitter/_forcemerge?only_expunge_deletes=true

  3. To optimize several indices (here, twitter and facebook):

    POST /twitter,facebook/_forcemerge?only_expunge_deletes=true

Reference: https://www.elastic.co/guide/en/elasticsearch/reference/7.5/indices-forcemerge.html#indices-forcemerge
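The requests above are written in console syntax. For use with curl, the three cases can be sketched with a small helper. This assumes the cluster is reachable at localhost:9200, and `forcemerge_url` is a made-up name, not something Elasticsearch provides.

```shell
#!/bin/sh
# Base URL of the cluster (assumption).
ES_URL="${ES_URL:-http://localhost:9200}"

# Print the force-merge URL for zero or more index names (hypothetical helper).
# No arguments targets all indices; several names are joined with commas.
forcemerge_url() {
  if [ "$#" -eq 0 ]; then
    printf '%s/_forcemerge?only_expunge_deletes=true\n' "$ES_URL"
  else
    # "$*" joins the arguments with the first character of IFS, here a comma.
    targets=$(IFS=,; printf '%s' "$*")
    printf '%s/%s/_forcemerge?only_expunge_deletes=true\n' "$ES_URL" "$targets"
  fi
}

# Print the URL for each of the three cases.
forcemerge_url
forcemerge_url twitter
forcemerge_url twitter facebook

# To send a request against a live cluster (uncomment to run):
# curl -XPOST "$(forcemerge_url twitter facebook)"
```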

Community

knutwalker's answer is correct. However, if you are using AWS Elasticsearch and want to free storage space, it will not quite work as written.

On AWS, the index to force-merge must be specified in the URL. It can include wildcards, as is common with index rotation.

curl -XPOST 'https://something.es.amazonaws.com/index-*/_forcemerge?only_expunge_deletes=true'

AWS publishes a list of Elasticsearch API differences.

Steve E.

I just want to note that the 7.15 docs for the Force Merge API include this warning:

Force merge should only be called against an index after you have finished writing to it. Force merge can cause very large (>5GB) segments to be produced, and if you continue to write to such an index then the automatic merge policy will never consider these segments for future merges until they mostly consist of deleted documents. This can cause very large segments to remain in the index which can result in increased disk usage and worse search performance.

So you should shut down writes to the index before beginning.
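One way to do that, sketched here under assumptions, is to set a write block on the index for the duration of the merge using the `index.blocks.write` index setting. The answer above does not say how writes should be stopped, so treat this as one option; `indexname` is a placeholder and the helper names are made up.

```shell
#!/bin/sh
# Base URL of the cluster (assumption).
ES_URL="${ES_URL:-http://localhost:9200}"

# Print the settings body that turns the write block on (true) or off (false)
# (hypothetical helper).
write_block_body() {
  printf '{"index.blocks.write": %s}\n' "$1"
}

# Print the settings URL for one index (hypothetical helper).
settings_url() {
  printf '%s/%s/_settings\n' "$ES_URL" "$1"
}

# Usage against a live cluster (uncomment to run; indexname is a placeholder):
# curl -XPUT "$(settings_url indexname)" -H 'Content-Type: application/json' -d "$(write_block_body true)"
# curl -XPOST "$ES_URL/indexname/_forcemerge?only_expunge_deletes=true"
# curl -XPUT "$(settings_url indexname)" -H 'Content-Type: application/json' -d "$(write_block_body false)"
```

Clients that index into the index will get errors while the block is set, so the last step, removing the block, matters if the index is meant to receive writes again.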

Noumenon

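Replace indexname with the name of your index. This merges the index down to a single segment and drops the deleted documents in the process; the space is freed once the merge completes. The options are URL parameters rather than a request body, and only_expunge_deletes already defaults to false, so it can be left out:

curl -XPOST 'http://localhost:9200/indexname/_forcemerge?max_num_segments=1'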

Vim