25

How can I remove old data from an Elasticsearch index? The index has a large amount of data inserted every day.

sri

3 Answers

22

You can do that with the delete-by-query plugin.

Assuming you have a timestamp or creation-date field in your index, your query would look something like this:

DELETE /your_index/your_type/_query
{
  "query": {
    "range": {
      "timestamp": {
        "lte": "now-10y"
      }
    }
  }
}

This will delete records older than 10 years.
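
For running the same query from a script rather than by hand, here is a minimal Python sketch. It assumes Elasticsearch 5.0 or later, where delete-by-query is part of core and is exposed as `POST /<index>/_delete_by_query` instead of the plugin endpoint shown above. The helper names `build_delete_by_query` and `send` are made up for illustration, and the host, index, and field values are the placeholders from this answer.

```python
import json
import urllib.request


def build_delete_by_query(base_url, index, field, older_than):
    """Build (but do not send) a delete-by-query request.

    Assumption: Elasticsearch 5.0+ with the core _delete_by_query endpoint.
    """
    body = {"query": {"range": {field: {"lte": older_than}}}}
    return urllib.request.Request(
        url=f"{base_url}/{index}/_delete_by_query",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (needs a reachable cluster, so it is left commented out):
# req = build_delete_by_query("http://localhost:9200", "your_index",
#                             "timestamp", "now-10y")
# print(send(req))
```

Keeping the request construction separate from the network call makes it easy to check the URL and body before pointing the script at a real cluster.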

I hope this helps

ChintanShah25
  • Is there any way to do this as a script, so we don't have to run the query manually? – sri Dec 09 '15 at 03:23
  • Does this delete query erase the records completely from the index and free up space for new records? – sri Dec 09 '15 at 03:26
  • 1
    You can set up a `cron job` to run it daily. Records are **not** erased as soon as you perform the delete; they are **marked** as deleted and are actually removed during segment merging. You might see an increase in index size after a deletion or update. You can use [force merge](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-forcemerge.html) to optimize your index – ChintanShah25 Dec 09 '15 at 03:46
  • What is segment merging? Does the index size decrease after segment merging? – sri Dec 09 '15 at 03:51
  • 1
    Go through [this](https://www.elastic.co/guide/en/elasticsearch/guide/current/merge-process.html) to understand segment merging. You can't decrease the size like that; it's just that the deleted data won't be there after the segments are merged. Also [read](https://www.elastic.co/blog/found-elasticsearch-from-the-bottom-up) about how Elasticsearch works to get a better understanding. – ChintanShah25 Dec 09 '15 at 04:18
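
The force merge mentioned in the comments can be requested the same way as the delete. This is only a sketch: it assumes a version where the `_forcemerge` endpoint exists and uses its `only_expunge_deletes` parameter, which asks Elasticsearch to merge only segments that contain deleted documents. The function name `build_forcemerge` is hypothetical.

```python
import urllib.request


def build_forcemerge(base_url, index):
    """Build (but do not send) a force-merge request that expunges deletes."""
    return urllib.request.Request(
        url=f"{base_url}/{index}/_forcemerge?only_expunge_deletes=true",
        method="POST",
    )


# Example (needs a reachable cluster, so it is left commented out):
# with urllib.request.urlopen(build_forcemerge("http://localhost:9200",
#                                              "your_index")) as resp:
#     print(resp.read().decode("utf-8"))
```

A script containing the delete and this call could then be scheduled from cron, for example with an entry like `0 3 * * * python3 purge_old_docs.py`, where the script name is made up.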
12

Split the data into daily indices and use an alias with the old index name. Then delete the oldest index each day, just as Logstash does:

Daily indices: logstash-20151011, logstash-20151012, logstash-20151013.

Full Alias: logstash

Then delete the oldest index each day.
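
As a rough sketch of the daily cleanup, the Python function below picks out which daily indices have fallen outside a retention window. It assumes the `prefix + YYYYMMDD` naming used in the example above; the function name `expired_indices` and the `keep_days` parameter are invented for illustration. Each returned name could then be removed with a `DELETE /<index name>` request.

```python
from datetime import date, datetime, timedelta


def expired_indices(names, prefix, today, keep_days):
    """Return index names dated more than keep_days days before today."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in names:
        if not name.startswith(prefix):
            continue  # not one of our daily indices
        try:
            day = datetime.strptime(name[len(prefix):], "%Y%m%d").date()
        except ValueError:
            continue  # suffix is not a YYYYMMDD date, leave it alone
        if day < cutoff:
            expired.append(name)
    return sorted(expired)


names = ["logstash-20151011", "logstash-20151012", "logstash-20151013"]
print(expired_indices(names, "logstash-", date(2015, 10, 13), keep_days=1))
# prints ['logstash-20151011']
```

Dropping a whole index avoids the mark-as-deleted and merge cycle, because the index files are removed in one step.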

Ali Nikneshan
  • 2
    Using multiple indexes is the way to go. To delete the older indexes you could use curator: https://www.elastic.co/guide/en/elasticsearch/client/curator/current/index.html – slim Jul 01 '16 at 16:38
  • 1
    this is the best answer – Luc E Sep 03 '20 at 10:38
0

If you are using time-based indices, the delete should look something like this:

curl -XDELETE http://localhost:9200/test-2017-06
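
To avoid hard-coding the month, the index name can be computed. The sketch below assumes monthly indices named `prefix + YYYY-MM`, as in the command above; `monthly_index_name` and `months_back` are hypothetical names.

```python
from datetime import date


def monthly_index_name(prefix, today, months_back):
    """Return the monthly index name months_back months before today."""
    # Work in whole months so the subtraction rolls over year boundaries.
    total = today.year * 12 + (today.month - 1) - months_back
    year, month_index = divmod(total, 12)
    return f"{prefix}{year:04d}-{month_index + 1:02d}"


print(monthly_index_name("test-", date(2017, 9, 15), months_back=3))
# prints test-2017-06
```

The resulting name can be substituted into the `curl -XDELETE` command above.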
Atif Hussain