0

Using Elasticsearch curator, how do I delete all indices matching a pattern, except for the newest?

I tried using filtertype: age but it does not seem to do what I need.

Val
RFI

3 Answers

2

Here is example code you can use to delete indices older than 14 days, assuming your index names contain the date. More information is available at https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/curator.html

import boto3
import curator
from elasticsearch import Elasticsearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth

esEndPoint = ES_HOST # Placeholder: replace with your Elasticsearch host.
region = REGION # Placeholder: replace with the region of the Elasticsearch domain.
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)

def lambda_handler(event, context):
    esClient = connectES(esEndPoint)
    index_list = curator.IndexList(esClient)
    index_list.filter_by_age(source='name', direction='older', timestring='%Y.%m.%d', unit='days', unit_count=14)
    print(index_list.indices)
    if index_list.indices:
        curator.DeleteIndices(index_list).do_action() # Delete the indices

def connectES(esEndPoint): 
    # Function used to connect to ES
    try:
        es = Elasticsearch(
            hosts=[{'host': esEndPoint, 'port': 443}],
            http_auth=awsauth,
            use_ssl=True,
            verify_certs=True,
            connection_class=RequestsHttpConnection
        )
        return es
    except Exception as E:
        print("Unable to connect to {0}".format(esEndPoint))
        print(E)
Trilok Nagvenkar
1

You need two filters: pattern (to match the indexes you want to delete) and age (to specify the age of the indexes to delete). To always keep at least one index, add a count filter as well.

For instance, the Curator configuration below deletes

  • indexes named example_dev_*
  • that are older than 10 days
  • except for one index, which the count filter keeps

Configuration:

actions:
  1:
    action: delete_indices
    description: >-
      Delete indices older than 10 days (based on creation date), for example_dev_
      prefixed indices, keeping one index.
    options:
      ignore_empty_list: True
      disable_action: True  # Curator skips this action while this is True; set to False to run it
    filters:
    - filtertype: pattern
      kind: prefix
      value: example_dev_
    - filtertype: age
      source: creation_date
      direction: older
      unit: days
      unit_count: 10
    - filtertype: count
      count: 1

Adapt the filter conditions to your needs; this should achieve what you expect.
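To see how the three filters combine, here is a rough simulation in plain Python. It is not Curator's code, only a sketch of the filter chain under stated assumptions: the function name, the sample index names and the dates are all made up, and the count step is ordered by creation date (in Curator that corresponds to adding `use_age: True` and `source: creation_date` to the count filter; as far as I know, without `use_age` the count filter sorts by index name instead).

```python
from datetime import datetime, timedelta


def select_for_deletion(indices, prefix, older_than_days, keep, now):
    """Simulate pattern -> age -> count. `indices` maps name -> creation datetime."""
    # pattern filter: only names starting with the prefix stay in the list
    actionable = {n: c for n, c in indices.items() if n.startswith(prefix)}
    # age filter: only indices created before the cutoff stay in the list
    cutoff = now - timedelta(days=older_than_days)
    actionable = {n: c for n, c in actionable.items() if c < cutoff}
    # count filter (by age): the newest `keep` indices are taken off the list
    newest_first = sorted(actionable, key=actionable.get, reverse=True)
    return sorted(newest_first[keep:])


# Hypothetical data: version numbers deliberately do not follow creation order.
now = datetime(2019, 9, 19)
indices = {
    "example_dev_1_v1": now - timedelta(days=40),
    "example_dev_1_v5": now - timedelta(days=30),
    "example_dev_1_v3": now - timedelta(days=12),
    "other_index": now - timedelta(days=90),
}
to_delete = select_for_deletion(indices, "example_dev_1_v", 10, 1, now)
print(to_delete)
```

Note that the count step only sees what the earlier filters left behind, so the order of the filters changes which index ends up being kept. With a single matching index, the simulated list comes back empty and nothing would be deleted.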

Val
  • What if the index fits the pattern and also the age, but is the only index? In that case I would not want it deleted. – RFI Sep 19 '19 at 13:53
  • How are your indexes named? If you have all your data in a single index but only want to delete old data from that index, then Curator won't help you; you need a solution like this one instead: https://stackoverflow.com/a/57975075/4604579 – Val Sep 19 '19 at 13:55
  • Indexes are named: example_dev_1_v{increment}, for example: example_dev_1_v1, example_dev_1_v5, example_dev_2_v1, example_dev_2_v2 – RFI Sep 19 '19 at 13:57
  • The question is do you need to always delete entire indexes, or sometimes only part of the index? – Val Sep 19 '19 at 13:59
  • I need to delete the entire index that fits pattern: example_dev_1_v... except for the latest version. Note that the value after the `v` may or may not be the highest value for the latest version. Is it not possible to use Curator to delete an index based on the creation date, except for the latest one? – RFI Sep 19 '19 at 14:01
  • Based on creation date, certainly... I've updated my answer accordingly – Val Sep 19 '19 at 14:05
  • Thanks for the help. To be clear, if only one index remains and that index fits the pattern AND is older than 10 days, it would not be deleted? – RFI Sep 19 '19 at 14:07
  • The above configuration deletes any index that matches the pattern AND is older than 10 days. Even if a single index remains. It seems that what you're after is "deleting data" that is older than 10 days, not "deleting indexes" – Val Sep 19 '19 at 14:08
  • In that case, do I also need to add `count` with a value of `1` since I want to keep at least one index, even if it matched the pattern and age? – RFI Sep 19 '19 at 14:13
  • Indeed a `count` filter would do the trick. Updated my answer – Val Sep 19 '19 at 14:18
  • You can easily test the configuration by using the `--dry-run` switch, nothing will happen to your indices, Curator will only simulate the actions. – Val Sep 19 '19 at 14:26
  • Thanks for `--dry-run`, it seems your answer solves my needs. Thanks for the help. – RFI Sep 19 '19 at 14:46
0

I suggest using the count filter after the pattern filter. Be sure to play with exclude true/false and dry-runs until it does what you expect.
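With the Curator Python API, pattern followed by count might look like the sketch below. It assumes the Curator 5.x `IndexList` methods `filter_by_regex` and `filter_by_count`; the helper name `narrow_to_all_but_newest` and the `esClient` variable in the usage comment are made up for illustration. Curator may raise its `NoIndices` exception when a filter runs on an empty list, so a real script would probably want to handle that.

```python
def narrow_to_all_but_newest(index_list, prefix, keep=1):
    """Leave in index_list every index starting with prefix, except the newest `keep`.

    index_list is expected to be a curator.IndexList (Curator 5.x API assumed).
    """
    # Pattern filter first: only indices whose names start with the prefix remain.
    index_list.filter_by_regex(kind="prefix", value=prefix)
    # Count filter second: ordered by creation date, the newest `keep` indices are
    # taken off the actionable list (exclude defaults to True for this filter).
    index_list.filter_by_count(count=keep, use_age=True, source="creation_date")
    return index_list.indices


# Hypothetical usage with a real client (not executed here):
#   import curator
#   index_list = curator.IndexList(esClient)
#   if narrow_to_all_but_newest(index_list, "example_dev_1_v"):
#       curator.DeleteIndices(index_list).do_dry_run()  # use do_action() once the log looks right
```

Ordering by creation date rather than by name matters here because the version suffix is not guaranteed to be highest on the latest index.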

untergeek