
Is it possible to reduce the number of shards in the Elasticsearch search engine once the index has been created?

I tried:

$ curl -XPUT 'localhost:9200/myindex/_settings' -d '{"index" : {"number_of_shards" : 3}}'

But it gives an error:

{"error":"ElasticsearchIllegalArgumentException[can't change the number of shards for an index]","status":400}
Fedir RYKHTIK

3 Answers


This is no longer true: since 5.x you can shrink an index down to a factor of its shard count. For example, from 12 shards you could go down to 1, 2, 3 or 6 (see the docs). But you must first put the index into read-only mode, and naturally the shrinking process requires a lot of IO.
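As a rough sketch of how the shrink API is used (index and node names here are illustrative, not from the question):

```shell
# 1. Block writes and relocate a copy of every shard onto a single node
#    (both are preconditions for shrinking)
curl -XPUT 'localhost:9200/myindex/_settings' -d '{
  "index.blocks.write": true,
  "index.routing.allocation.require._name": "shrink-node-1"
}'

# 2. Shrink into a new index whose shard count divides the original count
curl -XPOST 'localhost:9200/myindex/_shrink/myindex-shrunk' -d '{
  "settings": { "index.number_of_shards": 1 }
}'
```

Once the shrunken index is green you can point your aliases at it and delete the original.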

NikoNyrh

No, it's not possible. You can change many index settings - e.g. the number of replicas per shard - but not the number of shards.

For more information, take a look here: http://www.elastic.co/guide/en/elasticsearch/reference/1.5/indices-update-settings.html
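For contrast, a dynamic setting such as the replica count can be updated on a live index with the same `_settings` endpoint the question used (the index name is just an example):

```shell
# Raising the replica count is allowed at any time;
# only the primary shard count is fixed at index creation
curl -XPUT 'localhost:9200/myindex/_settings' -d '{
  "index": { "number_of_replicas": 2 }
}'
```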

Mysterion

OK. Like @Mysterion said, it's not possible to change the number of shards directly with a settings update. But there is a way around it.

You'll need to reindex your old index into a new index created with the desired number of shards (so, as I said, no zero downtime).

For that you can use the Scroll Search API:

While a search request returns a single “page” of results, the scroll API can be used to retrieve large numbers of results (or even all results) from a single search request, in much the same way as you would use a cursor on a traditional database.

Scrolling is not intended for real time user requests, but rather for processing large amounts of data, e.g. in order to reindex the contents of one index into a new index with a different configuration.

Client support for scrolling and reindexing: some of the officially supported clients provide helpers to assist with scrolled searches and with reindexing documents from one index to another:

Perl See Search::Elasticsearch::Bulk and Search::Elasticsearch::Scroll

Python See elasticsearch.helpers.*

For more information about the Scroll Search API, I suggest the official documentation.
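The raw HTTP flow looks roughly like this (the `old_index` name is an example, and `search_type=scan` applies to the 1.x-era API the answer references):

```shell
# Open a scrolled search; the cursor stays alive for 1 minute per round trip
curl -XGET 'localhost:9200/old_index/_search?scroll=1m&search_type=scan&size=100' \
  -d '{"query": {"match_all": {}}}'

# Repeat with the _scroll_id returned by each response until no hits remain,
# bulk-indexing every batch into the new index as you go
curl -XGET 'localhost:9200/_search/scroll?scroll=1m' -d '<_scroll_id from previous response>'
```

In practice you would let one of the client helpers above (e.g. `elasticsearch.helpers` in Python) drive this loop rather than scripting curl by hand.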

And you might also want to take a look at this answer here, maybe it can also give you some ideas in case you are using Java.

eliasah
  • You can use https://github.com/taskrabbit/elasticsearch-dump to copy the data to a new index with the correct number of shards and then remove the old one. That program makes it easier than using the scroll search API directly. – higuita Mar 16 '17 at 18:44
  • It's a very good project, but when you have tens of millions of documents in your index, it's extremely slow. I've benchmarked it against an optimized scan and scroll with the official Python API, and the latter runs at least 10 times faster. Thus I'll let you judge ;-) – eliasah Mar 16 '17 at 19:00
  • You need to increase the batch size; the default of 100 makes it slower. Also, my first run took about 2 hours and the second run took 35 min, so index caches make a huge difference. – higuita Mar 18 '17 at 13:50