From what I've read about sharding in Solr, when an index becomes too large to fit on a single system, it can be split into multiple shards. But how much is too large? How much indexed data is too much before sharding should be considered in Solr?
1 Answer
To me, "too large" means one or a combination of the following:
- the index does not fit onto the disk (hint: keep in mind the excess capacity needed during optimization/segment merges)
- the index does not fit into RAM
- searches are too slow, even after warming the caches up
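The first two bullets can be turned into a rough back-of-envelope check. A minimal sketch below, where the function name, parameters, and the 2x merge-headroom multiplier are my own illustrative assumptions, not official Solr limits:

```python
# Rule-of-thumb check for whether a single-node index is "too large".
# All thresholds here are illustrative assumptions, not Solr limits.

def needs_sharding(index_size_gb, disk_free_gb, ram_gb,
                   merge_headroom=2.0):
    """Return True if any rule-of-thumb limit is exceeded.

    merge_headroom: assume an optimize/segment merge can temporarily
    need roughly this multiple of the index size on disk.
    """
    if index_size_gb * merge_headroom > disk_free_gb:
        return True   # not enough disk once merge headroom is counted
    if index_size_gb > ram_gb:
        return True   # index no longer fits in the OS page cache
    return False

# 50 GB index, 80 GB free disk: 50 * 2.0 = 100 GB > 80 GB -> True
print(needs_sharding(index_size_gb=50, disk_free_gb=80, ram_gb=64))
```

The RAM comparison reflects the third bullet indirectly: once the hot parts of the index fall out of the page cache, searches slow down even after warm-up.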
Comments:

- Hi mindas! Appreciate your response! For the first bullet, could you elaborate on "excess capacity for optimization"? – sunskin Feb 04 '14 at 15:03
- When a document is deleted from the index, it stays on disk until a segment merge happens. When merging, Lucene allocates extra disk space for the new segment, merges the data in, and deletes the old segments afterwards. Merge behaviour has changed over time; read this for more details: http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html – mindas Feb 04 '14 at 15:09
- Thank you! Is it the same as Solr's optimization of indexed data? – sunskin Feb 04 '14 at 15:12
- Yes. Solr uses Lucene under the covers. – mindas Feb 04 '14 at 15:12
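The point about deleted documents lingering until a merge can be quantified from stats Solr itself reports: `numDocs` (live documents) versus `maxDoc` (live plus deleted). A small sketch, where the helper function is my own illustration:

```python
# Illustrative only: deleted documents still occupy disk until a
# segment merge reclaims them. Solr's core stats expose numDocs
# (live docs) and maxDoc (live + deleted); the gap is dead weight.

def reclaimable_fraction(num_docs, max_doc):
    """Fraction of the index held by deleted-but-unmerged documents."""
    if max_doc == 0:
        return 0.0
    return (max_doc - num_docs) / max_doc

# e.g. 800k live docs out of 1M total slots -> 20% reclaimable
print(reclaimable_fraction(800_000, 1_000_000))
```

A large reclaimable fraction is one reason an index can look "too large" on disk while holding far less live data; a merge (or, in older terminology, an optimize) shrinks it back.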