My application has low write throughput, and I can tolerate a delay of 2-3 minutes before changes are reflected in Solr search results.
Currently I issue commits from my indexing application (after every batch of documents) and also have the following configured on the Solr side:
solr.autoSoftCommit.maxTime : -1 (disabling auto soft commit)
solr.autoCommit.maxTime : 300000 (5 mins of hard auto commit interval)
openSearcher : false
My reasons for choosing this configuration come from my understanding of the following:
- My application is read-heavy and depends on a large amount of caching, so I can't afford to have my caches flushed. That is why I've disabled soft commits altogether.
- I've set openSearcher to false because otherwise each hard commit would invalidate the top-level caches, which isn't desirable.
In production, I've observed that as soon as my application indexes even one document (or a batch) and then issues a commit (from my application), all my top-level caches get expunged.
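For context, here is a minimal sketch of the kinds of update requests involved, assuming the indexer talks to Solr's update handler over plain HTTP. The base URL and collection name are placeholders, and `update_url` is a hypothetical helper of mine; `commit`, `openSearcher`, and `commitWithin` are the request parameters as I understand them from the Solr documentation. The sketch only builds the URLs and sends nothing, and I have not verified which of these my client library actually sends.

```python
from urllib.parse import urlencode

# Hypothetical base URL: host, port, and collection name are placeholders.
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycollection/update"


def update_url(**params: str) -> str:
    """Return the update-handler URL with the given query parameters."""
    return f"{SOLR_UPDATE_URL}?{urlencode(params)}"


# Explicit hard commit with no further options (what I assume my app sends).
explicit_commit = update_url(commit="true")

# The same hard commit, but asking Solr not to open a new searcher.
commit_no_searcher = update_url(commit="true", openSearcher="false")

# No explicit commit: ask Solr to commit within 120 s (value is in milliseconds).
commit_within = update_url(commitWithin="120000")

print(explicit_commit)
print(commit_no_searcher)
print(commit_within)
```

The 120000 ms value is only an example that fits my 2-3 minute tolerance.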
I thought that relying only on hard auto commits might help, but according to this Stack Overflow link:
Hard commits are about durability, soft commits are about visibility. There are really two flavors here, openSearcher=true and openSearcher=false. First we’ll talk about what happens in both cases. If openSearcher=true or openSearcher=false, the following consequences are most important:
- The tlog is truncated: A new tlog is started.
- Old tlogs will be deleted if there are more than 100 documents in newer, closed tlogs.
- The current index segment is closed and flushed.
- Background segment merges may be initiated.
The above happens on all hard commits. That leaves the openSearcher setting:
openSearcher=true: The Solr/Lucene searchers are re-opened and all caches are invalidated. Autowarming is done etc. This used to be the only way you could see newly-added documents.
openSearcher=false: Nothing further happens other than the four points above. To search the docs, a soft commit is necessary.
So, to sum up: a soft commit will flush the caches, and so will an auto hard commit with openSearcher=true, while an auto hard commit with openSearcher=false will not make the changes I added visible to searches.
Please point out anything I've misunderstood.
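To make that summary concrete, here is my understanding written out as a small Python model. It calls nothing in Solr; `CommitEffect` and `commit_effect` are names I made up, and the logic only encodes the quoted description above, so it may be wrong wherever my reading is wrong.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommitEffect:
    flushes_to_disk: bool     # segment closed and flushed, new tlog started
    opens_searcher: bool      # newly added documents become searchable
    invalidates_caches: bool  # top-level caches are discarded


def commit_effect(hard: bool, open_searcher: bool = True) -> CommitEffect:
    """Model of what each commit flavor does, per the quoted explanation."""
    if hard:
        # A hard commit always flushes; visibility and cache invalidation
        # follow the openSearcher setting.
        return CommitEffect(True, open_searcher, open_searcher)
    # A soft commit is about visibility only: it opens a new searcher
    # (and so drops the caches) without the durability work.
    return CommitEffect(False, True, True)


for label, effect in [
    ("soft commit", commit_effect(hard=False)),
    ("hard commit, openSearcher=true", commit_effect(hard=True, open_searcher=True)),
    ("hard commit, openSearcher=false", commit_effect(hard=True, open_searcher=False)),
]:
    print(label, effect)
```

In this model there is no row where documents become visible and the caches survive, which is exactly what my questions below are about.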
Here are my questions:
- Is there no way to keep the top-level filter caches from being expunged when documents are added to the index, while still making those changes searchable?
- If not, do I always have to rely on cache warmup to get entries back into the caches?
- Are there approaches other than warmup that people usually use to avoid this when they want fast search along with some write throughput?
I've read several documentation pages and articles, but I couldn't find one that properly explains which settings to use in which scenarios. It would be really helpful if someone could explain what I'm doing wrong and guide me to a proper solution.