I'm running Apache Nutch, which seems to work: in small runs it indexes documents and commits to Solr at the end of the run.
Unfortunately, I want to index deep within some large sites, and Nutch won't commit until the end of a run.
This causes obvious problems when 100k+ documents stack up waiting to be committed: pressure on memory, a long wait before the data is available, etc.
Is there a way to persuade Nutch to commit more frequently?