I have an application that creates a rather large Solr 3.6 index each day: approx. 300GB, with 1B documents divided across 10 cores. Indexing works great, and I'm using a round-robin algorithm to distribute the docs evenly between the cores; a stripped-down sketch of that logic follows below.
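To show what I mean, here is roughly what the distribution looks like as SolrJ code (the URL, core names, and counter handling are placeholders, not my actual setup):

```java
import java.io.IOException;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class RoundRobinIndexer {
    private static final int NUM_CORES = 10;
    private final SolrServer[] cores = new SolrServer[NUM_CORES];
    private long counter = 0;

    public RoundRobinIndexer(String baseUrl) {
        // One SolrJ client per core, e.g. http://host:8983/solr/core0 ... core9
        for (int i = 0; i < NUM_CORES; i++) {
            cores[i] = new HttpSolrServer(baseUrl + "/core" + i);
        }
    }

    public void index(SolrInputDocument doc) throws SolrServerException, IOException {
        // Round-robin: each doc goes to the next core in turn,
        // so all 10 cores stay roughly the same size
        cores[(int) (counter++ % NUM_CORES)].add(doc);
    }
}
```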
Searches also work great, right up to the point where the result set exceeds roughly 100K documents. Past that, I get a Java error back: either an OutOfMemoryError or a SolrException: parsing error.
My searches are simple: no wildcards, no sorting, no faceted search. Yet Solr seems to buffer the entire result set before returning any of it. A simplified version of the kind of query that fails is shown below.
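For illustration, this is roughly what a failing query looks like in SolrJ (the URL, core, and field names are placeholders):

```java
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

public class BigResultQuery {
    public static void main(String[] args) throws SolrServerException {
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr/core0");

        // A plain term query: no wildcards, no sort, no facet params
        SolrQuery q = new SolrQuery("somefield:somevalue");
        q.setRows(100000); // ask for the entire match set in one response

        // The whole response is materialized in memory before I can
        // touch a single document; this call is where the errors occur
        QueryResponse rsp = server.query(q);
        for (SolrDocument doc : rsp.getResults()) {
            System.out.println(doc.getFieldValue("id"));
        }
    }
}
```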
The server has 256GB of physical memory and runs Solaris 10. I'm using the default Java in 32-bit, but have also tried Java 7 in both 32-bit and 64-bit. With 64-bit Java I can raise the max heap with the -Xmx option far enough to return 1M+ documents, but that requires practically all the memory I have for just a single Solr process.
Other than redesigning my application around hundreds of tiny indexes, does anyone have suggestions for getting large search result sets out of Solr without huge amounts of RAM?