
I have read http://solr.pl/en/2011/12/19/do-i-have-to-look-for-maxbooleanclauses-when-using-filters/ and the question "Too many boolean clauses exception in solr".

My Solr index has about 2 million documents. I can retrieve specific documents by setting the query or filter query to the IDs of those documents. This lets me compute facets and clusters over just those documents. The query I set is:

id:1234567 or id:1234567 or id:2345678 ...

However, when I have, say, 200 specific documents, Solr complains that my query has too many boolean clauses. Should I simply increase maxBooleanClauses, or is there a better approach for this kind of query?

user3390906
  • How dynamic is the list of ids? Can it be pregenerated (meaning, can a document be tagged with "belongs to xyz") when the documents are inserted? How often does the group ownership change (meaning, can you reindex the necessary documents when the ownership changes)? – MatsLindh May 04 '15 at 09:04
  • The list of ids is very dynamic. In one second I may be interested in these 1000 documents; in the next second I may be interested in another 1000. I appreciate the tagging approach, but several concurrent searches may be running, which makes it difficult. – user3390906 May 04 '15 at 10:18
  • Is the index distributed, or does it live on one server? You could try creating a temporary collection, index the ids as separate documents to that collection and then join against the original collection to retrieve the documents. – MatsLindh May 04 '15 at 13:22
  • The index is distributed across 4 cores. The collection idea is interesting. I wonder how quickly a core can be created and processed. I want everything to be completed within 3 seconds. – user3390906 May 05 '15 at 03:15
  • Distributed indices aren't joinable (as both cores have to live on the same server), so the easiest solution is probably to increase the number of boolean clauses. :-| – MatsLindh May 05 '15 at 07:33

1 Answer

I had the same "too many boolean clauses" exception in Solr. You have document IDs; I had ACL (access control list) IDs.

The steps I took to overcome the issue:

  1. Increase maxBooleanClauses.
  2. Put the IDs in a filter query: &fq=acl:(id1 OR id2 ...)
  3. If there are more than 8000 ACL IDs, send multiple requests using the logic below.

    // APPEND_OR is assumed to be the 4-character separator " OR ".
    StringBuilder queryString = new StringBuilder(5000);
    if (acls.length > 8000) {
        int aclCounter = 0;
        // Number of requests needed at 8000 ACL IDs per request.
        long requestCnt = Math.round(Math.ceil(acls.length / 8000.0));
        for (int i = 0; i < requestCnt; i++) {
            queryString = new StringBuilder(acls.length * 10);
            queryString.append("&fq=acl:(");
            for (int j = 0; j < 8000 && aclCounter < acls.length; aclCounter++, j++) {
                queryString.append(acls[aclCounter]).append(APPEND_OR);
            }
            // Replace the trailing " OR " with the closing parenthesis.
            queryString.replace(queryString.length() - 4, queryString.length(), ")");
            // build the Solr request and send it to get the response
        }
    }
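
For reference, here is a self-contained sketch of the same batching idea that can be compiled and run on its own. The class name `AclFilterBatcher`, the method `buildFilterQueries`, and the `" OR "` separator are hypothetical names chosen for illustration, not part of Solr or of my original code. It only builds the filter-query values; URL-encoding them, sending the requests, and merging the responses are left to the caller, and the IDs are assumed not to need query escaping.

```java
import java.util.ArrayList;
import java.util.List;

public class AclFilterBatcher {

    // Assumed separator; mirrors what APPEND_OR appears to hold in the snippet above.
    private static final String OR = " OR ";

    /**
     * Splits ids into groups of at most batchSize and returns one
     * "field:(a OR b OR ...)" filter-query value per group.
     */
    public static List<String> buildFilterQueries(String field, String[] ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> filters = new ArrayList<>();
        for (int start = 0; start < ids.length; start += batchSize) {
            int end = Math.min(start + batchSize, ids.length);
            StringBuilder fq = new StringBuilder(field).append(":(");
            for (int i = start; i < end; i++) {
                if (i > start) {
                    fq.append(OR); // separator only between ids, so nothing to trim afterwards
                }
                fq.append(ids[i]);
            }
            filters.add(fq.append(')').toString());
        }
        return filters;
    }

    public static void main(String[] args) {
        // Tiny batch size so the splitting is visible; the answer uses 8000.
        String[] acls = {"a1", "a2", "a3", "a4", "a5"};
        for (String fq : buildFilterQueries("acl", acls, 2)) {
            System.out.println("fq=" + fq);
        }
    }
}
```

Unlike the snippet above, this version also handles lists at or below the batch size (they yield a single filter) and an empty list (no filters). Keep the batch size at or below the configured maxBooleanClauses value.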

Abhijit Bashetti