4

How can I get the total number of documents matching a given query? I have used the query below:

result = solr.search('ad_id : 20')
print(len(result))

Since the default number of rows returned is 10, the output is only 10, but the actual count is 4000. How can I get the total count?

Manish Ojha

5 Answers

4

The results object from pysolr has a hits property that contains the total number of hits, regardless of how many documents are returned. This value is named numFound in the raw response from Solr.

Your solution (setting a very large rows value) isn't really suitable for a larger dataset, since it requires you to retrieve all the documents even if you don't need them or don't want to show their content.
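To illustrate the difference between the total hit count and the number of returned documents, here is a minimal sketch that needs no running Solr server. The response body is a hand-written, hypothetical sample that mirrors the shape of Solr's JSON output; the values in it are assumptions taken from the question, not real output.

```python
import json

# Hypothetical, hand-written sample of a Solr JSON response (not real output).
# Only the default page of 10 documents is included, as Solr would return.
sample_body = json.dumps({
    "responseHeader": {"status": 0, "QTime": 1},
    "response": {
        "numFound": 4000,
        "start": 0,
        "docs": [{"id": str(i), "ad_id": 20} for i in range(10)],
    },
})

raw_response = json.loads(sample_body)

# Total number of matching documents; pysolr exposes this as result.hits.
total_hits = raw_response["response"]["numFound"]

# Number of documents actually returned; this is what len(result) reports.
returned_docs = len(raw_response["response"]["docs"])

print(total_hits, returned_docs)
```

The two numbers differ whenever the query matches more documents than the rows value, which is why len(result) cannot be used as a count.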

MatsLindh
4

The count is stored in the numFound field of the raw response. Use the code below:

result = solr.search('ad_id : 20')
print(result.raw_response['response']['numFound'])
  • How does this differ from the 'hits' property available on the result? There is no need to use the raw response. – MatsLindh Dec 09 '17 at 20:36
1

As @MatsLindh mentioned -

result = solr.search('ad_id : 20')
print(result.hits)
zx485
1

If you just want the total number of items that satisfy your query, here is my Python3 code (using the pysolr module):

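    import pysolr

    SOLR_HOST = "localhost:8983"  # assumption: replace with your own Solr host and port

    def count_documents(query="*:*", collection="bookindex"):  # or whatever your collection is called
        solr_url = f"http://{SOLR_HOST}/solr/{collection}"
        solr = pysolr.Solr(url=solr_url, timeout=120, always_commit=True)
        result = solr.search(query, rows=0)  # rows=0: fetch only the metadata, no documents
        return result.hits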

This queries for all documents ("*:*"), 315913 in my case, but you can narrow the query to suit your requirements. For example, if I want to know how many of my book entries have title:pandas, I can call search("title:pandas", rows=0) and get 41 as the number of entries with pandas in the title. Setting rows=0 tells Solr that it need not return any documents, only the meta information, which is much more efficient than setting a high limit on rows.

Steve L
0

Finally got the answer:

I added rows=1000000 as an argument to the search call:

result = solr.search('ad_id : 20', rows=1000000)

If the number of matching documents is greater than this, the value of rows has to be increased. This might be a bad solution, but it works. If anyone has a better solution, please reply.

Manish Ojha