
Related to the question "How to select distinct field values using Solr?", but what I want is the count.

I want to know how many distinct elements there are in a Solr field. I could get this number by using:

group.field=my_field&group.ngroups=true&group.limit=0

but grouping just for this seems like overkill.
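For reference, here is a rough Python sketch of the grouping approach above. It only builds the request parameters and reads `ngroups` out of a response. The `group=true` switch and `wt=json` are additions on my part (grouping has to be enabled explicitly), the helper names are hypothetical, and the sample response is made up to mirror the usual shape of a grouped JSON response.

```python
from urllib.parse import urlencode


def ngroups_params(field, query="*:*"):
    """Build query parameters that ask Solr for the number of groups only."""
    return {
        "q": query,
        "group": "true",           # assumption: grouping must be switched on explicitly
        "group.field": field,
        "group.ngroups": "true",   # include the number of distinct groups
        "group.limit": 0,          # return no documents inside each group
        "wt": "json",
    }


def distinct_count_from_grouping(response, field):
    """Read the distinct-value count from a parsed grouped JSON response."""
    return response["grouped"][field]["ngroups"]


# Made-up sample response, for illustration only.
sample = {"grouped": {"my_field": {"matches": 10, "ngroups": 4, "groups": []}}}

print(urlencode(ngroups_params("my_field")))
print(distinct_count_from_grouping(sample, "my_field"))
```

The encoded string would be appended to the select handler URL of whatever core is being queried.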

Is there another way? Do I have to use JIRA SOLR-1814?

Bob Yoplait
  • See my answer to the related question: http://stackoverflow.com/a/26714447/621690 Use the StatsComponent to retrieve a list of distinct values for a certain field. – Risadinha Nov 03 '14 at 12:36

2 Answers

1

If you are looking for the unique values in a field, you can facet on that field, provided it uses the string field type. This returns all the unique values for the field (along with their counts, which may or may not be relevant to you).

The patch at https://issues.apache.org/jira/browse/SOLR-2242 will help you get the count directly. If you can't use the patch, you will probably need to retrieve all the values for the facet field and count them yourself.
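As a minimal sketch of the "count them yourself" route, assuming the response is requested as JSON in the default flat layout (a list alternating value, count), something like the following Python could do the counting on the client. The helper names and the sample response are hypothetical. `facet.limit=-1` asks for all values and `facet.mincount=1` skips terms that match no document in the current result set.

```python
def facet_params(field, query="*:*"):
    """Build query parameters that return every facet value of `field`."""
    return {
        "q": query,
        "rows": 0,               # no documents needed, only the facet counts
        "facet": "true",
        "facet.field": field,
        "facet.limit": -1,       # no cap on the number of facet values
        "facet.mincount": 1,     # drop values with a zero count
        "wt": "json",
    }


def count_distinct_from_facets(response, field):
    """Count distinct values in a flat [value, count, value, count, ...] list."""
    flat = response["facet_counts"]["facet_fields"][field]
    counts = flat[1::2]          # every second entry is a count
    return sum(1 for c in counts if c > 0)


# Made-up sample response, for illustration only.
sample = {
    "facet_counts": {
        "facet_fields": {"my_field": ["a", 3, "b", 1, "c", 2]}
    }
}

print(count_distinct_from_facets(sample, "my_field"))
```

Keep in mind that this transfers every distinct value to the client, so it may be costly for high-cardinality fields.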

Jayendra
  • OK, that's option #3, which implies faceting computation. Option #1 (group.ngroups) implies grouping computation. Option #2 (SOLR-1814) doesn't incur as much computation, if my understanding is correct. Do you agree? – Bob Yoplait Sep 17 '11 at 15:16
  • It's correct for options #1 and #3. #1 would have caching benefits. I haven't checked SOLR-1814 in detail, but it seems to be the same feature. You may want to check the patch and how well it fits your search and performance needs. – Jayendra Sep 17 '11 at 17:48
0

Solr 3.3 and newer already support group faceting.

So if you simply apply a facet.query it should return the number of rows for your query. But I don't know of any other way to count your groups if you want to make such a query. I doubt that is possible any other way.

  • Which facet.query are you thinking of? If I do facet.query=my_field:[* TO *], it returns a count, but not of distinct values; it is the total number of rows. – Bob Yoplait Sep 17 '11 at 15:01