I want to count how many unique links each user has posted. Here is what I have come up with so far:

s.aggs.bucket('user_term', A('terms', field='user__id')).metric('url_count', A('value_count', field='link'))

However, I have not yet found a way to iterate through the result. Is there a way to do this?

Minh Triet

1 Answer

This will not give you a unique count, only the number of documents that have a value for that field. You want to use a `cardinality` aggregation instead:

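# `s` is the Search object from the question
s.aggs.bucket('users', 'terms', field='user.id').metric('url_count', 'cardinality', field='link')

r = s.execute()

# Each bucket is one user; url_count.value is the (approximate) number of distinct links
for user in r.aggregations.users.buckets:
    print(f'User {user.key} posted {user.url_count.value} links')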
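
To make the difference concrete, here is a small plain-Python sketch on made-up data, with no Elasticsearch involved. `value_count` behaves like counting every value, while `cardinality` behaves like counting distinct values. The user names and links are hypothetical, and the real `cardinality` aggregation is approximate, whereas `len(set(...))` here is exact.

```python
from collections import defaultdict

# Hypothetical sample documents as (user id, link) pairs; not data from the question.
docs = [
    ("alice", "https://a.example/1"),
    ("alice", "https://a.example/1"),  # the same link posted twice
    ("alice", "https://a.example/2"),
    ("bob", "https://b.example/1"),
]

# Group links by user, which is roughly what the terms bucket does.
links_by_user = defaultdict(list)
for user_id, link in docs:
    links_by_user[user_id].append(link)

# value_count-like: every value counts, duplicates included.
value_counts = {user: len(links) for user, links in links_by_user.items()}

# cardinality-like: only distinct values count.
unique_counts = {user: len(set(links)) for user, links in links_by_user.items()}

print(value_counts)
print(unique_counts)
```

For "alice" the two numbers differ because one link was posted twice, which is the case the question cares about.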

Hope this helps

Honza Král
  • Is it also possible to add a `cardinality` example to the documentation of `elasticsearch_dsl`? I could open a pull request if you don't mind – Minh Triet Sep 03 '19 at 07:33
  • Also, this only shows 10 results. I tried `s=[:]` but got `TransportError(502)`. The answer at https://stackoverflow.com/questions/47219846/elasticsearch-dsl-aggregations-returns-only-10-results-how-to-change-this does not seem to work – Minh Triet Sep 03 '19 at 08:36
  • The `terms` aggregation returns the top 10 buckets by default. You can specify `size=50` in the `bucket()` call, but it becomes more expensive for large numbers. If you want to get everything out, consider the `composite` aggregation; see this example: https://github.com/elastic/elasticsearch-dsl-py/blob/master/examples/composite_agg.py – Honza Král Sep 03 '19 at 11:58
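
The `composite` approach from the last comment can be sketched as follows. This is a minimal illustration, not the linked example: it builds the raw request body as a plain dict and pages through buckets using the `after` key. The names `composite_body`, `scan_users`, `execute`, and `fake_execute` are hypothetical; `execute` stands for whatever sends the body to the cluster and returns the response as a dict (for instance, a wrapper around the client's search call). The field names `user.id` and `link` are taken from the answer.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def composite_body(after: Optional[Dict[str, Any]] = None,
                   page_size: int = 100) -> Dict[str, Any]:
    """Build a request body: one composite bucket per user, with a
    cardinality sub-aggregation counting distinct links."""
    composite: Dict[str, Any] = {
        "size": page_size,
        "sources": [{"user": {"terms": {"field": "user.id"}}}],
    }
    if after is not None:
        # Resume after the last bucket key of the previous page.
        composite["after"] = after
    return {
        "size": 0,  # no hits needed, only aggregations
        "aggs": {
            "users": {
                "composite": composite,
                "aggs": {"url_count": {"cardinality": {"field": "link"}}},
            }
        },
    }


def scan_users(execute: Callable[[Dict[str, Any]], Dict[str, Any]],
               page_size: int = 100) -> Iterator[Dict[str, Any]]:
    """Yield every bucket, requesting pages until an empty one comes back."""
    after = None
    while True:
        agg = execute(composite_body(after, page_size))["aggregations"]["users"]
        buckets = agg["buckets"]
        if not buckets:
            return
        yield from buckets
        # Prefer after_key when present, else fall back to the last bucket's key.
        after = agg.get("after_key", buckets[-1]["key"])


# Stand-in for a real cluster (assumption): two users, then an empty page.
def fake_execute(body: Dict[str, Any]) -> Dict[str, Any]:
    if "after" in body["aggs"]["users"]["composite"]:
        return {"aggregations": {"users": {"buckets": []}}}
    return {"aggregations": {"users": {
        "after_key": {"user": 2},
        "buckets": [
            {"key": {"user": 1}, "doc_count": 3, "url_count": {"value": 2}},
            {"key": {"user": 2}, "doc_count": 1, "url_count": {"value": 1}},
        ],
    }}}


results = list(scan_users(fake_execute, page_size=2))
for bucket in results:
    print(f"User {bucket['key']['user']} posted {bucket['url_count']['value']} links")
```

Unlike raising `size` on a `terms` aggregation, each request here stays bounded by `page_size`, at the cost of several round trips.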