
I need to know the number of documents in an Elasticsearch index. Using the Search API, I can read the count from the hits.total part of the response:

"total": {
  "value": 467,
  "relation": "eq"
}

By default, this count is capped at 10,000 hits, so for larger result sets I get only a lower bound:

"total": {
  "value": 10000,
  "relation": "gte"
}
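To make the difference between the two responses concrete, here is a minimal sketch of reading hits.total on the client side. The helper name read_total and the trimmed-down response dictionaries are hypothetical, made up for illustration; only the "value" and "relation" fields come from the responses shown above.

```python
def read_total(response):
    """Return (count, is_exact) from a Search API response dict.

    "relation" is "eq" when the count is exact and "gte" when the
    reported value is only a lower bound.
    """
    total = response["hits"]["total"]
    return total["value"], total["relation"] == "eq"


# Trimmed-down sample responses mirroring the two snippets above.
small = {"hits": {"total": {"value": 467, "relation": "eq"}}}
capped = {"hits": {"total": {"value": 10000, "relation": "gte"}}}

print(read_total(small))   # (467, True)
print(read_total(capped))  # (10000, False)
```

When is_exact is False, the real count is at least the reported value, which is the case the question is about.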

I know two ways to get the exact count:

  1. Set track_total_hits=true on the Search API request. According to the documentation, this comes at a cost.

  2. Make a separate call to the Count API.

Which solution is better? With the first option, I make only a single HTTP call to Elasticsearch. With the second, I need two HTTP calls. Is the Count API significantly faster than setting the track_total_hits=true flag?

tommy

1 Answer

track_total_hits is an optimization that doesn't affect queries that match fewer than 10K docs. In other words, it wouldn't make your query, which returns 467 docs, any faster.

Now, as suggested in this answer, _count should be faster because there's no ranking and no expensive _source retrieval. So go with _count if you care about speed at every turn.

But keep in mind that _count doesn't support the from, size, or aggregations parameters; it accepts only a query in the request body or a Lucene query string in the q URL parameter.
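As a rough sketch of how the two options differ on the wire, the snippet below only builds the requests and does not send them. The function names, the index name my-index, and the match_all query are assumptions made up for illustration; the endpoints (_search and _count) and the track_total_hits body field are the ones discussed above.

```python
import json


def search_request(index, query, size=10):
    """Option 1: a single _search call that also tracks the exact total."""
    body = {"query": query, "size": size, "track_total_hits": True}
    return "POST", f"/{index}/_search", json.dumps(body)


def count_request(index, query):
    """Option 2: a separate _count call; the body carries only the query."""
    body = {"query": query}
    return "POST", f"/{index}/_count", json.dumps(body)


# Hypothetical index name and query, for illustration only.
query = {"match_all": {}}
for method, path, body in (
    search_request("my-index", query),
    count_request("my-index", query),
):
    print(method, path, body)
```

The _search response reports the number under hits.total, while the _count response returns it in a top-level count field, so the client code reading the result differs slightly between the two options.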

Joe - GMapsBook.com