5

I need to get the number of documents in an index: not the documents themselves, just the count.

What's the best way to do that?

There is https://www.elastic.co/guide/en/elasticsearch/reference/current/search-count.html, but I'm looking to do this in Java.

There is also https://www.elastic.co/guide/en/elasticsearch/client/java-api/2.4/count.html, but it seems very outdated.

I could fetch all the documents in the index and count them, but there must be a better way.

Yu Hao
ash__999

6 Answers

15

Use the search API, but set it to return no documents and retrieve the count of hits from the SearchResponse object it returns.

For example:

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHits;

SearchResponse response = client.prepareSearch("your_index_goes_here")
    .setTypes("YourTypeGoesHere")
    .setQuery(QueryBuilders.termQuery("some_field", "some_value"))
    .setSize(0) // Don't return any documents, we don't need them.
    .get();

SearchHits hits = response.getHits();
long hitsCount = hits.getTotalHits();
evanjd
  • what is the type of `client`? – 14wml Jul 29 '18 at 23:21
  • @14wml TransportClient. See here: https://www.elastic.co/guide/en/elasticsearch/client/java-api/6.3/transport-client.html – evanjd Jul 30 '18 at 03:03
  • You might also want to take a look at this https://stackoverflow.com/questions/35252156/getting-count-and-list-of-ids-using-elasticsearchtemplate-in-spring-data-elastic – ScottSummers Sep 19 '18 at 11:43
4

Elastic - Indices Stats

Indices level stats provide statistics on different operations happening on an index. The API provides statistics on the index level scope (though most stats can also be retrieved using node level scope).

prepareStats(indexName)

client.admin().indices().prepareStats(indexName).get().getTotal().getDocs().getCount();

Z.ABC
  • This frequently gives me a result that differs from what I expect. I have an integration test that deletes an index, then recreates it and loads 20 docs. The count is always higher than I'm expecting using this method, and it's usually a multiple of 20. – Matt Lachman Mar 26 '19 at 19:08
  • This method will count all docs (including nested ones) which will produce a high count. The `_count` API will only return the number of top level docs – Nate Jul 13 '22 at 12:55
4

Just an addition to @evanjd's answer

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHits;

 SearchResponse response = client.prepareSearch("your_index_goes_here")
   .setTypes("YourTypeGoesHere")
   .setQuery(QueryBuilders.termQuery("some_field", "some_value"))
   .setSize(0) // Don't return any documents, we don't need them.
   .get();

 SearchHits hits = response.getHits();
 long hitsCount = hits.getTotalHits().value;

We need to append .value to get the total hits as a long. Without it, getTotalHits() returns a TotalHits object, which prints as a string like "6 hits".

long hitsCount = hits.getTotalHits().value;

Akash Babu
3

Breaking change as of 7.0: by default, total hits are only tracked accurately up to 10,000. To get an exact count, you need to set track_total_hits to true explicitly in the search request.

https://www.elastic.co/guide/en/elasticsearch/reference/current/breaking-changes-7.0.html#track-total-hits-10000-default
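To make the effect of this setting concrete, here is a minimal sketch that works on the raw JSON rather than a client library. It builds a search body with size 0 and track_total_hits set to true, then reads hits.total from a 7.x-style response and refuses to treat a "gte" (lower bound) total as an exact count. The class and method names are hypothetical, and the regex-based parsing assumes the "value" field precedes "relation"; a real application would use a proper JSON parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any Elasticsearch client library.
final class TotalHitsSketch {

    // Matches "total":{"value":N,"relation":"eq|gte"} in a 7.x search response.
    // The "_shards" section also has a "total" key, but its value is a plain
    // number rather than an object, so this pattern does not match it.
    private static final Pattern TOTAL = Pattern.compile(
        "\"total\"\\s*:\\s*\\{\\s*\"value\"\\s*:\\s*(\\d+)\\s*,"
            + "\\s*\"relation\"\\s*:\\s*\"(eq|gte)\"");

    /** Search body that returns no documents but asks for an exact total. */
    static String countRequestBody() {
        return "{\"size\":0,\"track_total_hits\":true}";
    }

    /** Returns hits.total.value, failing if the total is only a lower bound. */
    static long exactTotal(String searchResponseJson) {
        Matcher m = TOTAL.matcher(searchResponseJson);
        if (!m.find()) {
            throw new IllegalArgumentException("No hits.total object found");
        }
        if (!"eq".equals(m.group(2))) {
            // "gte" means the real count may be higher than the reported value.
            throw new IllegalStateException(
                "Total is a lower bound; set track_total_hits to true");
        }
        return Long.parseLong(m.group(1));
    }
}
```

With track_total_hits left at its default, a large index would report a relation of "gte" and the sketch would throw instead of silently returning a capped number.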

enesaltinok
1

We can also get the low-level client from the high-level client and invoke the "_count" REST API, for example "GET /twitter/_doc/_count?q=user:kimchy".
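The same REST call can be sketched without any Elasticsearch client at all, using the JDK's built-in java.net.http.HttpClient (Java 11+). This is only an illustration under assumptions: the class and method names are hypothetical, the base URL is whatever your cluster listens on, the path uses the typeless form /index/_count, and the response is parsed with a simple regex where a real application would use a JSON library.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any Elasticsearch client library.
final class RestCountSketch {

    // Matches the "count" field of a _count response such as {"count":20,...}.
    private static final Pattern COUNT_FIELD =
        Pattern.compile("\"count\"\\s*:\\s*(\\d+)");

    /** Builds the _count URI; a null or empty query counts every document. */
    static URI countUri(String baseUrl, String index, String query) {
        String uri = baseUrl + "/" + index + "/_count";
        if (query != null && !query.isEmpty()) {
            uri += "?q=" + URLEncoder.encode(query, StandardCharsets.UTF_8);
        }
        return URI.create(uri);
    }

    /** Extracts the count from the JSON body of a _count response. */
    static long parseCount(String json) {
        Matcher m = COUNT_FIELD.matcher(json);
        if (!m.find()) {
            throw new IllegalArgumentException("No \"count\" field in: " + json);
        }
        return Long.parseLong(m.group(1));
    }

    /** Sends GET <baseUrl>/<index>/_count and returns the reported count. */
    static long fetchCount(HttpClient http, String baseUrl, String index,
                           String query) throws Exception {
        HttpRequest request =
            HttpRequest.newBuilder(countUri(baseUrl, index, query)).GET().build();
        HttpResponse<String> response =
            http.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IllegalStateException("HTTP " + response.statusCode());
        }
        return parseCount(response.body());
    }
}
```

A call might look like RestCountSketch.fetchCount(HttpClient.newHttpClient(), "http://localhost:9200", "twitter", "user:kimchy"), assuming an unsecured local node.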

Biplab
1

2021 Solution

I went through the solutions posted and none of them is convincing. You can get the job done by setting the size of the search request to 0, but that's not the correct way. For counting, use the Count API: it consumes fewer resources and less bandwidth because it does not need to fetch or score documents, and it benefits from other internal optimisations.

Use the Count API for Java (linked below) to get the document count. The following code should get the job done.

  • Build the query using QueryBuilder

  • Pass the list of indexes and the query to the CountRequest constructor

  • Get the CountResponse object by calling client.count(countReq, RequestOptions.DEFAULT)

  • Extract and return the value by calling countResp.getCount()

    CountRequest countReq = new CountRequest(indexes, query);

    CountResponse countResp = client.count(countReq, RequestOptions.DEFAULT);

    return countResp.getCount();

Read the second link for more information.

Important Links

Count API vs Search API : Counting number of documents using Elasticsearch

Count API for Java : https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-count.html

wingman__7