
I run an aggregation on two indices: idx-2020-07-21 and idx-2020-07-22. The goal: get all documents, but when an id is duplicated (about 50% are), keep only the copy from the latest index, judged by the index name.

This is the query I'm running:

{
  "size": 0,
  "aggregations": {
    "latest_item": {
      "composite": {
        "size": 1000,
        "sources": [
          {
            "product": {
              "terms": {
                "field": "_id",
                "missing_bucket": false,
                "order": "asc"
              }
            }
          }
        ]
      },
      "aggregations": {
        "max_date": {
          "top_hits": {
            "from": 0,
            "size": 1,
            "version": false,
            "explain": false,
            "sort": [
              {
                "_index": {
                  "order": "desc"
                }
              }
            ]
          }
        }
      }
    }
  }
}
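
To make the intended result explicit, here is a minimal sketch in plain Java of the "one document per id, taken from the latest index" rule that the query above is meant to express. It does not call Elasticsearch; `Hit` and `latestPerId` are hypothetical names used only for illustration, and the sketch assumes index names carry zero-padded dates (as in idx-2020-07-22), so that string order matches date order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LatestByIndex {

    /** Hypothetical stand-in for a search hit: only the document id and its index name. */
    public static final class Hit {
        public final String id;
        public final String index;

        public Hit(String id, String index) {
            this.id = id;
            this.index = index;
        }
    }

    /**
     * Keeps one hit per id: the one whose index name sorts last.
     * Assumes zero-padded dates in the index name, so string order equals date order.
     */
    public static Map<String, Hit> latestPerId(List<Hit> hits) {
        Map<String, Hit> latest = new LinkedHashMap<>();
        for (Hit hit : hits) {
            // merge keeps the existing entry unless the new hit comes from a later index
            latest.merge(hit.id, hit,
                    (old, cur) -> cur.index.compareTo(old.index) > 0 ? cur : old);
        }
        return latest;
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit("a", "idx-2020-07-21"));
        hits.add(new Hit("a", "idx-2020-07-22")); // duplicate id in the newer index
        hits.add(new Hit("b", "idx-2020-07-21"));

        for (Hit h : latestPerId(hits).values()) {
            System.out.println(h.id + " -> " + h.index);
        }
    }
}
```

In this example, id "a" resolves to idx-2020-07-22 and id "b", which exists in only one index, resolves to idx-2020-07-21.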

Each index is 8G with ~1M docs. ES version 7.5.

The aggregation takes around 8 minutes, and most of the time I get:

{"error":{"root_cause":[{"type":"circuit_breaking_exception","reason":"[parent] Data too large, data for [<http_request>] would be [32933676058/30.6gb], which is larger than the limit of [32641751449/30.3gb].

  1. Is there a better way to write this query?
  2. How do I deal with this exception?
  3. I run a Java job that queries ES every 10 minutes, and I noticed the error happens a lot on the second run. Do I need to release any resources? I use restHighLevelClient.searchAsync() with a listener that calls it again with the next key until the key is null.
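
For reference, the paging pattern described in point 3 has roughly the following shape. This is a simplified, synchronous sketch and not the actual job: `Page`, `collectAll` and the fake backend are hypothetical stand-ins for the real searchAsync() call and its listener, and the after key is reduced to a single string (the real composite after key is a map).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

public class AfterKeyPaging {

    /** Hypothetical page: the bucket keys plus the key to resume from (null on the last page). */
    public static final class Page {
        public final List<String> keys;
        public final String afterKey;

        public Page(List<String> keys, String afterKey) {
            this.keys = keys;
            this.afterKey = afterKey;
        }
    }

    /** Requests pages one at a time, passing each page's afterKey to the next request, until it is null. */
    public static List<String> collectAll(Function<String, Page> fetchPage) {
        List<String> all = new ArrayList<>();
        String after = null; // the first request carries no after key
        do {
            Page page = fetchPage.apply(after);
            all.addAll(page.keys);
            after = page.afterKey;
        } while (after != null);
        return all;
    }

    public static void main(String[] args) {
        // Fake backend with three pages, standing in for the real search call.
        Function<String, Page> fake = after -> {
            if (after == null) {
                return new Page(Arrays.asList("a", "b"), "b");
            }
            if (after.equals("b")) {
                return new Page(Arrays.asList("c"), "c");
            }
            return new Page(Collections.<String>emptyList(), null);
        };
        System.out.println(collectAll(fake)); // prints [a, b, c]
    }
}
```

The sketch keeps a single request in flight at any time: the next page is requested only after the previous one has been consumed.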

The cluster has 3 nodes, 32G each.

I tried playing with the bucket size, but it didn't help much.

Thanks!

  • https://stackoverflow.com/questions/60075253/elasticsearch-7-x-circuit-breaker-data-too-large-troubleshoot/60082126#60082126 AND/OR https://stackoverflow.com/questions/60121024/understanding-elasticsearch-circuit-breaking-exception/60125353#60125353 should help you understand the circuit_breaking_exception – ibexit Aug 06 '20 at 22:26
