I am trying to get results for an Elasticsearch query that are filtered so that only the documents with the maximum priority value within each unique group id are returned.

My query already has aggregations, and I want the aggregation counts to reflect only the documents that have the maximum priority value within each unique group. I want to do this by adding the group/priority restriction to the query filter rather than to the aggregations.

The group value is, for example, ABC123.
In my sample data below I have 4 items in 2 groups, 2 items per group. My query filters by product (GreatProduct) and language (English) and gets aggregations based on color.
Currently I get 4 results back and an aggregation count of 4 for the color Red.
After applying the unique group and max priority restriction to the filter, I would expect to get back 2 items, one for each unique group, each with a priority of 2 (the max value), and an aggregation count of 2 for Red.
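
To make the expected outcome concrete, here is a minimal sketch in plain Python (not an Elasticsearch query) that applies the same logic to the sample data in memory. The field names follow the mapping below, and the helper name `top_priority_per_group` is only illustrative.

```python
from collections import Counter

# The four sample documents, reduced to the fields the query uses.
docs = [
    {"id": 1, "group": "ABC123", "priority_value": 2,
     "language": "English", "product": "GreatProduct", "color": "Red"},
    {"id": 2, "group": "ABC123", "priority_value": 1,
     "language": "English", "product": "GreatProduct", "color": "Red"},
    {"id": 3, "group": "XYZ789", "priority_value": 1,
     "language": "English", "product": "GreatProduct", "color": "Red"},
    {"id": 4, "group": "XYZ789", "priority_value": 2,
     "language": "English", "product": "GreatProduct", "color": "Red"},
]


def top_priority_per_group(items):
    """Keep only the items whose priority equals the maximum in their group."""
    max_by_group = {}
    for item in items:
        group = item["group"]
        current = max_by_group.get(group, item["priority_value"])
        max_by_group[group] = max(current, item["priority_value"])
    return [i for i in items if i["priority_value"] == max_by_group[i["group"]]]


# Step 1: the existing product/language filter (matches all 4 documents).
matching = [d for d in docs
            if d["product"] == "GreatProduct" and d["language"] == "English"]

# Step 2: the restriction I want inside the filter.
kept = top_priority_per_group(matching)

# Step 3: the color aggregation, computed on the restricted set.
color_counts = Counter(d["color"] for d in kept)

print([d["id"] for d in kept])  # [1, 4]
print(dict(color_counts))       # {'Red': 2}
```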

My current solution is based on this post: "How to make an elasticsearch query that filters on the maximum value of a field?". However, that solution applies the max value/unique field logic to the aggregations, whereas I would like to apply it to the filter and have the aggregation counts reflect the filtering by max value/unique field.

Sample mapping:

{
   "myindex-2016.07.07": {
      "mappings": {
         "myindex": {
            "properties": {
               "title_en": {
                  "type": "string",
                  "store": true,
                  "analyzer": "english"
               },
               "description_en": {
                  "type": "string",
                  "store": true,
                  "analyzer": "english"
               },
               "id": {
                  "type": "long",
                  "store": true,
                  "include_in_all": false
               },
               "group": {
                  "type": "string",
                  "index": "not_analyzed",
                  "store": true,
                  "include_in_all": false
               },
               "priority_value": {
                  "type": "long",
                  "store": true,
                  "include_in_all": false
               },
               "product": {
                  "type": "string",
                  "index": "not_analyzed",
                  "store": true,
                  "include_in_all": false
               },
               "language": {
                  "type": "string",
                  "index": "not_analyzed",
                  "store": true,
                  "include_in_all": false
               },
               "color": {
                  "type": "string",
                  "index": "not_analyzed",
                  "store": true,
                  "include_in_all": false
               }
            }
         }
      }
   }
}

Sample data:

  # first two items have same group id of "ABC123"
  curl -XPOST localhost:9200/myindex/myitem -d '{
    "id": 1,
    "title_en": "Title1",
    "description_en": "Desc1",
    "group": "ABC123",
    "priority_value": 2,
    "language": "English",
    "product": "GreatProduct",
    "color": "Red"
  }'

  curl -XPOST localhost:9200/myindex/myitem -d '{
    "id": 2,
    "title_en": "Title2",
    "description_en": "Desc2",
    "group": "ABC123",
    "priority_value": 1,
    "language": "English",
    "product": "GreatProduct",
    "color": "Red"
  }'

  # next two items have same group id of "XYZ789"
  curl -XPOST localhost:9200/myindex/myitem -d '{
    "id": 3,
    "title_en": "Title3",
    "description_en": "Desc3",
    "group": "XYZ789",
    "priority_value": 1,
    "language": "English",
    "product": "GreatProduct",
    "color": "Red"
  }'

  curl -XPOST localhost:9200/myindex/myitem -d '{
    "id": 4,
    "title_en": "Title4",
    "description_en": "Desc4",
    "group": "XYZ789",
    "priority_value": 2,
    "language": "English",
    "product": "GreatProduct",
    "color": "Red"
  }'
striker77
  • You need two queries for this: first get the maximum then run your aggregation. And I don't think there is any magic in ES even if this would have been possible in one query only. ES would have executed two queries anyway, the only benefit would have been the missing network roundtrip. – Andrei Stefan Jul 12 '16 at 10:03
  • Hi @AndreiStefan thanks for feedback. Need to avoid the network roundtrip and perform this in a single query. Even a post-filter would do if that applied. – striker77 Jul 18 '16 at 12:10
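
The two-query approach suggested in the first comment could look roughly like the sketch below, which only builds the two request bodies as Python dictionaries and does not contact a cluster. It assumes the field names from the mapping and query DSL support for the `bool` query's `filter` clause (Elasticsearch 2.0 or later). The function names, the aggregation names `by_group`, `max_priority` and `colors`, and the `max_groups` limit are hypothetical.

```python
import json


def build_max_priority_request(product, language, max_groups=1000):
    """First request: one bucket per group with its maximum priority_value."""
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "query": {"bool": {"filter": [
            {"term": {"product": product}},
            {"term": {"language": language}},
        ]}},
        "aggs": {"by_group": {
            "terms": {"field": "group", "size": max_groups},
            "aggs": {"max_priority": {"max": {"field": "priority_value"}}},
        }},
    }


def build_filtered_request(product, language, group_buckets):
    """Second request: keep only (group, max priority) pairs, then count colors."""
    per_group = [
        {"bool": {"filter": [
            {"term": {"group": bucket["key"]}},
            # The max aggregation reports a float; the field is a long.
            {"term": {"priority_value": int(bucket["max_priority"]["value"])}},
        ]}}
        for bucket in group_buckets
    ]
    return {
        "query": {"bool": {"filter": [
            {"term": {"product": product}},
            {"term": {"language": language}},
            {"bool": {"should": per_group, "minimum_should_match": 1}},
        ]}},
        "aggs": {"colors": {"terms": {"field": "color"}}},
    }


# Hand-written buckets in the shape a terms + max response would have
# for the sample data in the question.
sample_buckets = [
    {"key": "ABC123", "doc_count": 2, "max_priority": {"value": 2.0}},
    {"key": "XYZ789", "doc_count": 2, "max_priority": {"value": 2.0}},
]

first_body = build_max_priority_request("GreatProduct", "English")
second_body = build_filtered_request("GreatProduct", "English", sample_buckets)
print(json.dumps(second_body, indent=2))
```

Because the restriction is part of the second request's query, both the hits and the color aggregation are computed on the restricted set. The cost is the extra round trip and a `should` list that grows with the number of groups.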
