I am trying to aggregate on a field and get the top records using top_hits, but I also want the response to include fields that are outside the nested property mapping. Currently, if I specify _source:{"include":[]}, I only get the fields that belong to the current nested property.

Here is my mapping

{
  "my_cart":{
    "mappings":{
      "properties":{
        "store":{
          "properties":{
            "name":{
              "type":"keyword"
            }
          }
        },
        "sales":{
          "type":"nested",
          "properties":{
            "Price":{
              "type":"float"
            },
            "Time":{
              "type":"date"
            },
            "product":{
              "properties":{
                "name":{
                  "type":"text",
                  "fields":{
                    "keyword":{
                      "type":"keyword"
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
  

UPDATE

Joe's answer solved my above issue.

My current issue is that although I get the product name as the "key" along with other details, the hits also contain the other product names that were part of the same transaction in the billing receipt. I want to aggregate on the product name and find the last sold date of each product, along with other details such as price, quantity, etc.

Current Response

"aggregations" : {
    "aggregate_by_most_sold_product" : {
      "doc_count" : 2878592,
      "all_products" : {
        "buckets" : [
          {
            "key" : "shampoo",
            "doc_count" : 1,
            "lastSold" : {
              "value" : 1.602569793E12,
              "value_as_string" : "2018-10-13T06:16:33.000Z"
            },
            "using_reverse_nested" : {
              "doc_count" : 1,
              "latest product" : {
                "hits" : {
                  "total" : {
                    "value" : 1,
                    "relation" : "eq"
                  },
                  "max_score" : 0.0,
                  "hits" : [
                    {
                      "_index" : "my_cart",
                      "_type" : "_doc",
                      "_id" : "36303258-9r7w-4b3e-ba3d-fhds7cfec7aa",
                      "_source" : {
                        "cashier" : {
                          "firstname" : "romeo",
                          "uuid" : "2828dhd-0911-7229-a4f8-8ab80dde86a6"
                        },
                        "product_price" : {
                          "price" : 20,
                          "discount_offered" : 10
                        },
                        "sales" : [
                          {
                            "product" : {
                              "name" : "shampoo",
                              "time" : "2018-10-13T04:44:26+00:00"
                            }
                          },
                          {
                            "product" : {
                              "name" : "noodles",
                              "time" : "2018-10-13T04:42:26+00:00"
                            }
                          },
                          {
                            "product" : {
                              "name" : "biscuits",
                              "time" : "2018-10-13T04:41:26+00:00"
                            }
                          }
                        ]
                      }
                    }
                  ]
                }
              }
            }
          }
        ]
      }
    }
}


Expected Response

It returns all the product names in that transaction, which increases the bucket size. I only want a single product name with its last sold date, along with the other details for each product.

My aggregation is the same as the one in Joe's answer.

I would also like to know whether I can use scripts to compute values from the fields returned in _source.

For example: price - discount_offered = final amount.

1 Answer

The nested context does not have access to the parent unless you use reverse_nested. In that case, however, you lose the ability to retrieve only the applicable nested subdocument. Luckily, there is a way to sort a terms aggregation by the result of a different, numeric aggregation:

GET my_cart/_search
{
  "size": 0,
  "aggs": {
    "aggregate": {
      "nested": {
        "path": "sales"
      },
      "aggs": {
        "all_products": {
          "terms": {
            "field": "sales.product.name.keyword",
            "size": 6500,
            "order": {                                <--
              "lowest_date": "asc"
            }
          },
          "aggs": {
            "lowest_date": {                          <--
              "min": {
                "field": "sales.Time"
              }
            },
            "using_reverse_nested": {
              "reverse_nested": {},                   <--
              "aggs": {
                "latest product": {
                  "top_hits": {
                    "_source": {
                      "includes": [
                        "store.name"
                      ]
                    },
                    "size": 1
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

The caveat is that you won't be getting the store.name inside of the top_hits -- though I suspect you're probably already doing some post-processing on the client side where you could combine those entries:

"aggregate" : {
  ...
  "all_products" : {
    ...
    "buckets" : [
      {
        "key" : "myproduct",                     <--
        ...
        "using_reverse_nested" : {
          ...
          "latest product" : {
            "hits" : {
              ...
              "hits" : [
                {
                  ...
                  "_source" : {
                    "store" : {
                      "name" : "mystore"         <--
                    }
                  }
                }
              ]
            }
          }
        },
        "lowest_date" : {
          "value" : 1.4200704E12,
          "value_as_string" : "2015/01/01"       <--
        }
      }
    ]
  }
}
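
The client-side combining mentioned above could look roughly like the following. This is only a sketch in Python, assuming the response has already been parsed into a dict shaped like the excerpt above; `flatten_buckets` and the keys of the output rows are hypothetical names chosen for illustration, and `sample` just mirrors the placeholder values from the excerpt.

```python
from typing import Any, Dict, List


def flatten_buckets(aggregations: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Merge each product bucket with the fields of its single top hit."""
    rows = []
    # Path follows the aggregation names used in the query above.
    buckets = aggregations["aggregate"]["all_products"]["buckets"]
    for bucket in buckets:
        hits = bucket["using_reverse_nested"]["latest product"]["hits"]["hits"]
        # top_hits was requested with size 1, so at most one hit is expected.
        source = hits[0]["_source"] if hits else {}
        rows.append({
            "product": bucket["key"],
            "lowest_date": bucket["lowest_date"].get("value_as_string"),
            "store_name": source.get("store", {}).get("name"),
        })
    return rows


# Sample data copied from the response excerpt above (placeholder values).
sample = {
    "aggregate": {
        "all_products": {
            "buckets": [
                {
                    "key": "myproduct",
                    "using_reverse_nested": {
                        "latest product": {
                            "hits": {
                                "hits": [
                                    {"_source": {"store": {"name": "mystore"}}}
                                ]
                            }
                        }
                    },
                    "lowest_date": {
                        "value": 1.4200704e12,
                        "value_as_string": "2015/01/01",
                    },
                }
            ]
        }
    }
}

rows = flatten_buckets(sample)
print(rows)
```

Each row then carries the product key, the date from the sibling `min` aggregation, and the parent-level field from the top hit, so the nested sale itself never has to be read out of `_source`.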
Joe - GMapsBook.com
  • Thanks a lot for taking the time to answer. Your approach is perfect, but I am still getting inaccurate data in the response. I have records of all bills and I want to find the latest transaction for each store. One bill can contain multiple records from different stores. Currently it aggregates the last transaction per bill, not the last transaction of each store across all the bills. – Harvindar Singh Garcha Oct 12 '20 at 16:06
  • No prob. Can you update your question and describe that problem more clearly (with examples)? – Joe - GMapsBook.com Oct 12 '20 at 16:32
  • I have updated the question summary, please check and let me know. – Harvindar Singh Garcha Oct 13 '20 at 09:24
  • I think you've changed my query from the answer because I specified to only include `store.name`, yet your hits include the whole `_source`. – Joe - GMapsBook.com Oct 13 '20 at 09:50
  • Yes Joe, I did change it, as I need other fields in the response as well, such as the cashier name. I left the other mapping properties out just to keep the question short. – Harvindar Singh Garcha Oct 13 '20 at 10:11
  • I used `script_fields` instead of `_source` `include` and it solved my issue, Thanks for the help man. – Harvindar Singh Garcha Oct 13 '20 at 14:55
  • @JoeSorocin if you have a parent and a nested object, if you make a top hit aggregation on the child, how can you have both attributes of parents and childs ? I used reversed_nested, but now the problem is it's giving me all the childs of the parent. – misterone Feb 16 '21 at 19:37
  • @HarvindarSinghGarcha how did you use script_fields to achieve it ? – misterone Feb 16 '21 at 19:39
  • @misterone Not sure what you mean. Can you post a separate question and tag me? – Joe - GMapsBook.com Feb 16 '21 at 20:10
  • @JoeSorocin Thanks a lot. I just posted here : https://stackoverflow.com/questions/66232699/get-parent-data-and-nested-child-in-reverse-nested-aggregation-top-hits – misterone Feb 16 '21 at 21:53
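
The thread never shows how `script_fields` was used, so the following is only a guess at the general shape, not the asker's actual solution. It is a Python sketch that builds a `top_hits` body with a Painless script computing price minus discount_offered from the `product_price` object seen in the posted `_source`. The field name `final_amount` and the helper `build_top_hits_with_script` are hypothetical, and reading values through `params._source` is an assumption to verify against your Elasticsearch version.

```python
import json
from typing import Any, Dict

# Painless source; field paths are taken from the _source posted in the question.
FINAL_AMOUNT_SCRIPT = (
    "params._source.product_price.price"
    " - params._source.product_price.discount_offered"
)


def build_top_hits_with_script() -> Dict[str, Any]:
    """Return a top_hits body that adds a computed field to each hit."""
    return {
        "top_hits": {
            "size": 1,
            # Request _source explicitly so it is returned next to the script field.
            "_source": {"includes": ["store.name", "cashier.firstname"]},
            "script_fields": {
                # "final_amount" is a hypothetical name for the computed value.
                "final_amount": {
                    "script": {"lang": "painless", "source": FINAL_AMOUNT_SCRIPT}
                }
            },
        }
    }


top_hits = build_top_hits_with_script()
print(json.dumps(top_hits, indent=2))
```

The resulting dict would replace the `top_hits` body under `"latest product"` in the query from the answer. It operates on the parent document, so on its own it does not narrow the hit down to the matching nested sale.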