I need my filter to work like this:

18-24 | (16,635,890)
25-34 | (2,478,382)
35-44 | (1,129,493)
45-54 | (5,689,393)
55-64 | (4,585,933)

This is my ES mapping:

{
  "dynamic": "strict",
  "properties": {
    "birthdate": {
      "type": "date",
      "format": "m/d/yyyy"
    },
    "first_name": {
      "type": "keyword"
    },
    "last_name": {
      "type": "keyword"
    }
  }
}

I would like to know whether this is possible with this mapping. I'm not very experienced with ES, and I suspect it requires more advanced knowledge than I have.

I also tried the following as a test, but it did not give me the age groups I'm after :/

age: {
    terms: {
       field: 'birthdate'
    }
}

--------------------
"doc_count_error_upper_bound" => 0,
"sum_other_doc_count" => 0,
"buckets" => [
    {
        "key" => 1072915200000,
        "key_as_string" => "0/1/2004",
        "doc_count" => 1
    }
]
},

I tried reading the documentation and searching some forums, but without success. Thanks.

1 Answer

A good candidate for this would be the range aggregation. However, since your birthdate is mapped as a date, you'd need to calculate each person's age as of now before bucketing. You can do so with a Painless script.

Putting it all together:

POST your-index/_search
{
  "size": 0,
  "aggs": {
    "price_ranges": {
      "range": {
        "script": {
          "source": "return (params.now_ms - doc['birthdate'].value.millis) / 1000 / 60 / 60 / 24 / 365",
          "params": {
            "now_ms": 1617958584396
          }
        }, 
        "ranges": [
          {
            "from": 18,
            "to": 24,
            "key": "18-24"
          },
          {
            "from": 25,
            "to": 34,
            "key": "25-34"
          }
        ]
      }
    }
  }
}

would return:

...
"aggregations" : {
  "price_ranges" : {
    "buckets" : [
      {
        "key" : "18-24",
        "from" : 18.0,
        "to" : 24.0,
        "doc_count" : 0
      },
      {
        "key" : "25-34",
        "from" : 25.0,
        "to" : 34.0,
        "doc_count" : 2
      }, 
      ...
    ]
  }
}
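
To make the arithmetic in the script easier to check, here is a rough Python mirror of it. This is only a sketch: the helper names (`to_epoch_ms`, `age_in_years`, `bucket_for`) are made up for illustration, and it assumes both timestamps are whole numbers of milliseconds, so every division is an integer division. It also shows two things worth keeping in mind: the script uses a 365-day year (leap days are ignored), and the `to` bound of a range bucket is exclusive.

```python
from datetime import datetime, timezone


def to_epoch_ms(year, month, day):
    """Midnight UTC of the given date, as epoch milliseconds."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp() * 1000)


def age_in_years(now_ms, birth_ms):
    """Same chain of divisions as the Painless script.

    Uses a 365-day year, so leap days are ignored. Matches the script
    for birthdates that are not in the future.
    """
    return (now_ms - birth_ms) // 1000 // 60 // 60 // 24 // 365


def bucket_for(age, ranges):
    """Pick the bucket the way a range aggregation does:
    `from` is inclusive, `to` is exclusive."""
    for lower, upper, key in ranges:
        if lower <= age < upper:
            return key
    return None


if __name__ == "__main__":
    now_ms = 1617958584396  # the hardcoded timestamp from the request above
    age = age_in_years(now_ms, to_epoch_ms(1996, 6, 1))
    print(age)
    # With "to": 24 a whole-number age of 24 lands in no bucket ...
    print(bucket_for(age, [(18, 24, "18-24"), (25, 34, "25-34")]))
    # ... whereas "to": 25 keeps it in the 18-24 group.
    print(bucket_for(age, [(18, 25, "18-24"), (25, 35, "25-34")]))
```

If the brackets are meant to include their upper age, setting each `to` one higher than the label (25, 35, and so on) is one way to get that.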

Note that the current timestamp wasn't obtained through a dynamic new Date() call but rather hardcoded as a parametrized now_ms variable. This is the recommended way of doing date math due to the distributed nature of Elasticsearch: passing the value in means every shard evaluates the script against the same point in time. For more info on this, check my answer to "How to get current time as unix timestamp for script use".
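
In practice the now_ms value would be computed on the client just before sending the request. A minimal sketch of that, assuming Python on the client side: the function name `build_age_range_query` and the aggregation name `age_ranges` are hypothetical, and actually sending the body to your-index/_search is left to whichever client you use. It also assumes the script yields whole-number ages, so each `to` is set one above the bracket's upper age because `to` is exclusive.

```python
import json
import time

# The Painless source from the request above, unchanged.
AGE_SCRIPT = (
    "return (params.now_ms - doc['birthdate'].value.millis)"
    " / 1000 / 60 / 60 / 24 / 365"
)


def build_age_range_query(brackets, now_ms=None):
    """Build the search body for inclusive (low, high) age brackets.

    now_ms is taken once on the client and passed as a script parameter,
    so the same value is used wherever the script runs.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    ranges = [
        # "to" is exclusive, hence high + 1 for whole-number ages.
        {"from": low, "to": high + 1, "key": f"{low}-{high}"}
        for low, high in brackets
    ]
    return {
        "size": 0,
        "aggs": {
            "age_ranges": {  # hypothetical aggregation name
                "range": {
                    "script": {
                        "source": AGE_SCRIPT,
                        "params": {"now_ms": now_ms},
                    },
                    "ranges": ranges,
                }
            }
        },
    }


if __name__ == "__main__":
    body = build_age_range_query(
        [(18, 24), (25, 34), (35, 44), (45, 54), (55, 64)]
    )
    print(json.dumps(body, indent=2))
```

Keeping the script source constant and varying only the parameter also means the script does not need to be recompiled for every new timestamp.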


Shameless plug: if you're relatively new to ES, you might find my recently released Elasticsearch Handbook useful. One of the chapters is dedicated solely to aggregations and one to Painless scripting!

Joe - GMapsBook.com