3

When I run a terms aggregation on a string field (analyzed with the whitespace tokenizer), I get a bucket for each word (token), but I need buckets for whole strings. How can I aggregate on a string field the way terms does, but group the output by whole string rather than by token?

I have already seen these solutions: "ElasticSearch term aggregation" and "Terms aggregation based on unique key", but they are based on the keyword tokenizer.

I can't use the keyword tokenizer because I want to apply a stopwords filter while indexing.

Alex Sharov

1 Answer

3

I just had the same problem, and came here looking for a solution.

Then it dawned on me: there is a .raw (unanalyzed) field. Aggregating on it instead of the analyzed field worked.

So the aggregation went from:

{
  "aggs": {
    "keys": {
      "terms": {
        "size": 0,
        "field": "key"
      }
    }
  }
}

to:

{
  "aggs": {
    "keys": {
      "terms": {
        "size": 0,
        "field": "key.raw"
      }
    }
  }
}
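
The .raw sub-field only exists if the index mapping defines it. As a rough sketch of what such a mapping could look like, the snippet below builds the mapping and the aggregation body as plain Python dicts and prints them as JSON. It assumes Elasticsearch 1.x/2.x syntax (string fields with "index": "not_analyzed"). The type name "doc" and the built-in "whitespace" analyzer are placeholders for illustration, not taken from the question's actual setup.

```python
import json

# Assumption: Elasticsearch 1.x/2.x mapping syntax. Newer versions use
# the "keyword" type instead of a not_analyzed string.
# The type name "doc" is hypothetical.
mapping = {
    "mappings": {
        "doc": {
            "properties": {
                "key": {
                    "type": "string",
                    # Placeholder analyzer; a custom analyzer with a
                    # stopwords filter would go here instead.
                    "analyzer": "whitespace",
                    "fields": {
                        # key.raw stores the whole string as a single term.
                        "raw": {"type": "string", "index": "not_analyzed"}
                    },
                }
            }
        }
    }
}

# The terms aggregation from the answer, pointed at the unanalyzed sub-field.
aggregation = {
    "aggs": {
        "keys": {
            "terms": {"size": 0, "field": "key.raw"}
        }
    }
}

print(json.dumps(mapping, indent=2))
print(json.dumps(aggregation, indent=2))
```

Note that key.raw holds the original string exactly as indexed, so any stopwords filter applied to the analyzed key field does not affect the bucket keys.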
Tony Laidig
  • This works only if the field is mapped with a raw sub-field. – Arun Mar 30 '16 at 09:41
  • Which, at least in the version of ELK stack that I'm running, is the default. – Tony Laidig Mar 30 '16 at 15:19
  • It worked for me without key.raw. I had to recreate the index; I guess my mapping was not defined earlier. I upvoted your answer. – Arun Apr 01 '16 at 06:05
  • Thanks. This is a bit confusing: the reason you would need .raw is that keys might be broken up when analyzed. In my case, they are formatted "AAAA-BBB-CCC". I am not able to query for them by key using the analyzed field, but .raw does the trick. – Tony Laidig Apr 01 '16 at 14:42