ElasticSearch terms aggregation on tokenized field

Question

When I run a terms aggregation on a string field (analyzed with the whitespace tokenizer), I get a bucket for each word (token), but I need buckets for whole strings. How can I aggregate on a string field the way terms does, but group the output by whole string rather than by token?

I have already seen these solutions: "ElasticSearch term aggregation" and "Terms aggregation based on unique key", but they are based on the keyword tokenizer.

I can't use the keyword tokenizer, because I want to apply a stopwords filter while indexing.

Post what you already tried (mapping, queries, data samples, expectations). — Andrei Stefan, Jan 07 '15 at 09:16

score 3 · Answer 1 · answered Aug 26 '15 at 20:46

I just had the same problem, and came here looking for a solution.

And then it dawned on me: there's a .raw (unanalyzed) field. The solution was to use it, and it worked.

So the aggregation went from:

{
  "aggs": {
    "keys": {
      "terms": {
        "size": 0,
        "field": "key"
      }
    }
  }
}

to:

{
  "aggs": {
    "keys": {
      "terms": {
        "size": 0,
        "field": "key.raw"
      }
    }
  }
}
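
For reference, key.raw only exists if the mapping defines it as a sub-field. Below is a minimal sketch of such a mapping, assuming the older string field type (Elasticsearch 1.x/2.x); the type name "doc" and the whitespace analyzer are placeholders I picked for illustration, not something taken from the question's actual index. Newer versions would likely use a text field with a keyword sub-field instead.

```json
{
  "mappings": {
    "doc": {
      "properties": {
        "key": {
          "type": "string",
          "analyzer": "whitespace",
          "fields": {
            "raw": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }
    }
  }
}
```

With a mapping like this, "key" is tokenized for search while "key.raw" keeps each value as a single term, which is why the terms aggregation on "key.raw" returns one bucket per whole string. An existing index usually has to be recreated or reindexed for a new sub-field to be populated.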

Tony Laidig

This works only if the mapping defines a raw sub-field for the field. – Arun Mar 30 '16 at 09:41
Which, at least in the version of ELK stack that I'm running, is the default. – Tony Laidig Mar 30 '16 at 15:19
It worked for me without key.raw; I had to recreate the index. I guess my mapping was not defined earlier. I upvoted your answer. – Arun Apr 01 '16 at 06:05
Thanks. This is a bit confusing. The reason you would need .raw is that keys might be broken up when analyzed. In my case, they are formatted "AAAA-BBB-CCC". I am not able to query for the key using the analyzed field, but .raw does the trick. – Tony Laidig Apr 01 '16 at 14:42
