I have a string field in my document. Now I need to sort my documents by the word count of that field. How do I accomplish that in Elasticsearch?
3 Answers
The best approach is to use the token_count type. But we need to make sure that we do not disrupt the original string. For this, we use a multi-field and add an additional sub-field that keeps track of the token count alone.
A mapping like the one below should work:
{
  "tweet" : {
    "properties" : {
      "name" : {
        "type" : "multi_field",
        "fields" : {
          "name" : {"type" : "string"},
          "wordCount" : {"type" : "token_count", "analyzer" : "standard"}
        }
      }
    }
  }
}
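With that mapping in place, the sort itself would target the sub-field by its dotted path. Below is a minimal sketch of the search body, built in Python so it can be checked before sending; the `match_all` query, the `desc` default and the helper name `build_sort_query` are assumptions for illustration, not part of the answer above.

```python
import json

# Dotted path to the token-count sub-field, following the mapping above
# ("name" multi-field with a "wordCount" sub-field).
WORD_COUNT_FIELD = "name.wordCount"


def build_sort_query(order="desc"):
    """Return a search body that sorts all documents by word count.

    Hypothetical helper: it only builds the JSON body, it does not
    talk to a cluster.
    """
    if order not in ("asc", "desc"):
        raise ValueError("order must be 'asc' or 'desc'")
    return {
        "query": {"match_all": {}},
        "sort": [{WORD_COUNT_FIELD: {"order": order}}],
    }


body = build_sort_query()
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a `_search` request against the index that holds the `tweet` type.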

Can you please answer this https://stackoverflow.com/questions/51590179/search-part-of-string-with-elasticseach?noredirect=1#comment90147346_51590534 – Vidya L Jul 30 '18 at 10:04
Use a terms aggregation, like this:
curl -XGET 'http://localhost:9200/index_name/_search?pretty=1' -d '
{
"aggs": {
"genders": {
"terms": {
"field": "gender"
}
}
}
}'
Note: for the curl command, check this.
This aggregates on the field gender and returns one bucket per distinct gender value. By default the buckets are sorted by document count, highest first.

This works for single-word fields but fails when multiple tokens are present as each token is counted separately. `Hello world`, `Hello my name is dave` -> `Hello` x 2, `name` x 1, `dave` x 1, `world` x 1 (`my` and `is` may or may not be stripped out depending on the analyzer you use). – Basic May 06 '15 at 16:15
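To make the difference in the comment above concrete, here is a small simulation in plain Python, not Elasticsearch itself. It assumes a naive whitespace tokenizer with no lowercasing or stop-word removal, which only stands in for whatever analyzer the field actually uses.

```python
from collections import Counter

docs = ["Hello world", "Hello my name is dave"]


def tokenize(text):
    # Assumption: split on whitespace only; a real analyzer may also
    # lowercase tokens or strip stop words such as "my" and "is".
    return text.split()


# Roughly what a terms aggregation reports: one bucket per distinct
# token, counted by how many documents contain it.
term_buckets = Counter(tok for doc in docs for tok in set(tokenize(doc)))

# What the question asks for: the number of tokens in each document.
word_counts = {doc: len(tokenize(doc)) for doc in docs}

# Sorting documents by their own word count, longest first.
by_word_count = sorted(docs, key=lambda doc: word_counts[doc], reverse=True)

print(term_buckets.most_common(1))  # [('Hello', 2)]
print(word_counts)
print(by_word_count)
```

The buckets describe tokens across the whole index, while the per-document count is what a sort needs, which is why the token count has to be stored with each document.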
Your best bet is to store the token count alongside the original field. See the documentation in the Core Types here: http://www.elastic.co/guide/en/elasticsearch/reference/1.4/mapping-core-types.html#token_count
Then you would sort by field.word_count (where field is the 'parent' property).
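As a rough sketch of what that looks like, the mapping below follows the `fields` syntax from the linked 1.4 documentation, written as a Python dict so the sort path can be derived from it. The names `field` and `word_count` are the placeholders used above, and `token_count_paths` is a hypothetical helper, not an Elasticsearch API.

```python
import json

# Mapping for one string property with a token_count sub-field.
# "field" and "word_count" are placeholder names; the "standard"
# analyzer is an assumption, pick the one that matches your data.
mapping = {
    "properties": {
        "field": {
            "type": "string",
            "fields": {
                "word_count": {"type": "token_count", "analyzer": "standard"}
            },
        }
    }
}


def token_count_paths(mapping):
    """List the dotted paths of all token_count sub-fields in a mapping."""
    paths = []
    for parent, spec in mapping["properties"].items():
        for child, child_spec in spec.get("fields", {}).items():
            if child_spec.get("type") == "token_count":
                paths.append(f"{parent}.{child}")
    return paths


print(json.dumps(mapping, indent=2))
print(token_count_paths(mapping))  # ['field.word_count']
```

The dotted path is the name to use in the `sort` clause of the search request.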
