I would like to be able to query for text but also retrieve only the results with the maximum value of a certain integer field in my data. I have read the docs about aggregations and filters and I don't quite see what I am looking for.
For instance, I have some repeating data that gets indexed and is identical except for an integer field - let's call this field lastseen.
So, as an example, given this data put into Elasticsearch:
# these two are the same except for the "lastseen" field
curl -XPOST localhost:9200/myindex/myobject -d '{
"field1": "dinner carrot potato broccoli",
"field2": "something here",
"lastseen": 1000
}'
curl -XPOST localhost:9200/myindex/myobject -d '{
"field1": "dinner carrot potato broccoli",
"field2": "something here",
"lastseen": 100
}'
# and these two are the same except for the "lastseen" field
curl -XPOST localhost:9200/myindex/myobject -d '{
"field1": "fish chicken something",
"field2": "dinner",
"lastseen": 2000
}'
curl -XPOST localhost:9200/myindex/myobject -d '{
"field1": "fish chicken something",
"field2": "dinner",
"lastseen": 200
}'
If I query for "dinner":
curl -XPOST localhost:9200/myindex/_search -d '{
"query": {
"query_string": {
"query": "dinner"
}
}
}'
I'll get 4 results back. I'd like to have a filter such that I only get two results back: for each set of otherwise identical items, only the one with the maximum lastseen value.
This is obviously not right, but hopefully it gives you an idea of what I am after:
{
"query": {
"query_string": {
"query": "dinner"
}
},
"filter": {
"max": "lastseen"
}
}
The results would look something like:
"hits": [
{
...
"_source": {
"field1": "dinner carrot potato broccoli",
"field2": "something here",
"lastseen": 1000
}
},
{
...
"_source": {
"field1": "fish chicken something",
"field2": "dinner",
"lastseen": 2000
}
}
]
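To make the intended semantics concrete, here is a minimal client-side sketch in Python of the selection I am after. It is only an illustration of the desired result, not an Elasticsearch query: the helper names latest_per_group and matches are made up for this example, a "duplicate" is assumed to mean "all fields equal except lastseen", and the four documents are assumed to carry lastseen values 1000, 100, 2000 and 200.

```python
# Illustration of the desired result only; this does not talk to Elasticsearch.
docs = [
    {"field1": "dinner carrot potato broccoli", "field2": "something here", "lastseen": 1000},
    {"field1": "dinner carrot potato broccoli", "field2": "something here", "lastseen": 100},
    {"field1": "fish chicken something", "field2": "dinner", "lastseen": 2000},
    {"field1": "fish chicken something", "field2": "dinner", "lastseen": 200},
]


def matches(doc, term, version_field="lastseen"):
    """Crude stand-in for the text query: whole-word match in any non-version field."""
    return any(term in str(value).split()
               for key, value in doc.items() if key != version_field)


def latest_per_group(hits, version_field="lastseen"):
    """For each group of otherwise identical docs, keep the one with the largest version_field."""
    best = {}
    for doc in hits:
        # Group key (assumption): every field except the version field.
        key = tuple(sorted((k, v) for k, v in doc.items() if k != version_field))
        if key not in best or doc[version_field] > best[key][version_field]:
            best[key] = doc
    return list(best.values())


hits = [d for d in docs if matches(d, "dinner")]  # all four documents match
result = latest_per_group(hits)                   # one document per duplicate group
```

With the sample data, result holds the two documents with lastseen 1000 and 2000. What I am looking for is a way to have Elasticsearch do this selection on the server side as part of the search.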
update 1: I tried creating a mapping that excluded lastseen from being indexed. This did not work; I still get all 4 results back.
curl -XPOST localhost:9200/myindex -d '{
"mappings": {
"myobject": {
"properties": {
"lastseen": {
"type": "long",
"store": "yes",
"include_in_all": false
}
}
}
}
}'
update 2: I tried deduplication with the aggregation scheme listed here, and it did not work. More importantly, I don't see a way to combine that with a keyword search.