The phrase "find this phrase" is too common for the documents in your index: essentially every document matches the query, and the small differences in relevance come from the field-length norm. As far as I know, the field-length norm is computed per shard, so when each of the three documents in your index lands in its own shard you can see surprising results where the document with the shortest field scores lower than the others. You can test this by creating the index with a single primary shard; in that case the document with the field value "find this phrase" will get the highest score. You can also achieve the same result with several primary shards by disabling the field-length norm:
PUT your_index/_mapping/your_type
{
  "properties": {
    "FieldToSearch": {
      "type": "text",
      "norms": false
    }
  }
}
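To test the single-shard explanation, you can recreate the index with one primary shard (a minimal sketch; `your_index` is the placeholder name used above):

```json
PUT your_index
{
  "settings": {
    "index": {
      "number_of_shards": 1
    }
  }
}
```

With all three documents in the same shard, the field-length norm is computed over the same statistics for each of them, and the shortest field should score highest as expected.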
But I think more specific queries would be a better choice.
EDIT:
My point is simply to use more specific queries that contain relatively unique tokens. For example, instead of querying the phrase "Jurassic Park", which is contained in almost every document in your index, it would be better to query "World Jurassic Park", which is contained in only one document.
However, there is a way to achieve the desired results for your example. Look at this question. You will need to change the mapping to add a token-counting sub-field to the fields you search:
PUT your_index/_mapping/your_type
{
  "properties": {
    "FieldToSearch": {
      "type": "text",
      "fields": {
        "length": {
          "type": "token_count",
          "analyzer": "standard"
        }
      }
    }
  }
}
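Note that the `token_count` sub-field is only populated when a document is indexed, so after changing the mapping you need to reindex the existing documents. One way is an in-place update-by-query with no script (a sketch; adjust for your index):

```json
POST your_index/_update_by_query?conflicts=proceed
```

This re-indexes each document as-is, which fills in `FieldToSearch.length` for documents that were indexed before the mapping change.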
Then use a function_score query with field_value_factor to boost relevance depending on the number of tokens the field contains:
GET your_index/your_type/_search
{
  "query": {
    "function_score": {
      "query": {
        "match_phrase": {
          "FieldToSearch": "Jurassic Park"
        }
      },
      "field_value_factor": {
        "field": "FieldToSearch.length",
        "modifier": "reciprocal"
      }
    }
  }
}
This way, documents whose fields contain fewer tokens get a higher score: with the default factor of 1 and the default multiply boost mode, the reciprocal modifier multiplies the phrase score by 1/(token count).
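If you want to verify how the token count contributes to each document's score, you can ask Elasticsearch to explain the scoring by adding "explain": true to the same request (field names as in the mapping above):

```json
GET your_index/your_type/_search
{
  "explain": true,
  "query": {
    "function_score": {
      "query": {
        "match_phrase": {
          "FieldToSearch": "Jurassic Park"
        }
      },
      "field_value_factor": {
        "field": "FieldToSearch.length",
        "modifier": "reciprocal"
      }
    }
  }
}
```

Each hit then carries an `_explanation` tree showing the field_value_factor term alongside the phrase score, which makes it easy to confirm that shorter fields receive the larger multiplier.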