I've written a complex Elasticsearch query that uses popularity to improve the results for social media documents. The query works well: the top results are always relevant to the query and contain interesting items.
However, it has a problem: for some queries, the first results are all from the same user.
I would like to lower the score of a document if a higher-ranked document by the same user has already been retrieved. This way I expect to get more diverse results.
Note that I don't want these documents removed, since in some cases it may still be interesting to find more documents from the same user; I just want them to appear in a lower position.
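To make the desired behavior concrete, here is a minimal sketch of the re-ranking as a client-side post-processing step. It only illustrates the effect I'm after and is not an Elasticsearch feature: the function name `diversifyByUser`, the hit fields `id`, `user` and `score`, and the penalty factor of 0.5 are all assumptions made up for this example.

```javascript
// Hypothetical post-processing step: penalize repeated users in a ranked hit list.
// `hits` is assumed to be an array of { id, user, score } sorted by score descending;
// the field names and the default penalty are illustrative assumptions.
function diversifyByUser(hits, penalty = 0.5) {
  // user -> number of that user's documents already seen at a higher rank
  const seen = new Map();
  const rescored = hits.map((hit) => {
    const repeats = seen.get(hit.user) || 0;
    seen.set(hit.user, repeats + 1);
    // Each earlier document by the same user multiplies the score by `penalty` once more.
    return { ...hit, score: hit.score * Math.pow(penalty, repeats) };
  });
  // Re-sort so penalized documents drop lower but are never removed.
  return rescored.sort((a, b) => b.score - a.score);
}

const example = diversifyByUser([
  { id: 1, user: "alice", score: 10 },
  { id: 2, user: "alice", score: 9 },
  { id: 3, user: "bob", score: 8 },
  { id: 4, user: "alice", score: 7 },
]);
console.log(example.map((hit) => hit.id)); // [ 1, 3, 2, 4 ]
```

In this toy input, the second and third documents by "alice" are kept but fall below the document by "bob". Ideally the same effect would happen inside the query itself, so that it also works across pages of results.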
Can anybody suggest a way to make it work?
As suggested in the comments, here is a simplified version of my query:
query = {"function_score": {
  "functions": [
    {"gauss": {"createdAt":
      {"origin": "now", "scale": "30d", "offset": "7d", "decay": 0.9}
    }},
    {"gauss": {"shares.last.twitter_retweets_log":
      {"origin": 4.52, "scale": 2.61, "decay": 0.9}
    }}
  ],
  "query": {"bool": {"must": [
    {"exists": {"field": "images"}},
    {"multi_match": {"query": "foo boo", "fields": ["text", "link.title"]}}
  ]}},
  "score_mode": "multiply"
}};
P.S.: Some documents that may be relevant, since they discuss diversity, although I'm not sure how to apply them: