
Let's say I want to find the documents whose "tags" field contains the tags "a", "b", and "c".

If I use the $and operator, it will only return the documents where "tags" contains all three tags.

Such a strict search is not what I want. If I use the $or operator instead, it will find docs that contain at least one tag from the list, but it won't rank the docs that contain several or all of them first.

What I want is to search for docs that contain "as many tags as possible, but at least one"; in other words, find all the docs that contain at least one tag, but show the ones with the most matches first. I know that I could do this with a series of queries (e.g., an $and query followed by an $or query), but if there are more than 2 tags, I'll have to make lots of queries with different combinations of tags to get good results.

JohnnyHK
A.V. Arno
    Please consider accepting answers to your questions if they help resolve your problems. You have not accepted any of your question's answers. – BatScream Feb 16 '15 at 19:52

1 Answer


You can aggregate the result as below:

  • $match all the documents which have at least 1 match.
  • $project a field weight that holds the number of search tags the document contains. To find the matching tags, use the $setIntersection operator and take its $size.
  • $sort by the weight in descending order.
  • $project the required fields.
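To make the logic of these steps concrete, here is a minimal in-memory sketch in plain JavaScript that needs no database. The helper name rankByTagMatches is hypothetical (it is not a MongoDB API), and the documents mirror this answer's sample data plus one assumed non-matching document.

```javascript
// In-memory sketch of the match -> weight -> sort -> project steps.
// rankByTagMatches is a hypothetical helper, not part of MongoDB.
function rankByTagMatches(docs, search) {
  const wanted = new Set(search);
  return docs
    // weight = number of distinct search tags present in the document
    // (what $size of $setIntersection computes in the pipeline)
    .map((doc) => ({
      doc,
      weight: new Set(doc.tags.filter((t) => wanted.has(t))).size,
    }))
    // keep documents with at least one match (the $match / $in step)
    .filter((entry) => entry.weight > 0)
    // most matches first (the $sort step)
    .sort((x, y) => y.weight - x.weight)
    // drop the helper field again (the final $project step)
    .map((entry) => entry.doc);
}

const docs = [
  { tags: ["a", "b", "c"] },
  { tags: ["a"] },
  { tags: ["a", "b"] },
  { tags: ["a", "b", "c", "d"] },
  { tags: ["x"] }, // assumed extra document with no matching tag
];

const ranked = rankByTagMatches(docs, ["a", "b"]);
console.log(ranked.map((d) => d.tags.join(",")));
```

Documents with the same number of matches keep their relative input order here because modern JavaScript engines sort stably; the aggregation pipeline gives no such guarantee for ties unless a second sort key is added.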

sample data:

db.t.insert([{"tags":["a","b","c"]},
{"tags":["a"]},
{"tags":["a","b"]},
{"tags":["a","b","c","d"]}])

search criteria:

var search = ["a","b"];

code:

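db.t.aggregate([
  // Keep only documents that contain at least one of the search tags.
  {$match: {"tags": {$in: search}}},
  // weight = number of search tags present in the document.
  {$project: {"weight": {$size: {$setIntersection: ["$tags", search]}},
              "tags": "$tags"}},
  // Most matches first.
  {$sort: {"weight": -1}},
  // Drop the helper field from the result.
  {$project: {"tags": 1}}
])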

output:

{
        "_id" : ObjectId("54e23b74c6185de718484948"),
        "tags" : [
                "a",
                "b",
                "c"
        ]
}
{ "_id" : ObjectId("54e23b74c6185de71848494a"), "tags" : [ "a", "b" ] }
{
        "_id" : ObjectId("54e23b74c6185de71848494b"),
        "tags" : [
                "a",
                "b",
                "c",
                "d"
        ]
}
{ "_id" : ObjectId("54e23b74c6185de718484949"), "tags" : [ "a" ] }
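If the list of search tags changes from query to query, the pipeline can be built by a small function. The following is only a sketch: buildTagRankPipeline and its optional limit parameter are hypothetical names, and the _id tie-breaker in $sort is an assumption added here so that documents with equal weight come back in a repeatable order.

```javascript
// Hypothetical helper: builds the aggregation pipeline for any
// non-empty list of search tags. It only constructs plain objects,
// so it can be inspected without a database connection.
function buildTagRankPipeline(search, limit) {
  if (!Array.isArray(search) || search.length === 0) {
    throw new Error("search must be a non-empty array of tags");
  }
  const stages = [
    // at least one of the tags must be present
    { $match: { tags: { $in: search } } },
    // weight = number of search tags found in the document
    {
      $project: {
        tags: 1,
        weight: { $size: { $setIntersection: ["$tags", search] } },
      },
    },
    // most matches first; _id breaks ties (assumption, see above)
    { $sort: { weight: -1, _id: 1 } },
  ];
  if (limit !== undefined) {
    stages.push({ $limit: limit }); // optional cap on the result size
  }
  stages.push({ $project: { tags: 1 } }); // hide the helper field
  return stages;
}

const pipeline = buildTagRankPipeline(["a", "b"], 10);
console.log(JSON.stringify(pipeline, null, 2));
// In the mongo shell this would be run as: db.t.aggregate(pipeline)
```

Placing $limit after $sort and before the final $project keeps only the best-ranked documents.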
BatScream