4

I have this users collection:

{
    "_id" : ObjectId("501faa18a34feb05890004f2"),
    "username" : "joanarocha",
}
{
    "_id" : ObjectId("501faa19a34feb05890005d3"),
    "username" : "cristianarodrigues",
}
{
    "_id" : ObjectId("501faa19a34feb05890006d8"),
    "username" : "anarocha",
}

When I run the query db.users.find({'username': /anaro/i}), the results are returned in natural order (insertion order).

I would like to sort them by similarity to the search term. In this case, the results should be returned in this order:

{
    "_id" : ObjectId("501faa19a34feb05890006d8"),
    "username" : "anarocha",
}
{
    "_id" : ObjectId("501faa18a34feb05890004f2"),
    "username" : "joanarocha",
}
{
    "_id" : ObjectId("501faa19a34feb05890005d3"),
    "username" : "cristianarodrigues",
}
double-beep
Francisco Costa

2 Answers

2

Unfortunately, MongoDB doesn't support full text search ranking by default.

First of all, you will need an algorithm to calculate the similarity between strings. See the following links:

String similarity algorithms?

String similarity -> Levenshtein distance

Then you need to write a JavaScript function that uses the algorithm to compare two strings, and pass it in your query. See the following link for how to do that:

Mongo complex sorting?
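
As a rough illustration of the similarity step, here is a minimal sketch of Levenshtein distance in plain JavaScript. It assumes the regex matches have already been fetched and ranks them on the client rather than inside the query. The names levenshtein and rankBySimilarity are hypothetical helpers, not part of MongoDB.

```javascript
// Levenshtein distance: the minimum number of single-character insertions,
// deletions, or substitutions needed to turn string a into string b.
function levenshtein(a, b) {
  // prev[j] is the distance between the previous prefix of a and the first j chars of b.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i]; // turning the first i chars of a into "" takes i deletions
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: order usernames by distance to the search term,
// closest first, ignoring case.
function rankBySimilarity(usernames, term) {
  const t = term.toLowerCase();
  return usernames
    .map((u) => ({ username: u, distance: levenshtein(t, u.toLowerCase()) }))
    .sort((x, y) => x.distance - y.distance)
    .map((r) => r.username);
}

// Usernames from the question, in insertion order.
const ranked = rankBySimilarity(
  ["joanarocha", "cristianarodrigues", "anarocha"],
  "anaro"
);
console.log(ranked); // [ 'anarocha', 'joanarocha', 'cristianarodrigues' ]
```

For the sample data the distances are 3, 5, and 13, which gives the order requested in the question. Because the term is a substring of every match, the distance here is simply the difference in length; other search terms may rank differently.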

Community
M. Mennan Kara
    It should be noted that using JavaScript based querying, especially something this extensive, will be slow enough to disqualify this approach for any sort of mid to high concurrency production environment. – Remon van Vliet Aug 17 '12 at 12:58
0

One solution is to use the $indexOfCP aggregation operator:

Searches a string for an occurrence of a substring and returns the UTF-8 code point index (zero-based) of the first occurrence. If the substring is not found, returns -1.

db.testText1.aggregate([
  {
    $match: {
      username: { $regex: "anaro", $options: "i" }
    }
  },
  {
    $addFields: {
      relevance: {
        $indexOfCP: [ { $toLower: "$username" }, "anaro" ]  // position of the search term; it must be lowercase to match $toLower
      }
    }
  },
  {
    $sort: { relevance: 1 }  // sort by relevance
  },
  {
    $project: { relevance: 0 }  // remove relevance from results
  }
])
  • The $match stage uses a case-insensitive regular expression to match usernames.
  • The $addFields stage adds a new field called relevance to each matched document, which stores the starting position of "anaro" in the username field. If the username begins with "anaro", this value is 0, which is the best match.
  • The $sort stage sorts the matched documents by the relevance field in ascending order, so that documents where the search term appears earliest are ranked first.
  • The $project stage removes the relevance field from the results because it is only a temporary field and should not appear in the final output.
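
To see which relevance values the pipeline would assign to the usernames from the question, the same ranking can be sketched in plain JavaScript. This is only a client-side approximation: String.prototype.indexOf counts UTF-16 code units whereas $indexOfCP counts code points, which is the same thing for the ASCII usernames used here. The variable rankedByPosition is a hypothetical name.

```javascript
// Sample usernames from the question, in insertion order.
const usernames = ["joanarocha", "cristianarodrigues", "anarocha"];
const term = "anaro"; // lowercase, like the literal passed to $indexOfCP

const rankedByPosition = usernames
  // Mirrors $addFields: position of the term in the lowercased username.
  .map((username) => ({
    username,
    relevance: username.toLowerCase().indexOf(term),
  }))
  // Mirrors $match: keep only usernames that contain the term.
  .filter((doc) => doc.relevance !== -1)
  // Mirrors $sort: { relevance: 1 }.
  .sort((a, b) => a.relevance - b.relevance);

console.log(rankedByPosition);
// [ { username: 'anarocha', relevance: 0 },
//   { username: 'joanarocha', relevance: 2 },
//   { username: 'cristianarodrigues', relevance: 6 } ]
```

Documents with the same relevance value have no defined order relative to each other in the pipeline, so a second sort key (for example the username itself) may be worth adding if ties are likely.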
zangw