
I am trying to implement a weighted partial text search using Mongo and Spring. My Mongo documents are structured like this:

{
    "_id" : ObjectId("5947d610659f8e614887cbc9"),
    "_class" : "co.ecg.alpaca.model.SearchIndexEntry",
    "type" : "GroupAccessDevice",
    "deviceId" : "Bogus_Device",
    "devicesName" : "Bogus Device",
    "properties" : {
        "deviceType" : "Polycom VVX 500",
        "netAddress" : "",
        "macAddress" : "000111222111",
        "serviceProviderId" : "Bogus",
        "availablePorts" : "12",
        "groupId" : "Bogus_Group",
        "version" : ""
    },
    "tags" : [
        {
            "tag" : "Bogus_Device",
            "score" : 10
        },
        {
            "tag" : "Bogus Device",
            "score" : 9
        },
        {
            "tag" : "000111222111",
            "score" : 7
        },
        {
            "tag" : "Bogus_Group",
            "score" : 3
        },
        {
            "tag" : "Bogus",
            "score" : 3
        }
    ],
    "createdBy" : "ALPACA_SYSTEM",
    "createdDate" : ISODate("2017-06-19T13:48:00.473Z"),
    "lastModifiedBy" : "ALPACA_SYSTEM",
    "lastModifiedDate" : ISODate("2017-06-19T13:48:00.473Z"),
    "cluster" : DBRef("broadworks_cluster", ObjectId("5947d60a659f8e614887cb1a")),
    "parent" : DBRef("search_index", ObjectId("5947d610659f8e614887cbb7"))
}

What I want to do is run a partial regex search against tags.tag and then sort the results by tags.score multiplied by the Levenshtein distance between the search string and tags.tag. Is this possible with a single Mongo query, perhaps some kind of aggregation?

Azzabi Haythem
dkelley
  • You won't be able to do Levenshtein distance in the aggregation framework, at least not for the `kitten -> sitten` kind of case, since the permutations are far too complex. The best you could do here is match on `Bogus` as a regex, match again after `$unwind`, and then maybe look at the string length with [`$strLenCP`](https://docs.mongodb.com/manual/reference/operator/aggregation/strLenCP/). But you need MongoDB 3.4 for that operator. – Neil Lunn Jun 20 '17 at 04:01
  • Related: [How to query MongoDB with “like”?](https://stackoverflow.com/q/3305561/2313887). And [Retrieve only the queried element in an object array in MongoDB collection](https://stackoverflow.com/q/3985214/2313887) – Neil Lunn Jun 20 '17 at 04:02

1 Answer


As far as I'm aware, it's not possible to do this in one query. See the accepted answer here for an explanation of why you can't use an external function (i.e. a function to calculate the Levenshtein distance between two strings) within a Mongo query.

The following query, for example, returns every document that has at least one tag matching the regex:

db.getCollection('test').aggregate([{$match: {tags: {$elemMatch: {tag: {$regex: 'Bogus'}}}}}])

You will then need to sort the tags array yourself, in memory.
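
As a rough sketch of that in-memory step, something like the following could work on the Java side once the documents are loaded. The `TagRanker` class, its `Tag` holder, and the `rank` method are hypothetical names made up for illustration, not part of Spring or the Mongo driver. It also assumes the search term is a literal substring rather than a full regex, and that a lower `score * distance` should sort first; flip the comparator if you want the opposite order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: ranks the entries of a document's "tags" array in memory.
final class TagRanker {

    // Hypothetical holder mirroring one element of the "tags" array.
    static final class Tag {
        final String tag;
        final int score;

        Tag(String tag, int score) {
            this.tag = tag;
            this.score = score;
        }
    }

    // Standard dynamic-programming Levenshtein distance, keeping only two rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // The weight described in the question: tag score times edit distance.
    static int weight(Tag t, String query) {
        return t.score * levenshtein(query, t.tag);
    }

    // Keeps only tags containing the query (assumption: literal, case-sensitive
    // match) and sorts them by ascending weight (assumption: lower is better).
    static List<Tag> rank(List<Tag> tags, String query) {
        Pattern pattern = Pattern.compile(Pattern.quote(query));
        List<Tag> matches = new ArrayList<>();
        for (Tag t : tags) {
            if (pattern.matcher(t.tag).find()) {
                matches.add(t);
            }
        }
        matches.sort(Comparator.comparingInt((Tag t) -> weight(t, query)));
        return matches;
    }
}
```

With the tags from the sample document and the query `Bogus`, the `000111222111` tag is filtered out and the remaining four are ordered by their weights (0, 18, 63 and 70 under the assumptions above).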

Andrew Winterbotham