Mongodb self-defined comparison to filter

Question

My data looks like:

{max:3.4.6, min: 1.10.2}
{max:10.9.12, min:6.90.1}

the number 3.5.6 is the stored as string, and 0.10.0 should be greater than 0.9.0. I want to give another number cur, and return all results within cur in between.

find all document where min < cur < max

Can I define a comparison function which can recognize 1.10.2 < 2.1.3 < 3.4.6? And use it to do query?

My understand is @Pao wants to query by a "number" and find out those ranges to include the "number". The "number" is combined by 3 parts, the compare logic should compare the left part, if it is the same, then compare the 2nd part, then 3rd part.... — yellowB, Aug 11 '16 at 11:10

DAXaholic · Accepted Answer · 2016-08-11T07:33:02.857

There is also a solution with the aggregation framework which you could use but it requires a current release of mongoDB as it uses the $split and $strLenBytes operator. I created it on version 3.3.10.

The idea is essentially based on of padding your version parts with leading zeros up to a fixed size (in my example each part can be from 000 to 999 but you can adjust that if needed) and do the same for the document fields min and max so that one can compare the strings.
So for sample data like this

/* 1 */
{
    "_id" : ObjectId("57ac0c264ae6fbd5fb5b6e97"),
    "max" : "3.4.6",
    "min" : "1.10.2"
}

/* 2 */
{
    "_id" : ObjectId("57ac0c264ae6fbd5fb5b6e98"),
    "max" : "10.9.12",
    "min" : "6.90.1"
}

/* 3 */
{
    "_id" : ObjectId("57ac0c264ae6fbd5fb5b6e99"),
    "max" : "3.5.9",
    "min" : "0.5.7"
}

/* 4 */
{
    "_id" : ObjectId("57ac0c264ae6fbd5fb5b6e9a"),
    "max" : "2.0.0",
    "min" : "1.5.0"
}

the pipeline

var cur = "1.11.1";
var curParts = cur.split('.');
var curPadded = ("00" + curParts[0]).slice(-3) + "." +
                ("00" + curParts[1]).slice(-3) + "." +
                ("00" + curParts[2]).slice(-3);

db.getCollection('minmax').aggregate([
    {
        $project: {
            min: 1, max: 1,
            maxTmp: {
                $let: {
                    vars: {
                        maxParts: { $split: ["$max", "."] }
                    },
                    in: {
                       major: { $arrayElemAt: ["$$maxParts", 0] },
                       majorLen: { $strLenBytes: { $arrayElemAt: ["$$maxParts", 0] } },
                       minor: { $arrayElemAt: ["$$maxParts", 1] },
                       minorLen: { $strLenBytes: { $arrayElemAt: ["$$maxParts", 1] } },
                       patch: { $arrayElemAt: ["$$maxParts", 2] },
                       patchLen: { $strLenBytes: { $arrayElemAt: ["$$maxParts", 2] } }
                    }
                }
            },
            minTmp: {
                $let: {
                    vars: {
                        minParts: { $split: ["$min", "."] }
                    },
                    in: {
                       major: { $arrayElemAt: ["$$minParts", 0] },
                       majorLen: { $strLenBytes: { $arrayElemAt: ["$$minParts", 0] } },
                       minor: { $arrayElemAt: ["$$minParts", 1] },
                       minorLen: { $strLenBytes: { $arrayElemAt: ["$$minParts", 1] } },
                       patch: { $arrayElemAt: ["$$minParts", 2] },
                       patchLen: { $strLenBytes: { $arrayElemAt: ["$$minParts", 2] } }
                    }
                }
            }            
        }
    },
    {
        $project: {
            min: 1, max: 1,
            maxMajor: { $substr: [{ $concat: ["_00", "$maxTmp.major"] }, "$maxTmp.majorLen", 3] },
            maxMinor: { $substr: [{ $concat: ["_00", "$maxTmp.minor"] }, "$maxTmp.minorLen", 3] },
            maxPatch: { $substr: [{ $concat: ["_00", "$maxTmp.patch"] }, "$maxTmp.patchLen", 3] },
            minMajor: { $substr: [{ $concat: ["_00", "$minTmp.major"] }, "$minTmp.majorLen", 3] },
            minMinor: { $substr: [{ $concat: ["_00", "$minTmp.minor"] }, "$minTmp.minorLen", 3] },
            minPatch: { $substr: [{ $concat: ["_00", "$minTmp.patch"] }, "$minTmp.patchLen", 3] },            
        }
    },
    {
        $project: {
            min: 1, max: 1,
            maxPadded: { $concat: ["$maxMajor", ".", "$maxMinor", ".", "$maxPatch"] },
            minPadded: { $concat: ["$minMajor", ".", "$minMinor", ".", "$minPatch"] }
        }
    },
    {
        $match: {
            maxPadded: { $gt: curPadded },
            minPadded: { $lt: curPadded }
        }
    },
    {
        $project: {
           min: 1,
           max: 1
        }
    }
])

will produce the output

/* 1 */
{
    "_id" : ObjectId("57ac0c264ae6fbd5fb5b6e97"),
    "max" : "3.4.6",
    "min" : "1.10.2"
}

/* 2 */
{
    "_id" : ObjectId("57ac0c264ae6fbd5fb5b6e99"),
    "max" : "3.5.9",
    "min" : "0.5.7"
}

/* 3 */
{
    "_id" : ObjectId("57ac0c264ae6fbd5fb5b6e9a"),
    "max" : "2.0.0",
    "min" : "1.5.0"
}

Explanation

The first stage splits the parts of the string by '.' and determines the length of those strings so that one can trim the strings at the next stage to a fixed size.

$project: {
    min: 1, max: 1,
    maxTmp: {
        $let: {
            vars: {
                maxParts: { $split: ["$max", "."] }
            },
            in: {
                major: { $arrayElemAt: ["$$maxParts", 0] },
                majorLen: { $strLenBytes: { $arrayElemAt: ["$$maxParts", 0] } },
                minor: { $arrayElemAt: ["$$maxParts", 1] },
                minorLen: { $strLenBytes: { $arrayElemAt: ["$$maxParts", 1] } },
                patch: { $arrayElemAt: ["$$maxParts", 2] },
                patchLen: { $strLenBytes: { $arrayElemAt: ["$$maxParts", 2] } }
            }
        }
    },
    minTmp: ... // same idea
}

Documents in the pipeline look now like this

/* 1 */
{
    "_id" : ObjectId("57ac0c264ae6fbd5fb5b6e97"),
    "max" : "3.4.6",
    "min" : "1.10.2",
    "maxTmp" : {
        "major" : "3",
        "majorLen" : 1,
        "minor" : "4",
        "minorLen" : 1,
        "patch" : "6",
        "patchLen" : 1
    }
}
... // others

As previously mentioned, the various parts must now be padded and trimmed to a fixed size

$project: {
    min: 1, max: 1,
    maxMajor: { $substr: [{ $concat: ["_00", "$maxTmp.major"] }, "$maxTmp.majorLen", 3] },
    maxMinor: { $substr: [{ $concat: ["_00", "$maxTmp.minor"] }, "$maxTmp.minorLen", 3] },
    maxPatch: { $substr: [{ $concat: ["_00", "$maxTmp.patch"] }, "$maxTmp.patchLen", 3] },
    minMajor: ... // same idea
}

so that the documents now look like this

/* 1 */
{
    "_id" : ObjectId("57ac0c264ae6fbd5fb5b6e97"),
    "max" : "3.4.6",
    "min" : "1.10.2",
    "maxMajor" : "003",
    "maxMinor" : "004",
    "maxPatch" : "006",
    "minMajor" : "001",
    "minMinor" : "010",
    "minPatch" : "002"
}
... // others

Before matching them against your filter one has to concatenate them via

$project: {
    min: 1, max: 1,
    maxPadded: { $concat: ["$maxMajor", ".", "$maxMinor", ".", "$maxPatch"] },
    minPadded: ... // same idea
}

which produces

/* 1 */
{
    "_id" : ObjectId("57ac0c264ae6fbd5fb5b6e97"),
    "max" : "3.4.6",
    "min" : "1.10.2",
    "maxPadded" : "003.004.006",
    "minPadded" : "001.010.002"
}
... // others

The final two stages then do the actual matching and reshape the documents into the original state

$match: {
    maxPadded: { $gt: curPadded },
    minPadded: { $lt: curPadded }
},
$project: {
    min: 1,
    max: 1
}

Last notes: The pipeline could probably be shorter by 'compressing' the $project stages into one but I guess then it would be really difficult to follow along therefore I split the $project stages into several ones.

@Pao.. please accept this answer as my answer is wrong, I cant delete it since you have marked it as accepted.. — Naeem Shaikh, Apr 12 '17 at 06:59

yellowB · Answer 2 · 2016-08-11T11:12:02.450

You can use $where query operator in query statement like this, It accepts a function:

db.test.find({
    $where: function(){
        var cur = '7.1.3';
        return compare(this.max, cur) > 0 &&compare(this.min, cur) < 0 ;   

        function compare(base, cur) {
            var baseArr = base.split('.');
            var curArr = cur.split('.');

            //Compare from head to tail
            for(var i in baseArr) {
                if(i <= baseArr.length - 1) {
                    if(parseInt(baseArr[i]) < parseInt(cur[i])) return -1;
                    else if(parseInt(baseArr[i]) > parseInt(cur[i])) return 1;
                    else {
                        continue;
                    }
                }
                else {
                    return 0;
                }
            }
        }
    }

});

But this way can not use indexes, pls pay attention on the performance

Mongodb self-defined comparison to filter

2 Answers2