3

note: I'm using Mongodb 4 and I must use aggregation, because this is a step of a bigger aggregation

Problem

How to find in a collection documents that contains fields that starts with value from another field in same document ?

Let's start with this collection:

db.regextest.insert([
{"first":"Pizza", "second" : "Pizza"},
{"first":"Pizza", "second" : "not pizza"},
{"first":"Pizza", "second" : "not pizza"}
])

and an example query for exact match:

db.regextest.aggregate([
{
    $match :  { $expr: { $eq: [ "$first" ,"$second" ] }   }    }
])

I will get a single document

{
    "_id" : ObjectId("5c49d44329ea754dc48b5ace"),
    "first" : "Pizza",    "second" : "Pizza"
}

And this is good.

But how to do the same, but with startsWith ? My plan was to use regex but seems that is not supported in aggregation so far.

With a find and a custom javascript function works fine:

db.regextest.find().forEach(
    function(obj){
        if (obj.first.startsWith(obj.second)){
            print(obj);
        }
    }
)

And returns correctly:

{
    "_id" : ObjectId("5c49d44329ea754dc48b5ace"),
    "first" : "Pizza",
    "second" : "Pizza"
}

How it's possible to get same result with aggregation framework ?

One idea is to use existing aggregation framework pipeline, out to a temp colletion and then run the find above, to get match I'm looking for. This seems to be a workaround, I hope someone have a better idea.

Edit: here the solution

db.regextest.aggregate([{
        $project : {
            "first" : 1,
            "second" : 1,
            fieldExists : {
                $indexOfBytes : ['$first', '$second' , 0]
            }
        }
    }, {
        $match : {
            fieldExists : {
                $gt : -1
            }
        }
    }
]);
B--rian
  • 5,578
  • 10
  • 38
  • 89
Vokail
  • 630
  • 8
  • 27
  • 2
    You can use [`$indexOfCP`](https://stackoverflow.com/questions/54249830/how-can-i-use-a-field-from-aggregate-in-a-regex-match-in-mongodb/54249973#54249973) – Ashh Jan 25 '19 at 12:32
  • thanks for suggestions! I've found my answer here: https://stackoverflow.com/questions/3305561/how-to-query-mongodb-with-like?rq=1#comment86038455_48083774 – Vokail Jan 28 '19 at 12:47
  • 1
    @AsyaKamsky: Could you be so kind and explain your solution a bit, please? Why is it better? – B--rian Jul 24 '19 at 08:56
  • 2
    @B--rian sure - it uses aggregation expression (via $expr) referring to as "f1" and "f2" field names (called in the question "first" and "second") to check for equality between two strings: "$f2" and substring of f1 from position 0 which is size of f2 string). So if I have f1: "pizza is good" and f2: "pizza" then substring of of f1 from position 0 that's 5 characters is 'pizza' and it's a match. Come to think of it, I don't think this is a duplicate of marked question, I'll see about getting it unduped and adding the answer. – Asya Kamsky Jul 28 '19 at 15:51
  • 2
    as far as why it's better - the request was for a $match and it does a $match in a single stage without adding any superfluous fields to the document. Not to mention the fact that $project loses the rest of the fields except _id. – Asya Kamsky Jul 28 '19 at 15:55

1 Answers1

1

The simplest way is to use $expr, first available in 3.6 like this:

{
  $match: {
    $expr: {
      $eq: [
        '$second',
        {
          $substr: ['$first', 0, { $strLenCP: '$second' }]
        }
      ]
    }
  }
}

This compares the string in field second with the first N characters of first where N is the length of second string. If they are equal, then first starts with second.

4.2 adds support for $regex in aggregation expressions, but starts with is much simpler and doesn't need regular expressions.

bflemi3
  • 6,698
  • 20
  • 88
  • 155
Asya Kamsky
  • 41,784
  • 5
  • 109
  • 133