0

I am trying to build a query to find phoneNumber in MongoDB. for example I have those number in my database:

  • +33671974813

  • +33.6.71.97.48.13

  • +33 6 71 97 48 13

and I want to find them all in a query like

mobilePhone: { $regex: "+3367197", $options: "i" }

I already try that but it doesn't work:

mobilePhone: { $regex: "+3367197", $options: "i", $in: [/^\+?[\d\. ]+/] }

Do you know how to do ?

Thanks for help ;)

2 Answers2

1

I think you should clean the database from bad formatted phones. Also dont allow bad formatted phones to enter the database, clean them before insert.

The bellow query works in bad formated phones.
The first step is to clean the phone (keep only digits and the + symbol if in front). And then does the match you want.

You can use the first part that cleans the phone, to clear all the database, with a pipeline update, like updateMany({},[{"$set" ...}])

Also when you try to match you should escape the regex special characters, for example not {regex : "+33.."} but {regex : "\\+33"} like the example bellow.

For a function that does those automatically see this answer also

Query

Test code here

db.collection.aggregate([
  {
    "$set": {
      "phone": {
        "$reduce": {
          "input": {
            "$regexFindAll": {
              "input": "$phone",
              "regex": "^\\+|\\d+"
            }
          },
          "initialValue": "",
          "in": {
            "$concat": [
              "$$value",
              "$$this.match"
            ]
          }
        }
      }
    }
  },
  {
    "$match": {
      "$expr": {
        "$regexMatch": {
          "input": "$phone",
          "regex": "\\+3367197"
        }
      }
    }
  }
])
Takis
  • 8,314
  • 2
  • 14
  • 25
0

The .aggregate() solution is great, and will work! But afaik it will add a performance cost to your back-end query with MongoDB.

Another approach my colleague hipped me to is to use a "magic regex delimiter", in between each digit in the search value.

The delimiter: \D*

  • \D means "matches any single character that is not a digit (same as [^0-9] )."
  • * means "zero or more matches"

So, if the user is searching for 123567 and the db has 123-567 then you end up looking for the regex:

  • 1\D*2\D*3\D*5\D*6\D*7\D*

That way, any other characters that appear in between digits in the DB (parentheses, dashes, etc.) are ignored with the modified regex. And no additional aggregation needs to be done.

The actual code looked like this: new RegExp(search.replace(/\D/g, '').split('').join('\\D*'), 'i')

  • you can see that we first replace/remove all non-digits from the search value...

Hope that helps!

Tyler2P
  • 2,324
  • 26
  • 22
  • 31