
I have a Regex which extracts German mobile phone numbers from a website:

[^\d]((\+49|0049|0)1[567]\d{1,2}([ \-/]*\d){7})(?!\d)

As you can see in the demo it works quite well. The only pattern which doesn't match yet is:

+49 915175461907

Please see more examples in the linked demo. The problem is the whitespace after +49.

How do I need to change the current regex so that it also matches this kind of pattern?

martineau
PParker
  • A single optional space can be matched with a space followed by a question mark. – Michael Butscher Oct 18 '21 at 13:55
  • Instead of the `[^\d]` you have at the start, use a negative look-behind: `(?<!\d)`. This way it will also match directly at the beginning of the string. Your current regex fails for this case. – Tomalak Oct 18 '21 at 14:04
  • Not a correct dupe, because the problem is not just about allowing spaces. The negated character class `[^\d]` will cause it to not match if the input starts with `+4915207829969`. – anubhava Oct 26 '21 at 07:00

3 Answers


A better regex would be:

(?<!\d)(?:\+49|0049|0) *[19][1567]\d{1,2}(?:[ /-]*\d){7,8}(?!\d)

Updated RegEx Demo

Changes:

  • (?<!\d): Make sure previous character is not a digit
  • [19][1567]: Match 1 or 9, followed by one of the digits 1, 5, 6 or 7
  • {7,8}: Match 7 or 8 repetitions of given construct
  • Better to keep an unescaped hyphen at first or last position in a character class
  • Avoid capturing text that you don't need by using non-capture group
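
To see the pattern in action, here is a minimal Python sketch. The choice of Python and the sample text are assumptions made for illustration; only `+4915207829969` and `+49 915175461907` come from this thread, the third number is made up.

```python
import re

# Pattern copied verbatim from the answer above.
PATTERN = re.compile(
    r"(?<!\d)(?:\+49|0049|0) *[19][1567]\d{1,2}(?:[ /-]*\d){7,8}(?!\d)"
)

# Hypothetical input. The first number sits at the very start of the string,
# which the look-behind (?<!\d) allows and the original [^\d] did not.
text = "+4915207829969, +49 915175461907 and 0151 234 567 89"

# All groups are non-capturing, so findall() returns the full matches.
matches = PATTERN.findall(text)
print(matches)
# ['+4915207829969', '+49 915175461907', '0151 234 567 89']

# A number with too few digits is rejected.
print(PATTERN.search("0171 123"))  # None
```

Because the pattern uses only non-capturing groups, `findall()` yields whole numbers rather than tuples of group contents.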
anubhava

The no-brainer method: remove the spaces before applying the regex.

Otherwise, whitespace is matched in a regex with \s, so (possibly with more parentheses than needed):

[^\d](((\+49|0049|0)([\s]{0,1})1)[567]\d{1,2}([ \-/]*\d){7})(?!\d)
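
A rough Python sketch of what the optional `\s` changes, comparing the question's pattern with the one above. Python, the variable names and the sample string are assumptions for illustration only.

```python
import re

# Pattern from the question.
ORIGINAL = r"[^\d]((\+49|0049|0)1[567]\d{1,2}([ \-/]*\d){7})(?!\d)"
# Pattern from this answer: one optional whitespace character after the prefix.
WITH_SPACE = r"[^\d](((\+49|0049|0)([\s]{0,1})1)[567]\d{1,2}([ \-/]*\d){7})(?!\d)"

# Hypothetical input with a space after the country code.
sample = "Tel: +49 15207829969"

print(re.search(ORIGINAL, sample))             # None: the space after +49 blocks the match
print(re.search(WITH_SPACE, sample).group(1))  # +49 15207829969

# The leading [^\d] still has to consume a character, so a number at the
# very start of the string is not found (as noted in the comments above).
print(re.search(WITH_SPACE, "+49 15207829969"))  # None
```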
Mayot
  • Thanks a lot. Works great except for numbers that start with a ``9`` after the country code, like this one: ``+49 915175461907`` – PParker Oct 18 '21 at 14:04

Add an optional whitespace character:

[^\d]((\+49|0049|0)\s?(1|9)[1567]\d{1,2}([ \-/]*\d){7,8})(?!\d)

Update: matching at the beginning of the input

If you also want to match numbers that are not preceded by any character, for example at the very beginning of the input, you can use this. It accepts either the start of the string or any non-digit character before the phone number:

 (^|[^\d])((\+49|0049|0)\s?(1|9)[1567]\d{1,2}([ \-/]*\d){7,8})(?!\d)
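
Since this pattern consumes the preceding character in group 1, the phone number itself ends up in group 2. A minimal Python sketch of extracting it follows. Python, the helper name `extract_numbers` and the sample strings are assumptions for illustration.

```python
import re

# Pattern copied from above (without the leading space).
PATTERN = re.compile(
    r"(^|[^\d])((\+49|0049|0)\s?(1|9)[1567]\d{1,2}([ \-/]*\d){7,8})(?!\d)"
)

def extract_numbers(text):
    """Return the phone numbers (capture group 2) found in text.

    Hypothetical helper: group 1 holds the start-of-string position or the
    non-digit prefix character, so it is skipped.
    """
    return [m.group(2) for m in PATTERN.finditer(text)]

print(extract_numbers("+4915207829969"))
# ['+4915207829969']
print(extract_numbers("call +49 915175461907 now"))
# ['+49 915175461907']
```

Using `findall()` here would return tuples containing every capture group, which is why the sketch picks group 2 explicitly.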

test it here

KZiovas