
I can't figure out what's wrong with my regex. I'm trying to match the following patterns in my string using a regex:

1231231234
123.123.1234
123-123-1234
123 123 1234

Here is my regex pattern:

\b((\d{3}[ .-]?){2}(\d{4}))\b

It produces incorrect matches; the output I get with this regex is shown below. My aim is to not match the highlighted ones.

image1

Here is what I have tried so far, which did not work for me either: image2

Does anyone have an idea how I can solve this?

kirtan

2 Answers

2

If all you need is to prevent the matched text from being preceded by a dot, you can achieve that by using a negative lookbehind, i.e. (?<!\.), instead of the negated character class. You may also add a negative lookahead at the end to prevent the match from being followed by a dot:

\b(?<!\.)((\d{3}[ .-]?){2}(\d{4}))(?!\.)\b

Demo.

Note, however, that this (as well as your original pattern) will match numbers separated by different characters (e.g., "123.123-1234"). If you'd like to prevent that, you may use something like:

\b(?<!\.)\d{3}([ .-]?)\d{3}\1\d{4}(?!\.)\b

...and add further capturing groups as you see fit.

Demo.
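
To make the difference between the two patterns above concrete, here is a minimal C# sketch. The variable names (loose, strict) and the sample inputs are assumptions made for illustration; they are not taken from the question's screenshots.

```csharp
using System;
using System.Text.RegularExpressions;

// Pattern from above that allows a different separator in each position.
var loose = new Regex(@"\b(?<!\.)((\d{3}[ .-]?){2}(\d{4}))(?!\.)\b");
// Pattern from above that reuses the first separator through the \1 backreference.
var strict = new Regex(@"\b(?<!\.)\d{3}([ .-]?)\d{3}\1\d{4}(?!\.)\b");

// Hypothetical sample inputs; the last one mixes a dot and a hyphen.
string[] samples =
{
    "1231231234",
    "123.123.1234",
    "123-123-1234",
    "123 123 1234",
    "123.123-1234",
};

foreach (var s in samples)
{
    Console.WriteLine($"{s,-14} loose={loose.IsMatch(s)} strict={strict.IsMatch(s)}");
}
```

With these inputs, only the last sample should be treated differently: the first pattern accepts "123.123-1234", while the backreference version rejects it because the second separator must repeat the first.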

1

You can use

(?<!\d\.?)\d{3}([ .-]?)\d{3}\1\d{4}(?!\.?\d)

See the .NET regex demo. If you need to make it work at regex101, choose the JavaScript flavor, which now supports variable-width lookbehinds, or simply replace the first lookbehind with (?<!\d)(?<!\d\.):

(?<!\d)(?<!\d\.)\d{3}([ .-]?)\d{3}\1\d{4}(?!\.?\d)

C# snippet:

using System.Linq;
using System.Text.RegularExpressions;

// text is the input string to search
var results = Regex.Matches(text, @"(?<!\d\.?)\d{3}([ .-]?)\d{3}\1\d{4}(?!\.?\d)", RegexOptions.ECMAScript)
        .Cast<Match>()
        .Select(x => x.Value)
        .ToList();

Since \d in .NET matches any Unicode decimal digit by default, you might consider using the RegexOptions.ECMAScript option, which restricts \d to the ASCII digits 0-9.
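
The effect of that option on \d can be checked with a small sketch. The Arabic-Indic digits below are only an illustrative choice of non-ASCII digits, and the variable names are assumptions.

```csharp
using System;
using System.Text.RegularExpressions;

// Three Arabic-Indic digits (U+0661, U+0662, U+0663), chosen only as an illustration.
string arabicIndic = "\u0661\u0662\u0663";

// Default options: \d matches any Unicode decimal digit.
bool defaultMatch = Regex.IsMatch(arabicIndic, @"^\d{3}$");
// ECMAScript option: \d is limited to the ASCII digits 0-9.
bool ecmaMatch = Regex.IsMatch(arabicIndic, @"^\d{3}$", RegexOptions.ECMAScript);
// ASCII digits still match with the ECMAScript option.
bool asciiMatch = Regex.IsMatch("123", @"^\d{3}$", RegexOptions.ECMAScript);

Console.WriteLine($"default: {defaultMatch}, ECMAScript: {ecmaMatch}, ASCII input: {asciiMatch}");
```

If you cannot use the ECMAScript option for some reason, writing [0-9] instead of \d is another way to restrict the match to ASCII digits.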

Pattern details

  • (?<!\d\.?) - a negative lookbehind that matches a location not immediately preceded by a digit, or by a digit and a dot
  • \d{3} - three digits
  • ([ .-]?) - Group 1: an optional space, dot or hyphen
  • \d{3} - three digits
  • \1 - the same value as captured in Group 1
  • \d{4} - four digits
  • (?!\.?\d) - a negative lookahead that matches a location not immediately followed by a digit, or by a dot and a digit.
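
The pattern details above can be exercised with a short sketch. The findNumbers helper and the sample inputs are hypothetical, and default regex options are used here only to keep the example minimal.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Same pattern as in the answer; default options are an assumption of this sketch.
var phone = new Regex(@"(?<!\d\.?)\d{3}([ .-]?)\d{3}\1\d{4}(?!\.?\d)");

// Hypothetical helper: returns every matched substring in the input.
Func<string, List<string>> findNumbers = input =>
    phone.Matches(input).Cast<Match>().Select(m => m.Value).ToList();

// Hypothetical inputs, not the ones from the question's screenshots.
string[] samples =
{
    "call 123-123-1234 now",   // consistent separators
    "1.123.123.1234",          // preceded by a digit and a dot
    "123.123.1234.5",          // followed by a dot and a digit
    "123.123-1234",            // mixed separators
    "12345678901",             // part of a longer digit run
};

foreach (var s in samples)
{
    var found = findNumbers(s);
    var summary = found.Count > 0 ? string.Join(", ", found) : "no match";
    Console.WriteLine($"{s} -> {summary}");
}
```

Only the first sample should produce a match here; the others are rejected by the lookbehind, the lookahead, or the \1 backreference, in the order shown in the comments.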
Wiktor Stribiżew