I want to validate phone numbers against the following criteria:

- Minimum of 6 digits.

- Can only contain the following symbols: "+", "(", ")", "-".

- Contains no more than n consecutive symbols, but any number of consecutive digits is OK.

Here are some examples of what I consider valid:

07519767576
+447519767576
(02380) 346450
(+44) 7519767576

I have been trying to do this myself for quite a while but keep hitting a brick wall. Here is what I have tried so far:

^(?=.{9,}$)(?=[^0-9]*[0-9])(?:([\d\s\+\(\)\-])\1?(?!\1{5}))+?$

This kind of works, but it's a bit of a hack because it also limits the number of consecutive digits.

I am not able to do this check in PHP; it has to be done in JS, sadly. Is this even possible without needing a degree in regex?

Nexidian
  • Something like [this community regex](http://regexr.com/3dnp0)? – Emil S. Jørgensen Sep 09 '16 at 11:15
  • Can you add examples of numbers you'd like to match? It's hard to write regular expressions without actual test cases. – ffledgling Sep 09 '16 at 11:23
  • This is plain unclear. The criteria do not agree with the current pattern. Right now, even `233+345)2(` is considered valid. – Wiktor Stribiżew Sep 09 '16 at 11:31
  • @WiktorStribiżew Yes, I'm aware of that. I'm not good at writing regex; it was the best I could come up with with my knowledge :) – Nexidian Sep 09 '16 at 11:39
  • @ffledgling Sorry! I've added examples – Nexidian Sep 09 '16 at 11:39
  • One example contains a space, but it isn't one of the accepted symbols? – Huntro Sep 09 '16 at 11:44
  • Try [`^\+?(?:0|\(\+?\d+\))? ?\d+$`](https://regex101.com/r/yO3tH9/1) – Wiktor Stribiżew Sep 09 '16 at 11:45
  • Is the requirement that the total number of digits in the number has to be at least 6? And they can be clubbed/grouped using `()` at any time? I.e., is `12(34)56` a valid number? – ffledgling Sep 09 '16 at 11:54
  • @WiktorStribiżew I just ran my unit test and it failed on a lot of numbers: "01522 5956593", "020 82221941", etc. – Nexidian Sep 09 '16 at 11:55
  • @ffledgling Yes, 6 is the minimum, and yes, grouped at any time. I have a sample of 1 million numbers, so as generic as possible really. I know it's a big ask; I've been trying this for a few hours now – Nexidian Sep 09 '16 at 11:56
  • @Nexidian You're trying to basically count characters across groups (or multiple regular expressions if you're using `|`) in this task then. This is not something regular languages let you do. PCRE *might* have some cryptic regular expression that lets you do this, but I do feel you're better off writing a small JavaScript function to parse this by hand. It'll be much easier, will take less time and will likely be more maintainable in case a bug pops up later. – ffledgling Sep 09 '16 at 12:01
  • *just ran my unit test* - **Post everything that is related to your question in the question body**. – Wiktor Stribiżew Sep 09 '16 at 12:02
  • @WiktorStribiżew Apologies, I didn't think that the means of testing would change how the regex would need to work. Why would it matter if I was testing by hand or by batch? – Nexidian Sep 09 '16 at 12:05
  • @ffledgling Thank you, that is what I was afraid of. Sadly I can't write a small JS function to handle this without drastically changing how the platform works. I think I'm going to have to settle for a broader approach instead of trying to craft a catch-all – Nexidian Sep 09 '16 at 12:06
  • @Nexidian can you not write a small js function called `parsenumber(num_string)` that you can use instead of `match()` ? Do you have to pass a regular expression to something that does the parsing on your behalf? – ffledgling Sep 09 '16 at 12:12
  • @ffledgling Sadly not. The parsing is done in a Zend form, which accepts a regex string to test the element against. I can't change that without changing the MVC, and I won't have permission or time to do that – Nexidian Sep 09 '16 at 12:18
  • Ok, here is a very generic regex: [`^(?=(?:\D*\d){6})(?!(?:[^+() ][+() ]){2})[\d+() ]+$`](https://regex101.com/r/mV1vB9/1) - 1) min 6 digits `(?=(?:\D*\d){6})`, 2) `+`, `(`, `)` or space cannot appear 2 in a row due to `(?!.*([+() ])\1)`, 3) string can contain one or more digits, `(`, `)`, `+` or spaces due to `[\d+() ]+`. – Wiktor Stribiżew Sep 09 '16 at 12:41
  • What does this mean? "Contain no more than *n* consecutive symbols, but numbers are OK." Where *n* is defined as what? – gfullam Sep 09 '16 at 12:57

1 Answer

At least one of your requirements is beyond what traditional regular languages can do. As pointed out in the comments, counting the number of digits across patterns, groups or regular expressions is not possible in traditional regular languages, which are essentially matched using Deterministic Finite Automata (also known as DFAs).

PCRE-compatible regular expressions, which is what most languages (JavaScript and Python, for example) support, add functionality such as backtracking, lookahead matching, grouping, counting for a single group, and so on.

These enhance the set of patterns PCRE regular expressions can match, or more technically the set of languages the expression will accept. But to the best of my knowledge, none of these extensions let one do counting in the way you want to here, at least directly.
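One indirect route, along the lines of the lookahead pattern suggested in the comments, is to assert the minimum digit count in one lookahead and reject long symbol runs in a negative lookahead. The following is only a minimal sketch: it assumes n = 2 (the question never defines n), treats a space as a symbol because the question's examples contain spaces, and uses `PHONE_RE` as a hypothetical name. It does not check that brackets are balanced, so a string like `233+345)2(` still passes.

```javascript
// Sketch only: n = 2 and "space counts as a symbol" are assumptions,
// not requirements stated in the question.
const PHONE_RE = new RegExp(
  "^" +
  "(?=(?:\\D*\\d){6})" +   // at least 6 digits anywhere in the string
  "(?!.*[+()\\- ]{3})" +   // no run of 3 or more symbols (i.e. n = 2)
  "[\\d+()\\- ]+" +        // only digits and the allowed symbols
  "$"
);

// Sample inputs taken from the question and its comments.
const samples = [
  "07519767576",
  "+447519767576",
  "(02380) 346450",
  "(+44) 7519767576",
  "01522 5956593",
  "12345",      // too few digits
  "123---456",  // three symbols in a row
];

for (const sample of samples) {
  console.log(sample, PHONE_RE.test(sample));
}
```

Because the whole check is one expression, the pattern can be supplied as a single string wherever only a regex is accepted.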

It turns out that matching PCRE-style regular expressions (with backreferences) is NP-complete in theory, but that doesn't mean it's easy or even feasible to write a regular expression for a given problem.

In most cases one would write a small hand-rolled parser in a Turing-complete programming language, which can do what you need fairly easily.
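As a rough illustration of such a hand-rolled check in JavaScript, here is a minimal sketch. The function name `isValidPhoneNumber`, the default of 2 for the maximum symbol run, and the decision to treat a space as an allowed symbol are all assumptions made for the example, since the question leaves n undefined and lists no space among the symbols.

```javascript
// Hypothetical helper: the name, the default maxSymbolRun, and the
// inclusion of the space character are assumptions for this sketch.
function isValidPhoneNumber(input, maxSymbolRun = 2) {
  const SYMBOLS = "+()- ";
  let digitCount = 0; // total digits seen so far
  let symbolRun = 0;  // length of the current run of consecutive symbols

  for (const ch of input) {
    if (ch >= "0" && ch <= "9") {
      digitCount += 1;
      symbolRun = 0; // a digit ends the current run of symbols
    } else if (SYMBOLS.includes(ch)) {
      symbolRun += 1;
      if (symbolRun > maxSymbolRun) {
        return false; // too many symbols in a row
      }
    } else {
      return false; // any other character is rejected
    }
  }

  return digitCount >= 6; // minimum of 6 digits overall
}

console.log(isValidPhoneNumber("(+44) 7519767576"));
console.log(isValidPhoneNumber("12-34"));
```

Keeping the digit count and the symbol run as two separate counters makes each rule easy to adjust on its own, which is the maintainability argument made in the comments.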

OP mentioned that doing this is not an option, and thus the problem has come to a standstill.

ffledgling