1

I tried the solution from the following link for phone numbers with 7-12 digits that may contain spaces or hyphens. The first and last characters have to be digits.

Regular expression to match 7-12 digits; may contain space or hyphen

However, I don't understand the regex well.

$phone_pattern="/^\d(?:[-\s]?\d){6,11}$/";

What does the ":" mean here?

How is this regex able to exclude the hyphens and spaces from the 6 to 11 characters?

Any help is highly appreciated.

vaanipala
  • What exactly do you need help with? You have a solution; do you need an explanation of it? Or do you want to match something different? – Mirko Adari Oct 03 '12 at 08:12

3 Answers

4

The : is part of the (?: ... ) - which means "non-capturing group" - it groups content but does not create a backreference to it (i.e. $1, $2, etc) like normal grouping does.
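As a minimal sketch of the difference, the two patterns below (made up for illustration, not taken from the question) match the same strings, but only the capturing version stores the separator in the matches array:

```php
<?php
$subject = '12-34';

// Capturing group: the text matched by ( ... ) is stored as $captured[1].
preg_match('/^\d\d([-\s])\d\d$/', $subject, $captured);

// Non-capturing group: same grouping, but nothing is stored for it.
preg_match('/^\d\d(?:[-\s])\d\d$/', $subject, $plain);

var_dump($captured); // two entries: the whole match and the separator
var_dump($plain);    // one entry: the whole match only
```

Both calls succeed on the same input; the only difference is what ends up in the third argument.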

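In that regex the group (?:[-\s]?\d) is repeated 6 to 11 times, and each repetition contains exactly one digit, optionally preceded by a single hyphen or whitespace character. Together with the leading \d that gives 7 to 12 digits. The hyphens and spaces are not counted, and at most one of them may sit between two digits, so something like 12-------34 would not match. If you only care about the digits themselves, a simpler pattern is: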

/^\d{7,12}$/

This will only match the digits. To allow for hyphens and spaces with this pattern, but only check the number itself, you can strip them before matching:

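<?php
$pattern = '/^\d{7,12}$/';
$string = '123-456 789';
$ignoreCharacters = array(' ', '-');

// Strip spaces and hyphens first, then validate what is left as 7 to 12 digits.
$digitsOnly = str_replace($ignoreCharacters, '', $string);
$isValid = preg_match($pattern, $digitsOnly) === 1;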
LeonardChallis
  • I have no idea what a non-capturing group is. I guess I have to read up a lot. Thanks for the great explanation! – vaanipala Oct 03 '12 at 10:56
  • You use brackets to capture matches and place them into backreferences - $1, $2, etc. If you want to use brackets without creating backreferences you use (?: ) instead - it groups the same, just doesn't put the reference into $1, $2, $3 etc :) – LeonardChallis Oct 03 '12 at 14:10
  • Can you tell me what is the purpose of backreference and non-capturing group? This seems to be advanced regex for me. I'm clueless. I only understand simple regex. Please help. Thank you. – vaanipala Oct 04 '12 at 04:53
  • I found this great link http://www.regular-expressions.info/brackets.html that I'm reading now. I will let you know if I have any doubts. – vaanipala Oct 04 '12 at 04:59
  • If you want to apply repetition to a group of rules - i.e. to optionally (zero or one times) match a string "Hello" with one or more digits after it, you could first write the regex to match the string: `Hello\d+`, then wrap it in a group so you can add the zero or one repetition `?` character: `(Hello\d+)?`. If you didn't have the brackets, the `?` would only apply to the `\d+` before it, *not* the whole thing, so you use the brackets to group the content. Grouping with `( .. )` creates a backreference (in our case, $1), but if we didn't want to create that $1 backreference we can use `(?: )` – LeonardChallis Oct 04 '12 at 07:16
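A small sketch of that grouping effect, using the "Hello" example from the comment above (the anchors and test strings are added here for illustration). Note that a `?` placed directly after `\d+` does not make anything optional; it only makes the `+` lazy, so "Hello" is still required:

```php
<?php
$grouped   = '/^(?:Hello\d+)?$/'; // ? applies to the whole group
$ungrouped = '/^Hello\d+?$/';     // ? attaches to \d+ only (and makes it lazy)

$results = [];
foreach (['', 'Hello42'] as $subject) {
    $results[$subject] = [
        'grouped'   => preg_match($grouped, $subject) === 1,
        'ungrouped' => preg_match($ungrouped, $subject) === 1,
    ];
}

// The empty string is accepted only by the grouped pattern;
// 'Hello42' is accepted by both.
var_dump($results);
```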
3

It's understandable how that can be confusing. The (?: ... ) actually denotes a "non-capturing group," as opposed to the ( ... ), which is a "capturing group". If you're only testing strings against regexes, not capturing substrings, then the two are effectively the same for your purposes.

It doesn't help that there also exist (?= ... ), (?! ... ), (?<= ... ), (?<! ... ), and (?<foo> ... ), which all mean different things, too.

A lot to learn, but rewarding for sure!
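For a rough feel of a few of those constructs, here is a hedged sketch using PHP's preg_match. The group name "exchange" and the sample string are hypothetical, chosen only for illustration:

```php
<?php
$subject = '555-1234';

// Named group: the capture is available as $m['exchange'] (and also as $m[1]).
preg_match('/^(?<exchange>\d{3})-\d{4}$/', $subject, $m);

// Positive lookahead: (?=-) requires a hyphen after the three digits
// without making it part of the match.
preg_match('/^\d{3}(?=-)/', $subject, $ahead);

// Negative lookahead: (?!-) rejects the match here, because the three
// digits are followed by a hyphen.
$negative = preg_match('/^\d{3}(?!-)/', $subject);

var_dump($m['exchange'], $ahead[0], $negative);
```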

Andrew Cheong
2

Have you tried a testing engine like regexpal (there are others available too)? I frequently use it to test various strings against expressions to make sure they behave as expected.

My understanding is that in this case the : is not acting alone; it is acting in conjunction with the ?

The ? in this circumstance does not mean "preceding item zero or one times". Directly after an opening bracket it is a modifier that gives the group a special meaning, and the : that follows it means "turn off capture", so together they make this a non-capturing group.

The effect is that placing part of an expression inside () causes capture by default, but ?: switches this off.

Thus (?:[-\s]?\d) becomes a non capturing group expression.

Note that captured groups are used with backreferences, and some regex tools only support 9 numbered backreferences (\1 to \9), although PCRE, which PHP's preg_ functions use, allows more.

So removing capture avoids the work of storing the matched text and lets you elect not to refer back to that group, keeping the group numbers for the groups that you really do want to refer back to.

codepuppy
  • OK, I got it. Does that mean spaces and hyphens are not going to be included when verifying the regular expression? Only numbers will be verified? – vaanipala Oct 03 '12 at 11:02
  • @vaanipala No, the group is still applied in the matching process, but you are unable to backreference that group. – codepuppy Oct 03 '12 at 12:55
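To see how the group takes part in matching, a minimal sketch can run the question's pattern against a few strings. The sample strings are invented for illustration; each repetition of the group consumes one digit plus at most one separator before it:

```php
<?php
$phone_pattern = '/^\d(?:[-\s]?\d){6,11}$/';

$samples = [
    '1234567',      // 7 digits, no separators
    '123-456 789',  // 9 digits, single separators between digits
    '12-------34',  // only 4 digits, repeated separators
    '123456',       // 6 digits, one too few
    '-1234567',     // starts with a separator
];

$results = [];
foreach ($samples as $sample) {
    $results[] = preg_match($phone_pattern, $sample) === 1;
}

// Only the first two samples are accepted.
var_dump($results);
```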