I am having a small problem with the regex I use to extract phone numbers from a string:

<?php
$output = "here 718-838-3586 there 1052202932 asdas dasdasd 800-308-4653 dasdasdasd 866-641-6949800-871-0999";
preg_match_all('/\b[0-9]{3}\s*[-]?\s*[0-9]{3}\s*[-]?\s*[0-9]{4}\b/',$output,$matches);
echo '<pre>';
print_r($matches[0]);
?>

Output:

Array
(
    [0] => 718-838-3586
    [1] => 1052202932
    [2] => 800-308-4653
    [3] => 866-641-6949
    [4] => 800-871-0999
)

This works fine, but it also returns 1052202932 as one of the results, which I don't want.
I can't figure out what is missing from my pattern.

Dr.Neo

2 Answers

In a regex, ? means {0,1} (zero or one occurrence), but you need exactly one occurrence of '-' at each position in your pattern:

preg_match_all('/\b[0-9]{3}\s*-\s*[0-9]{3}\s*-\s*[0-9]{4}\b/',$output,$matches);

For more info, see http://www.php.net/manual/en/regexp.reference.repetition.php
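
To make the quantifier point concrete, here is a minimal sketch comparing "-?", its long form "-{0,1}", and a plain required "-". The two sample strings and the anchored patterns are illustrative assumptions, not part of the question's code:

```php
<?php
// Hypothetical sample strings, chosen only to illustrate the quantifier.
$samples = ['718-838-3586', '1052202932'];

$patterns = [
    'optional' => '/^\d{3}-?\d{3}-?\d{4}$/',         // "-?": zero or one hyphen
    'explicit' => '/^\d{3}-{0,1}\d{3}-{0,1}\d{4}$/', // same quantifier spelled out
    'required' => '/^\d{3}-\d{3}-\d{4}$/',           // exactly one hyphen
];

$results = [];
foreach ($samples as $sample) {
    foreach ($patterns as $name => $pattern) {
        // preg_match() returns 1 on a match and 0 otherwise.
        $results[$sample][$name] = preg_match($pattern, $sample);
    }
}

print_r($results);
```

Only the 'required' pattern should reject the bare ten-digit string; the other two accept both inputs.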

Mark Basmayor

The ? after each [-] makes the - optional. To require it, just remove the ?. Also, [-] is equivalent to -, so I got rid of the unnecessary character class:

preg_match_all('/\b[0-9]{3}\s*-\s*[0-9]{3}\s*-\s*[0-9]{4}\b/',$output,$matches);

You can also replace all of the [0-9] with \d to shorten it a bit further:

preg_match_all('/\b\d{3}\s*-\s*\d{3}\s*-\s*\d{4}\b/',$output,$matches);
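
As a quick check, the shortened pattern can be run against a sample string. This is only a sketch: the input is adapted from the question, and it assumes the last two numbers are separated by a space, since \b cannot match between two adjacent digits:

```php
<?php
// Sample input adapted from the question. Assumption: a space separates
// the last two numbers so that \b can match between them.
$output = "here 718-838-3586 there 1052202932 asdas 800-308-4653 dasd 866-641-6949 800-871-0999";

// The hyphens are mandatory here, so the bare ten-digit run is skipped.
preg_match_all('/\b\d{3}\s*-\s*\d{3}\s*-\s*\d{4}\b/', $output, $matches);

print_r($matches[0]);
```

With this input, $matches[0] should hold the four hyphenated numbers and omit 1052202932.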
Andrew Clark