
I'm trying to extract phone numbers from messages submitted through a contact form. The form gets over a thousand submissions a day, and many submitters make formatting mistakes, such as:

"Please call me back at925-943-2343 ext. 304" (no space before number)

Currently, my regex (below) misses numbers with certain formatting errors, such as a missing space before the number.

foreach (...)
{
    $regex = '/^(?:1(?:[. -])?)?(?:\((?=\d{3}\)))?([2-9]\d{2})' 
        .'(?:(?<=\(\d{3})\))? ?(?:(?<=\d{3})[.-])?([2-9]\d{2})' 
        .'[. -]?(\d{4})(?: (?i:ext)\.? ?(\d{1,5}))?$/'; 
    if (preg_match($regex, $msg))
    {
        $phonenumber = preg_replace($regex, '($1) $2-$3 ext. $4', $msg); 
        echo $phonenumber;
    }
}

Any tips?

Related issue:

$regex = '/^(?:1(?:[. -])?)?(?:\((?=\d{3}\)))?([2-9]\d{2})' 
    .'(?:(?<=\(\d{3})\))? ?(?:(?<=\d{3})[.-])?([2-9]\d{2})' 
    .'[. -]?(\d{4})(?: (?i:ext)\.? ?(\d{1,5}))?$/'; 
$line = "(732) 912 0159 ";
if (preg_match($regex, $line))
{
    $phonenumber = preg_replace($regex, '($1) $2-$3 ext. $4', $line); 
    echo $phonenumber;
}

Why does this return nothing?

Alexander Cameron
  • Do you have access to modify the contact form? If so, adding a phone number text box would be a lot more robust than trying to find a perfect regex. – ean5533 Aug 18 '11 at 16:52
  • Add a phone number input field on the form that has a format mask like http://digitalbush.com/projects/masked-input-plugin/. It would have to validate the format server side too. I know that's not the answer you're looking for, but it addresses the root cause of the problem (e.g. extracting phone numbers) rather than patching the symptoms. – Jeff Aug 18 '11 at 16:52
  • I agree, usability-wise for you and the user the best solution seems to be to make a telephone # field for the users to input in. – Ben Brocka Aug 18 '11 at 16:58
  • Here is a [related question](http://stackoverflow.com/questions/123559/a-comprehensive-regex-for-phone-number-validation). – Chris Hepner Aug 18 '11 at 17:00
  • Don't use regex to parse the number. Just remove the non-digit characters and split it into chunks of digits. – zzzzBov Aug 18 '11 at 17:07
  • I do not have access to the contact form. Just the back-end. Bureaucracy. – Alexander Cameron Aug 18 '11 at 18:05
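
The suggestion in the comments to strip non-digit characters and split the result into chunks could look roughly like the sketch below. The function name `formatDigitsAsPhone` is hypothetical, and the sketch assumes each message contains exactly one US-style number and no other digits (no dates, order numbers, and so on), which may not hold for real submissions.

```php
<?php
// Hypothetical helper (not from the original thread): strips every non-digit
// character and slices the remaining digits into fixed-width chunks.
// ASSUMPTION: $msg contains exactly one US phone number and no other digits.
function formatDigitsAsPhone(string $msg): ?string
{
    $digits = preg_replace('/\D+/', '', $msg);

    // US area codes never start with 1, so a leading 1 is treated as the country code.
    if ($digits !== '' && $digits[0] === '1') {
        $digits = substr($digits, 1);
    }

    // 10 digits for the number, plus at most 5 for an extension.
    if (strlen($digits) < 10 || strlen($digits) > 15) {
        return null;
    }

    $formatted = sprintf(
        '(%s) %s-%s',
        substr($digits, 0, 3),
        substr($digits, 3, 3),
        substr($digits, 6, 4)
    );

    // Anything after the first 10 digits is treated as the extension.
    $ext = (string) substr($digits, 10);

    return $ext === '' ? $formatted : $formatted . ' ext. ' . $ext;
}

echo formatDigitsAsPhone('Please call me back at925-943-2343 ext. 304'), "\n";
```

This avoids regex tuning altogether, at the cost of misreading any message that contains other digits.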

1 Answer


Try this:

[01]?[- .]?\(?[2-9]\d{2}\)?[- .]?\d{3}[- .]?\d{4}(?:i.{0,3}x?.{0,9})(\d{1,5})
Ryan Gross
  • $line = "Please call me back at925-943-2343 ext. 304"; if (preg_match($regex, $line)) { echo preg_replace($regex, '($1) $2-$3 ext. $4', $line); } outputs nothing, I'm afraid (with your regex). – Alexander Cameron Aug 18 '11 at 18:03
  • Even just modifying it so that it accepts numbers without a preceding space would be extremely beneficial, I think. – Alexander Cameron Aug 18 '11 at 18:12
  • Sorry, didn't realize that I had the `^` at the beginning. Having this character limits you to the beginning of a line when matching. – Ryan Gross Aug 19 '11 at 12:57
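
As the last comment notes, `^` pins a match to the start of the string. The pattern in the question also ends with `$`, so it only matches when the whole string is a phone number. That is why the trailing space in "(732) 912 0159 " makes it fail. A minimal sketch of an unanchored variant follows. The function name `extractPhoneNumbers` and the pattern itself are illustrative assumptions, not the answerer's regex, and the pattern is deliberately loose about separators.

```php
<?php
// Hypothetical helper, not from the original thread: finds US-style phone
// numbers anywhere in $msg and returns them as "(NPA) NXX-XXXX[ ext. N]".
function extractPhoneNumbers(string $msg): array
{
    // No ^ or $ anchors. The digit lookarounds stop the pattern from
    // matching inside a longer run of digits.
    $regex = '/(?<!\d)(?:1[. -]?)?\(?([2-9]\d{2})\)?[. -]?([2-9]\d{2})[. -]?(\d{4})(?!\d)'
           . '(?:\s*ext\.?\s*(\d{1,5}))?/i';

    $found = [];
    if (preg_match_all($regex, $msg, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $number = sprintf('(%s) %s-%s', $m[1], $m[2], $m[3]);
            // Group 4 is absent when no extension was matched.
            if (isset($m[4]) && $m[4] !== '') {
                $number .= ' ext. ' . $m[4];
            }
            $found[] = $number;
        }
    }

    return $found;
}

print_r(extractPhoneNumbers('Please call me back at925-943-2343 ext. 304'));
print_r(extractPhoneNumbers('(732) 912 0159 '));
```

Building the output from the captured groups, rather than calling `preg_replace` on the whole message, also avoids emitting a dangling " ext. " when no extension is present.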