I need help writing a regular expression that will match phone numbers in some data. The numbers appear in the following formats:

XXX-XXX-XXXX
XXX-XXX-XXXX Ext. XXX
XXX-XXX-XXXX Ext. XXXX
(XXX) XXX-XXXX
XXX-XXX-XXXX (Text)
XXX.XXX.XXXX

and one regexp specifically for this line:

XXX-XXX-XXXX (Text & Special Chars.); XXX-XXX-XXXX (Text)

I've searched through several questions and tried the general 0-9 match but to no avail so far. I also tried this more recently to fetch normal phone numbers:

preg_match_all('/^((\(\d{3}\) ?)|(\d{3}-))?\d{3}-\d{4}$/', $a3, $n);

print_r($n);


Can someone show me the right way? Thanks

Tower
  • http://stackoverflow.com/search?q=%5Bregex%5D+phone – Mike B Feb 20 '12 at 21:49
  • I did search; many results applied only to that user's specific case, even with a bit of tweaking. – Tower Feb 20 '12 at 21:51
  • there are a lot more phone number formats than that in the *world* –  Feb 20 '12 at 22:01
  • Phone numbers for what country? Those won't match any number in mine, for example. – Damien Pirsy Feb 20 '12 at 22:03
  • Phone numbers in a page of data I have. Not a telephone directory or anything. I am trying to extract just the telephone numbers and those that have specific text following it. – Tower Feb 20 '12 at 22:07
  • @Dagon i've analyzed the data myself, and recorded all of the formats in my set of data, which is why i presented it like this. – Tower Feb 20 '12 at 22:14
  • possible duplicate of [Regular expression for phone numbers](http://stackoverflow.com/questions/4868328/regular-expression-for-phone-numbers) – Ryan Feb 20 '12 at 22:16
  • Regular expressions like these are simple to write. Just use a tool like this one: http://gskinner.com/RegExr/ – Chris Laplante Feb 20 '12 at 22:43

1 Answer

This pattern will match the examples you provided in any text.

((?:\d{3}[.-]|\(\d{3}\)\s+)\d{3}[.-]\d{4})(?:\s+Ext\.\s+(\d+)|\s+\(.*?\))?

If you can provide some sample data, or explain in more detail what the subject looks like, the pattern can likely be improved. For example, are all the numbers on separate lines (as your pattern would suggest), or are they part of a larger text?
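To show why that question matters, here is a minimal sketch. The sample `$data` and the variable names are made up for illustration and are not taken from your data. It compares an anchored pattern with and without the `m` modifier against an unanchored one. Without `m`, `^` and `$` refer to the whole subject rather than to each line, which is probably why the anchored pattern in the question finds nothing in multi-line input.

```php
<?php
// Hypothetical sample: two numbers on their own lines, one embedded in prose.
$data = "123-456-7890\n(123) 456-7890\ncall 123.456.7890 today";

// Core of the phone number pattern, without delimiters or anchors.
$core = '(?:\d{3}[.-]|\(\d{3}\)\s+)\d{3}[.-]\d{4}';

// Anchored, no m modifier: ^ and $ apply to the whole subject.
$wholeSubject = preg_match_all('/^' . $core . '$/', $data, $m1);

// Anchored, with m modifier: ^ and $ apply to every line.
$perLine = preg_match_all('/^' . $core . '$/m', $data, $m2);

// Unanchored: matches anywhere in the text.
$anywhere = preg_match_all('/' . $core . '/', $data, $m3);

echo "$wholeSubject $perLine $anywhere\n"; // prints "0 2 3"
```

If every number sits on its own line, the anchored form with `m` is the stricter choice; if numbers can appear inside running text, the unanchored form is the one to use.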


Example script, and the pattern explained:

$string = <<<EOT
123-456-7890
123-456-7890 Ext. 321
123-456-7890 Ext. 4321
(123) 456-7890
lorem ipsum
123-456-7890 (Text)
123.456.7890
foo bar baz
EOT;

//preg_match_all("/((?:\d{3}[.-]|\(\d{3}\)\s+)\d{3}[.-]\d{4})(?:\s+Ext\.\s+(\d+)|\s+\(.*?\))?/i", $string, $matches);
preg_match_all("/
               (                        # open capturing group to hold the phone number
                   (?:                  # open non-capturing group
                       \d{3}[.-]        # match 3 digits followed by a . or -
                       |                # OR, if the previous part of this group did not match
                       \(\d{3}\)\s+     # match 3 digits between parentheses followed by one or more spaces
                   )                    # close non-capturing group
                   \d{3}[.-]            # match 3 digits followed by a . or -
                   \d{4}                # match 4 digits
               )                        # close capturing group
               (?:                      # open non-capturing group
                   \s+Ext\.\s+          # one or more spaces followed by Ext. followed by one or more spaces
                   (                    # open capturing group to hold the extension number
                       \d+              # match one or more digits
                   )                    # close capturing group
                   |                    # OR, if the previous part of this non-capturing group did not match
                   \s+\(.*?\)           # one or more spaces and then anything, except newline, between (the smallest) pair of parentheses
               )?                       # close non-capturing group, the ? makes the whole non-capturing group optional
               /ix", $string, $matches);# the i flag makes the matches case insensitive and x allows for the comments and spacing

echo "<pre>\n";
print_r($matches);

output:

Array
(
    [0] => Array
        (
            [0] => 123-456-7890
            [1] => 123-456-7890 Ext. 321
            [2] => 123-456-7890 Ext. 4321
            [3] => (123) 456-7890
            [4] => 123-456-7890 (Text)
            [5] => 123.456.7890
        )

    [1] => Array
        (
            [0] => 123-456-7890
            [1] => 123-456-7890
            [2] => 123-456-7890
            [3] => (123) 456-7890
            [4] => 123-456-7890
            [5] => 123.456.7890
        )

    [2] => Array
        (
            [0] => 
            [1] => 321
            [2] => 4321
            [3] => 
            [4] => 
            [5] => 
        )

)
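The result above holds numbers and extensions in parallel arrays. If you would rather have one entry per phone number, a possible sketch is to pass `PREG_SET_ORDER` and build the records yourself. The `$contacts` array and its keys are hypothetical names chosen for this example. Note that in set order, trailing groups that did not take part in the match are left out of the match array, hence the `isset` check.

```php
<?php
// Shortened, hypothetical sample input.
$string = "123-456-7890 Ext. 321\n(123) 456-7890\n123-456-7890 (Text)";

// Same pattern as above, written on one line.
$pattern = '/((?:\d{3}[.-]|\(\d{3}\)\s+)\d{3}[.-]\d{4})(?:\s+Ext\.\s+(\d+)|\s+\(.*?\))?/i';

// PREG_SET_ORDER groups the results per match instead of per capture group.
preg_match_all($pattern, $string, $matches, PREG_SET_ORDER);

$contacts = [];
foreach ($matches as $match) {
    $contacts[] = [
        'number'    => $match[1],
        // Group 2 is absent when there is no extension.
        'extension' => (isset($match[2]) && $match[2] !== '') ? $match[2] : null,
    ];
}

print_r($contacts); // three entries; only the first has an extension (321)
```

One caveat about the pattern itself: `\s+` also matches newlines, so a bare number followed on the next line by an entry like `(123) 456-7890` would have `(123)` consumed as if it were the `(Text)` part, and the second number would be missed. If your data can look like that, replacing `\s+` with `\h+` (horizontal whitespace in PCRE) should keep each match on one line.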

As for the other line, the pattern above will also match the numbers in it, but if you do need a dedicated pattern, here it is:

$string = "123-456-7890 (Text & Special Chars.); 123-456-7890 (Text)";

preg_match_all("/(\d{3}[.-]\d{3}[.-]\d{4})\s+\(.*\);\s+(\d{3}[.-]\d{3}[.-]\d{4})\s+\(.*?\)/i", $string, $matches);

echo "<pre>\n";
print_r($matches);

output:

Array
(
    [0] => Array
        (
            [0] => 123-456-7890 (Text & Special Chars.); 123-456-7890 (Text)
        )

    [1] => Array
        (
            [0] => 123-456-7890
        )

    [2] => Array
        (
            [0] => 123-456-7890
        )

)
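For comparison, here is a small sketch of the first, general pattern run against that same line. It finds the two numbers as two separate matches rather than as one match with two groups. The variable names are only illustrative.

```php
<?php
$string = "123-456-7890 (Text & Special Chars.); 123-456-7890 (Text)";

// The general pattern from the first example, on one line.
$general = '/((?:\d{3}[.-]|\(\d{3}\)\s+)\d{3}[.-]\d{4})(?:\s+Ext\.\s+(\d+)|\s+\(.*?\))?/i';

$count = preg_match_all($general, $string, $matches);

echo $count, "\n";     // prints 2
print_r($matches[0]);  // each full match: the number plus its parenthesized text
print_r($matches[1]);  // just the two phone numbers
```

Which one to use depends on whether you need to assert that the line has exactly this two-number shape (the dedicated pattern) or only want every number on it (the general pattern).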
Robjong