
I am trying to extract US-only phone numbers from a string.

I have looked around the web/SO but have not found any suitable solution for my needs.

To be honest, I have 2.5 years of experience in web programming, but I am weak at regex.

Here is the only regex I wrote:

(\d{3}+\-\d{3}+\-\d{4}+)

It only matches 589-845-2889.

Here are the phone number formats I want to extract:

589-845-2889

(589)-845-2889

589.845.2889

589 845 2889

5898452889

(589) 845 2889

Please tell me how I can achieve this with a single regex in PHP.

EDIT:

If you know of any other format in which a user might enter a US number, please mention it and include it in the regex as well.

P.S:

I am actually trying to scrape Craigslist, and users may have posted their phone numbers in any possible format.

Umair Ayub

2 Answers

In PHP (PCRE) you can use this regex based on conditional subpatterns:

(\()?\d{3}(?(1)\))[-.\h]?\d{3}[-.\h]?\d{4}


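  • (\()? optionally matches an opening ( and captures it in group #1.
  • (?(1)\)) is a conditional subpattern that matches a closing ) only if group #1 participated in the match, i.e. an opening ( was matched to its left.
  • [-.\h]? matches an optional separator: a hyphen, a dot, or horizontal whitespace.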
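To see the pattern at work, here is a minimal sketch that runs it through preg_match_all. The sample text and the variable names ($text, $pattern, $phones) are made up for illustration and are not taken from the question's scrape; the ~ delimiter is just one convenient choice.

```php
<?php
// Sketch: extract US-style phone numbers from free text
// with the conditional-subpattern regex.
// $text is invented sample input covering the six formats from the question.
$text = "Call 589-845-2889 or (589)-845-2889, fax 589.845.2889, "
      . "cell 589 845 2889, alt 5898452889, office (589) 845 2889.";

// ~ is used as the delimiter so the pattern needs no extra escaping.
$pattern = '~(\()?\d{3}(?(1)\))[-.\h]?\d{3}[-.\h]?\d{4}~';

preg_match_all($pattern, $text, $matches);

// $matches[0] holds the full matches; $matches[1] only holds the optional "(".
$phones = $matches[0];
print_r($phones);
```

Only $matches[0] is kept here, because group #1 exists solely to drive the conditional and carries no useful data of its own.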
anubhava
  • Why does it give a weird result? http://sandbox.onlinephpfunctions.com/code/b3b6cf795b8026c85d5c9bbdc04943978e94e8d0 – Umair Ayub Jul 05 '16 at 11:29
  • I mean, it returns 2 arrays. The 1st array is fine, but what is the 2nd array all about? – Umair Ayub Jul 05 '16 at 11:31
  • You should be using `var_dump($result[0]);` for your results. [Check this demo now](http://sandbox.onlinephpfunctions.com/code/1cc85f77cb46becf28711e9162c8ea601d752375) – anubhava Jul 05 '16 at 11:32
  • The 2nd array, `$result[1]`, holds the first capture group, which is populated only when a number starts with `(` – anubhava Jul 05 '16 at 11:33
  • +1 This has comments that will actually help the user going forward. Other answers are just solutions. Teach a man to fish and all that. – webnoob Jul 05 '16 at 11:49
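The two arrays discussed in the comments can be reproduced with a short sketch. The input string below is a hypothetical example (one number without parentheses, one with), and it assumes preg_match_all is called with its default flags.

```php
<?php
// Hypothetical sample input: one number without parentheses, one with.
$input = "589-845-2889 and (589) 845 2889";
$pattern = '~(\()?\d{3}(?(1)\))[-.\h]?\d{3}[-.\h]?\d{4}~';

preg_match_all($pattern, $input, $result);

// $result[0]: every full match, which is what you want to keep.
// $result[1]: capture group #1 per match, "(" when present and "" otherwise.
var_dump($result[0]);
var_dump($result[1]);
```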

Finally, it works:

^(\((\d{3})\)|(\d{3}))[\s\-\.]?\d{3}[\s\-\.]?\d{4}

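Tested in Notepad++. Note that the leading ^ anchors the pattern to the start of a line, so drop it when extracting numbers from the middle of a larger string.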

Michał M