12

I need a well-tested regular expression (.NET style preferred), or some other simple bit of code, that will parse a US/Canadian phone number into its component parts, so that:

  • 3035551234122
  • 1-303-555-1234x122
  • (303)555-1234-122
  • 1 (303) 555 -1234-122

etc...

all parse into:

  • AreaCode: 303
  • Exchange: 555
  • Suffix: 1234
  • Extension: 122
AliciaBytes
Tristan Havelick

6 Answers

21

None of the answers given so far was robust enough for me, so I continued looking for something better, and I found it:

Google's library for dealing with phone numbers (libphonenumber)

I hope it is also useful for you.

cheffe
mmoossen
  • +1 for a good resource, -0.1 for not being .NET (which was only a preference). Sadly, we're doing integer math and it gets truncated to 0. – Mir Dec 12 '12 at 23:46
  • 2
    Can you show how this library was able to parse a phone number into its constituent parts? As the OP mentioned, into Area Code, Exchange, and Suffix? – TSmith Feb 11 '14 at 18:47
3

This is the one I use:

^(?:(?:[\+]?(?<CountryCode>[\d]{1,3}(?:[ ]+|[\-.])))?[(]?(?<AreaCode>[\d]{3})[\-/)]?(?:[ ]+)?)?(?<Number>[a-zA-Z2-9][a-zA-Z0-9 \-.]{6,})(?:(?:[ ]+|[xX]|(?i:ext[\.]?)){1,2}(?<Ext>[\d]{1,5}))?$

I got it from RegexLib I believe.
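To see how the named groups behave, here is a rough Python port of the pattern. It is only a sketch: Python spells named groups `(?P<name>...)` rather than .NET's `(?<name>...)`, the "ext" alternative is written as the scoped case-insensitive group `(?i:ext[\.]?)`, and `TEMPLATE`, `GREEDY` and `LAZY` are names made up for this illustration. It also compares the original greedy `{6,}` with the lazy `{6,}?` variant reported in a comment on this answer.

```python
import re

# Python port of the pattern above. The quantifier on the Number group is
# left as a placeholder so a greedy and a lazy variant can be compared.
TEMPLATE = (
    r"^(?:(?:[\+]?(?P<CountryCode>[\d]{1,3}(?:[ ]+|[\-.])))?"
    r"[(]?(?P<AreaCode>[\d]{3})[\-/)]?(?:[ ]+)?)?"
    r"(?P<Number>[a-zA-Z2-9][a-zA-Z0-9 \-.]%s)"
    r"(?:(?:[ ]+|[xX]|(?i:ext[\.]?)){1,2}(?P<Ext>[\d]{1,5}))?$"
)

GREEDY = re.compile(TEMPLATE % "{6,}")   # quantifier as posted in the answer
LAZY = re.compile(TEMPLATE % "{6,}?")    # lazy variant suggested in the comment

sample = "1-303-555-1234x122"

greedy_match = GREEDY.match(sample)
lazy_match = LAZY.match(sample)

# Greedy: the Number group swallows "x122", so Ext stays empty.
print(greedy_match.group("Number"), greedy_match.group("Ext"))

# Lazy: Number stops at "555-1234" and Ext picks up the extension.
print(lazy_match.group("AreaCode"), lazy_match.group("Number"), lazy_match.group("Ext"))
```

Note that this pattern returns the exchange and suffix together in `Number`, so splitting them would still need a second step.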

Philip Rieck
  • 5
    That's horrible. My eyes are bleeding. – Paul Nathan Oct 22 '08 at 21:38
  • JavaScript doesn't have named groups, and it wasn't capturing the extension until I put a ? after the {6,} range. I wound up with: `/^(?:(?:[\+]?(\d{1,3}(?:\s+|[\-\.])))?[\(]?(\d{3})[\-\/)]?(?:\s+)?)?([a-zA-Z2-9][a-zA-Z0-9 \-\.]{6,}?)(?:(?:\s+|[xX]|(?:[Ee]xt[\.]?)){1,2}(\d{1,5}))?$/` – Jeff Lowery Aug 19 '14 at 20:03
1

Strip out anything that's not a digit first. Then all your examples reduce to:

/^1?(\d{3})(\d{3})(\d{4})(\d*)$/

To support all country codes is a little more complicated, but the same general rule applies.
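As a minimal sketch of this strip-then-match approach, here it is in Python rather than .NET. The function name `parse_nanp` and the dictionary keys are assumptions chosen to mirror the question, not part of any library.

```python
import re

# Pattern from the answer: optional leading 1, then 3-3-4 digits,
# then whatever digits remain are treated as the extension.
PHONE_RE = re.compile(r"^1?(\d{3})(\d{3})(\d{4})(\d*)$")


def parse_nanp(raw):
    """Return the parts of a US/CA number as a dict, or None if it does not fit."""
    digits = re.sub(r"\D", "", raw)  # drop everything that is not a digit
    match = PHONE_RE.match(digits)
    if match is None:
        return None
    area, exchange, suffix, extension = match.groups()
    return {
        "AreaCode": area,
        "Exchange": exchange,
        "Suffix": suffix,
        "Extension": extension,  # empty string when no extension is present
    }


samples = (
    "3035551234122",
    "1-303-555-1234x122",
    "(303)555-1234-122",
    "1 (303) 555 -1234-122",
)
for sample in samples:
    print(sample, "->", parse_nanp(sample))
```

All four samples from the question come out as 303 / 555 / 1234 / 122. The trade-off is that stripping separators discards the only hint about where the extension starts, so this relies on the fixed 3-3-4 layout of US/CA numbers.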

Peter Stone
1

This regex works exactly as you want with your examples:

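using System.Text.RegularExpressions;

// Named groups capture each component; note the pattern expects a 3-digit extension.
Regex regexObj = new Regex(@"\(?(?<AreaCode>[0-9]{3})\)?[-. ]?(?<Exchange>[0-9]{3})[-. ]*?(?<Suffix>[0-9]{4})[-. x]?(?<Extension>[0-9]{3})");
Match matchResult = regexObj.Match("1 (303) 555 -1234-122");

// Now you have the results in groups
string areaCode = matchResult.Groups["AreaCode"].Value;   // "303"
string exchange = matchResult.Groups["Exchange"].Value;   // "555"
string suffix = matchResult.Groups["Suffix"].Value;       // "1234"
string extension = matchResult.Groups["Extension"].Value; // "122"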
Christian C. Salvadó
  • Adding +1 into consideration `new Regex(@"1?\(?(?<AreaCode>[0-9]{3})\)?[-. ]?(?<Exchange>[0-9]{3})[-. ]*?(?<Suffix>[0-9]{4})[-. x]?(?<Extension>[0-9]{3})");` e.g. `+13035551234122` – Jaider Feb 21 '18 at 18:38
1

Here is a well-written library used with GeoIP for instance:

http://highway.to/geoip/numberparser.inc

ridoy
Ruslan Abuzant
0

Here's a method that is easier on the eyes, provided by the Z Directory (vettrasoft.com) and geared towards American phone numbers:

string_o s2, s1 = "888/872.7676";
z_fix_phone_number (s1, s2);
cout << s2.print();      // prints "+1 (888) 872-7676"
phone_number_o pho = s2;
pho.store_save();

The last line stores the number in the database table "phone_number", with column values country_code = "1", area_code = "888", exchange = "872", etc.

gorth