
I'm working in Python 3 and trying to match US telephone numbers, including edge cases and common typos that might appear. I need to handle a variety of inputs and can't simply reject a badly formatted number as long as it has 10 digits. So far I've been writing a separate pattern for each scenario, but I was wondering whether there is a simpler or more straightforward way of doing this. I'm also not sure whether there is a good (or at least standard) way of accounting for possible white space. Here's what I have so far:

#Using regex to capture different phone number formats:
^[2-9]\d{2}-\d{3}-\d{4}$         #matches a phone number in the format ANN-NNN-NNNN, where A must be between 2 and 9 and N must be between 0 and 9.
^\([2-9]\d{2}\)-\d{3}-\d{4}$     #for (ANN)-NNN-NNNN
#Edge cases:
^\([2-9]\d{2}-\d{3}-\d{4}$       #for (ANN-NNN-NNNN 
^[2-9]\d{2}\)-\d{3}-\d{4}$       #for ANN)-NNN-NNNN
^[2-9]\d{2}-\d{3}\d{4}$          #for ANN-NNNNNNN 
^\([2-9]\d{2}\)-\d{3}\d{4}$      #for (ANN)-NNNNNNN
^[2-9]\d{2}\d{3}-\d{4}$          #for ANNNNN-NNNN
^\([2-9]\d{2}\)\d{3}-\d{4}$      #for (ANN)NNN-NNNN
^[2-9]\d{2}\d{3}\d{4}$           #for ANNNNNNNNN 
^\([2-9]\d{2}\)\d{3}\d{4}$       #for (ANN)NNNNNNN
David

1 Answer


The fix to cover all of these edge cases is simple: make each of (, ) and - optional by adding ? after it:

import re

# The ten formats from the question, with every digit set to 3
test = ['333-333-3333', '(333)-333-3333', '(333-333-3333', '333)-333-3333', '333-3333333', '(333)-3333333', '333333-3333', '(333)333-3333', '3333333333', '(333)3333333']

# Raw string so the backslashes reach the regex engine unchanged
pattern = r"^\(?[2-9]\d{2}\)?-?\d{3}-?\d{4}$"

[bool(re.match(pattern, x)) for x in test]
# [True, True, True, True, True, True, True, True, True, True]
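Since the inputs can't simply be rejected, it may also help to rewrite every accepted variant into one canonical form. Here is a minimal sketch of that idea, assuming ANN-NNN-NNNN is the form you want. The names PHONE_RE and normalize are hypothetical and not part of the question. The pattern is the same one as above, with capture groups added around the three digit blocks.

```python
import re

# Same optional-delimiter pattern, plus capture groups for the three digit blocks.
# PHONE_RE and normalize are illustrative names, not from the question.
PHONE_RE = re.compile(r"^\(?([2-9]\d{2})\)?-?(\d{3})-?(\d{4})$")

def normalize(number):
    """Return the number as ANN-NNN-NNNN, or None if it does not match."""
    m = PHONE_RE.match(number)
    return "-".join(m.groups()) if m else None

print(normalize("(333-3333333"))  # 333-333-3333
print(normalize("133-333-3333"))  # None, because the first digit must be 2-9
```

Storing the normalized value means later code only has to deal with a single format.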
Psidom
Wow, this is really good, thank you so much. The only other issue is accounting for white space, for example (333) 333 3333 or (333)-333 3333. – David Dec 15 '17 at 23:27
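One possible way to cover the white-space case is to replace each optional hyphen with an optional character class that accepts either a hyphen or a white-space character. This is only a sketch under two assumptions: at most one separator character sits between digit blocks, and leading or trailing white space should be ignored, which is why the input is stripped first. The names spaced and results are hypothetical.

```python
import re

# Each separator slot accepts one optional hyphen or white-space character.
pattern = r"^\(?[2-9]\d{2}\)?[-\s]?\d{3}[-\s]?\d{4}$"

# Illustrative inputs; the last one has two spaces in a row.
spaced = ['(333) 333 3333', '(333)-333 3333', '333 333-3333', '333  333 3333']

# strip() drops leading/trailing white space before matching
results = [bool(re.match(pattern, x.strip())) for x in spaced]
print(results)
# [True, True, True, False]
```

If runs of separators should be tolerated as well, changing [-\s]? to [-\s]* would accept them, at the cost of also matching inputs such as 333---333-3333.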