I tried to write Python code to extract all the phone numbers from a string, but it still misses some of them. Could you please help?

(+85)90 678 2842  
090 3156 374
090.315.6374
0903 164 567
0903 16 45 67
0903.16.45.67
+85 903 164 567
+85 90 316 4567 
+85(90)3 164567

My regex is as follows:

\d{2,4}[ ,.]?\d{2,4}[ ,.]?\d{2,4}

It misses some phone numbers beginning with +85 or (+85).
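To make the problem reproducible, here is a minimal sketch that runs the pattern above with `re.search` on a few of the samples. Writing the pattern as a raw string is an assumption about how it is used, and the names `original` and `samples` are only for illustration.

```python
import re

# The pattern from the question, written as a raw string (assumption).
original = re.compile(r"\d{2,4}[ ,.]?\d{2,4}[ ,.]?\d{2,4}")

# A few of the sample numbers listed above.
samples = [
    "(+85)90 678 2842",
    "+85 903 164 567",
    "+85(90)3 164567",
    "090.315.6374",
]

for s in samples:
    m = original.search(s)
    # Show what was actually captured next to the full input.
    print(f"{s!r} -> {m.group() if m else None!r}")
```

The pattern contains no `+`, `(` or `)`, so the search can only start at a run of digits: for `(+85)90 678 2842` it returns `90 678 2842`, and for `+85 903 164 567` it returns `85 903 164`, because the pattern allows only three digit groups.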

Tomerikoo
thangvc91
  • Which localities do you have to support? It's quite complicated to write a regex that supports every possible phone number format in the world, but if you specify which localities you are targeting, it could be simpler to solve. – Peterrabbit Feb 22 '22 at 09:35
  • Your regex includes neither ``+`` nor ``(`` ``)``. Why *would* you expect it to match input with these characters? Do you use ``search`` or ``match`` or something different? Is the first ``d`` actually missing the ``\``? – MisterMiyagi Feb 22 '22 at 09:35
  • Hi @MisterMiyagi, thank you, I have updated my regex. Yes, I use the re module and its search function to extract all the phone number formats from the raw file. Do you have any idea how to match the missing formats? – thangvc91 Feb 22 '22 at 09:40
  • Does this answer your question? [How to validate phone numbers using regex](https://stackoverflow.com/questions/123559/how-to-validate-phone-numbers-using-regex) – Jay Feb 22 '22 at 09:46

1 Answer

The pattern is a little hard to figure out because of the sheer number of variations. That said, the following regex, although inelegant, works on the sample numbers in the question. It could benefit from backreferencing groups (to make the expression easier to read) and other optimizations, but the best way to improve the regex would be to define, in the question, the possible patterns for a phone number together with their constraints.

Let me know if this works.

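import re
phone_numbers = [  # sample numbers from the question
    "(+85)90 678 2842", "090 3156 374", "090.315.6374",
    "0903 164 567", "0903 16 45 67", "0903.16.45.67",
    "+85 903 164 567", "+85 90 316 4567", "+85(90)3 164567",
]
# Raw string, so the regex escapes are not treated as string escapes
ptrn = re.compile(r"^(\(\+\d{2}\)\d{2}|\d{2,4}|\+\d{2,4}(\(\d{2}\))?\d?)([. ]\d{2,6})(([. ]\d{2,6})?)(([. ]\d{2,6})?)")
for phone_number in phone_numbers:
    mtch = ptrn.match(phone_number)
    if mtch:
        print(f"Matched ! - {mtch.group()} for {phone_number}")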
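Since the pattern above is anchored with `^`, it only checks one candidate string at a time. If the numbers are embedded in a longer text, a rough sketch of an unanchored variant might look like the following. The names `PHONE_RE` and `extract_phone_numbers`, the lookarounds, and the sample text are all assumptions added for illustration; they are not part of the question.

```python
import re

# Hypothetical variant of the pattern above: same alternatives, but
# unanchored and written with re.VERBOSE so each part can be commented.
PHONE_RE = re.compile(
    r"""
    (?<![\d+])                      # do not start in the middle of a number
    (?:
        \(\+\d{2}\)\d{2}            # e.g. (+85)90
      | \+\d{2,4}(?:\(\d{2}\))?\d?  # e.g. +85 or +85(90)3
      | \d{2,4}                     # e.g. 090 or 0903
    )
    (?:[. ]\d{2,6}){1,3}            # one to three groups after '.' or ' '
    (?!\d)                          # do not stop in the middle of a number
    """,
    re.VERBOSE,
)


def extract_phone_numbers(text):
    """Return every substring of `text` that PHONE_RE matches, in order."""
    return [m.group() for m in PHONE_RE.finditer(text)]


if __name__ == "__main__":
    # Made-up sentence that embeds three of the sample numbers.
    text = "Office: (+85)90 678 2842, mobile: 0903.16.45.67; fax +85(90)3 164567."
    for number in extract_phone_numbers(text):
        print(number)
```

One caveat with this sketch: two numbers separated only by a space (for example `090 3156 374 0903 164 567`) can be merged into one match, so it assumes the numbers are separated by other text or punctuation.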
Dhiwakar Ravikumar