45

If all I have is a string of 10 or more digits, how can I format this as a phone number?

Some trivial examples:

555-5555
555-555-5555
1-800-555-5555

I know those aren't the only ways to format them, and it's very likely I'll leave things out if I do it myself. Is there a Python library or a standard way of formatting phone numbers?

Joe
  • 5
    what range might they come from? Different countries have different conventions for formatting phone numbers. – Thomas K Aug 14 '11 at 16:45
  • 4
    Seconded. Please do not write code that assumes every phone number to be in US format. It is really irritating to try to work with programs like that. – snap Aug 14 '11 at 16:51
  • 1
    It seems the standard way to write phone numbers is called E.123. So a national number looks like `(800) 555 5555`, and an international number looks like `+1 800 555 5555`. But don't forget the lengths of the different groups varies by country. http://en.wikipedia.org/wiki/E.123 – Thomas K Aug 14 '11 at 17:05
  • 3
    @Thomas Right, this is exactly why I'm asking whether there is a library to do this. It's easy to make the wrong assumptions. If the format depends on the region, then perhaps that should be an argument or setting in the library. – Joe Aug 15 '11 at 00:09
  • The region can be inferred from the number *if* it includes the country code (the US country code is 1). – Keith Thompson Aug 15 '11 at 07:16
  • How considerate of you—I hope you get a product management job in Silicon Valley some day. It's amazing and endlessly frustrating how many apps and websites assume everyone understands that bloody confusing US date format that almost no other country uses. And don't even get me started about Fahrenheit and other weird units that got abandoned nearly everywhere else generations ago... – Michael Scheper Dec 22 '16 at 02:18

7 Answers

61

For a library: phonenumbers (PyPI, source)

Python version of Google's common library for parsing, formatting, storing and validating international phone numbers.

The readme is insufficient, but I found the code well documented.

kusut
  • 33
    For those looking for the quick answer, here is a sample with a US number. `pip install phonenumbers` then `import phonenumbers; phonenumbers.format_number(phonenumbers.parse("8006397663", "US"), phonenumbers.PhoneNumberFormat.NATIONAL)` – Shane Reustle Jan 09 '15 at 07:50
  • 5
    How to use custom format in this library? I need {CountryCode}-\d{3}-{remaining_numbers} – Crusader Apr 12 '17 at 12:49
  • 1
    The readme on github is great. https://github.com/daviddrysdale/python-phonenumbers – Ryan Jan 19 '18 at 14:29
  • 3
    @ShaneReustle: your comment is as good as the answer **`8^D`** – Olivier Pons Oct 01 '19 at 13:58
36

It looks like your examples are grouped in threes except for the last group of four, so you can write a simple function that uses the thousands separator and then appends the last digit:

>>> def phone_format(n):
...     return format(int(n[:-1]), ",").replace(",", "-") + n[-1]
...
>>> phone_format("5555555")
'555-5555'
>>> phone_format("5555555555")
'555-555-5555'
>>> phone_format("18005555555")
'1-800-555-5555'
utdemir
6

Here's one adapted from utdemir's solution and this solution that will work with Python 2.6, as the "," formatter is new in Python 2.7.

import re

def phone_format(phone_number):
    # Keep only the digits, then hyphenate all but the last digit in groups of three.
    clean_phone_number = re.sub(r'[^0-9]+', '', phone_number)
    formatted_phone_number = re.sub(r"(\d)(?=(\d{3})+(?!\d))", r"\1-", "%d" % int(clean_phone_number[:-1])) + clean_phone_number[-1]
    return formatted_phone_number
Jon Mabe
2

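More verbose, with one dependency (the standard-library `re` module), but it gives consistent output for most US-style inputs and was fun to write. Note that `str.removeprefix` requires Python 3.9 or later: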

import re

def format_tel(tel):
    tel = tel.removeprefix("+")
    tel = tel.removeprefix("1")     # remove leading +1 or 1
    tel = re.sub("[ ()-]", '', tel) # remove space, (), -

    assert(len(tel) == 10)
    tel = f"{tel[:3]}-{tel[3:6]}-{tel[6:]}"

    return tel

Output:

>>> format_tel("1-800-628-8737")
'800-628-8737'
>>> format_tel("800-628-8737")
'800-628-8737'
>>> format_tel("18006288737")
'800-628-8737'
>>> format_tel("1800-628-8737")
'800-628-8737'
>>> format_tel("(800) 628-8737")
'800-628-8737'
>>> format_tel("(800) 6288737")
'800-628-8737'
>>> format_tel("(800)6288737")
'800-628-8737'
>>> format_tel("8006288737")
'800-628-8737'

Without magic numbers; ...if you're not into the whole brevity thing:

def format_tel(tel):
    AREA_BOUNDARY = 3           # 800.6288737
    SUBSCRIBER_SPLIT = 6        # 800628.8737
    
    tel = tel.removeprefix("+")
    tel = tel.removeprefix("1")     # remove leading +1, or 1
    tel = re.sub("[ ()-]", '', tel) # remove space, (), -

    assert(len(tel) == 10)
    tel = (f"{tel[:AREA_BOUNDARY]}-"
           f"{tel[AREA_BOUNDARY:SUBSCRIBER_SPLIT]}-{tel[SUBSCRIBER_SPLIT:]}")

    return tel
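A hedged aside on the two versions above: `assert` statements are skipped when Python runs with `-O`, so the length check would silently disappear there. Below is a minimal sketch of the same idea with explicit validation. It assumes US/NANP numbers only, and the name `format_tel_strict` is made up for illustration; it is not part of the original answer.

```python
import re

def format_tel_strict(tel: str) -> str:
    """Format a US/NANP number as NNN-NNN-NNNN, raising ValueError on bad input."""
    # Drop everything that is not an ASCII digit: "+", spaces, (), -, dots, ...
    digits = re.sub(r"[^0-9]", "", tel)
    # An 11-digit number starting with 1 carries the NANP country code; drop it.
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        raise ValueError(f"expected 10 digits, got {len(digits)}: {tel!r}")
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"
```

Because it never calls `str.removeprefix`, this sketch also runs on Python versions older than 3.9 (any version with f-strings).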
young_souvlaki
1

You can use the function clean_phone() from the library DataPrep. Install it with pip install dataprep.

>>> import pandas as pd
>>> from dataprep.clean import clean_phone
>>> df = pd.DataFrame({'phone': ['5555555', '5555555555', '18005555555']})
>>> clean_phone(df, 'phone')
Phone Number Cleaning Report:                                                   
    3 values cleaned (100.0%)
Result contains 3 (100.0%) values in the correct format and 0 null values (0.0%)
         phone     phone_clean
0      5555555        555-5555
1   5555555555    555-555-5555
2  18005555555  1-800-555-5555
victoria55
0

A simple solution might be to start at the back, insert a hyphen after four digits, then continue in groups of three until the beginning of the string is reached. I am not aware of a built-in function for this.
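The back-to-front grouping described here could be sketched roughly as follows. The function name `group_from_back` and its parameters are hypothetical, chosen only for illustration. Since it slices the string rather than converting to `int`, leading zeros survive.

```python
def group_from_back(digits: str, last: int = 4, size: int = 3, sep: str = "-") -> str:
    """Hyphenate a digit string: a final group of `last`, then groups of `size`."""
    if not digits.isdigit():
        raise ValueError(f"expected digits only, got {digits!r}")
    # Split off the final group first, then peel groups of `size` off the end
    # of what remains until nothing is left.
    head, tail = digits[:-last], digits[-last:]
    groups = [tail]
    while head:
        groups.append(head[-size:])
        head = head[:-size]
    # Groups were collected right to left, so reverse before joining.
    return sep.join(reversed(groups))
```

With the question's examples, `group_from_back("5555555")` gives `'555-5555'` and `group_from_back("18005555555")` gives `'1-800-555-5555'`.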

You might find this helpful: http://www.diveintopython3.net/regular-expressions.html#phonenumbers

Regular expressions will be useful if you are accepting user input of phone numbers. I would not use the exact approach followed at the above link. Something simpler, like just stripping out everything but the digits, is probably easier and just as good.
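As a rough sketch of that simpler approach, user input can be reduced to its digits with a single substitution. The helper name `digits_only` is an assumption for illustration. Note that this also discards a leading `+`, so any international-prefix marker is lost.

```python
import re

def digits_only(raw: str) -> str:
    """Return `raw` with every character except the ASCII digits 0-9 removed."""
    return re.sub(r"[^0-9]", "", raw)

# Differently punctuated inputs collapse to the same digit string.
print(digits_only("(800) 555-5555"))
print(digits_only("800.555.5555"))
```

The resulting digit string can then be handed to whichever formatting function you choose.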

Also, inserting commas into numbers is an analogous problem that has been solved efficiently elsewhere and could be adapted to this problem.

ChrisP
0

In my case, I needed to get a phone pattern like "*** *** ***" by country.

So I reused the phonenumbers package in our project:

from phonenumbers import country_code_for_region, format_number, PhoneMetadata, PhoneNumberFormat, parse as parse_phone
import re

def get_country_phone_pattern(country_code: str):
    mobile_number_example = PhoneMetadata.metadata_for_region(country_code).mobile.example_number
    formatted_phone = format_number(parse_phone(mobile_number_example, country_code), PhoneNumberFormat.INTERNATIONAL)
    without_country_code = " ".join(formatted_phone.split()[1:])
    return re.sub(r"\d", "*", without_country_code)

get_country_phone_pattern("KG")  # *** *** ***
Islam Murtazaev