I want to write a function that takes a given string T, keeps only its digits, and groups them into blocks of three. However, if the last block cannot be broken down into groups of three digits, I want to split it into two blocks. For example, this is my code:

import re

def num_format(T):
    clean_number = re.sub('[^0-9]+', '', T)
    formatted_number = re.sub(r"(\d{3})(?=(\d{3})+(?!\d{3}))", r"\1-", clean_number) 
    return formatted_number


num_format("05553--70002654")

This returns '055-537-000-2654', but I want '055-537-000-26-54'. I used a regular expression, but I have no idea how to split the remaining digits at the end into two blocks.

I would really appreciate any help figuring this out. Thanks in advance.

dkdlfls26

1 Answer

You can use

def num_format(T):
    clean_number = ''.join(c for c in T if c.isdigit())
    return re.sub(r'(\d{3})(?=\d{2})|(?<=\d{2})(?=\d{2}$)', r'\1-', clean_number)

See the regex demo.

Note that you can get rid of all non-numeric characters with a plain Python generator expression; this approach is borrowed from "Removing all non-numeric characters from string in Python".

The regex matches

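  • (\d{3}) - Group 1 (\1): three digits...
  • (?=\d{2}) - ...that are followed by two digits
  • | - or
  • (?<=\d{2})(?=\d{2}$) - an empty position that is preceded by two digits and followed by two digits at the end of the string. Group 1 does not take part in this alternative, so \1 expands to an empty string (Python 3.5 and later) and only the hyphen is inserted.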
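
To see where each alternative fires, the pattern can be run through re.finditer on the cleaned digits from the question. This is only an illustrative trace; the helper name trace_matches is made up for this sketch and is not part of the solution.

```python
import re

PATTERN = re.compile(r'(\d{3})(?=\d{2})|(?<=\d{2})(?=\d{2}$)')

def trace_matches(digits):
    """Return (start, end, matched_text) for every match of PATTERN."""
    return [(m.start(), m.end(), m.group()) for m in PATTERN.finditer(digits)]

# Digits of "05553--70002654" after the non-digits are removed.
for start, end, text in trace_matches("0555370002654"):
    # A non-empty match is a three-digit block (first alternative);
    # an empty match is the position before the final two digits.
    kind = "three-digit block" if text else "empty position before the last two digits"
    print(start, end, repr(text), kind)
```

The three non-empty matches are the blocks that receive a trailing hyphen, and the single empty match at index 11 is where the hyphen between 26 and 54 goes.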

See the Python demo:

import re
 
def num_format(T):
    clean_number = ''.join(c for c in T if c.isdigit())
    return re.sub(r'(\d{3})(?=\d{2})|(?<=\d{2})(?=\d{2}$)', r'\1-', clean_number)
 
print(num_format("05553--70002654"))
# => 055-537-000-26-54
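
One caveat: the pattern above gives the requested output when the number of digits leaves a remainder of 1 after division by 3, as in the example. For other lengths it can insert an extra hyphen. Six digits such as 123456 come out as 123-4-56, because the empty-position alternative fires before the last two digits no matter how the preceding digits were grouped. If such inputs matter, one possible variant is sketched below. It is not part of the original answer, the name num_format_any_length is made up here, and it assumes the intended rule is "blocks of three, except that the last block or the last two blocks may have two digits". Instead of matching an empty position, it consumes the two-digit block itself.

```python
import re

def num_format_any_length(T):
    # Keep only the digits, as in the solution above.
    clean_number = ''.join(c for c in T if c.isdigit())
    # Append a hyphen after a three-digit block when at least two digits follow,
    # or after a two-digit block when exactly two digits remain.
    return re.sub(r'(\d{3}(?=\d{2})|\d{2}(?=\d{2}$))', r'\1-', clean_number)

print(num_format_any_length("05553--70002654"))  # 055-537-000-26-54
print(num_format_any_length("123456"))           # 123-456
print(num_format_any_length("12345"))            # 123-45
```

Because every match consumes at least two digits, no match can start in the middle of a block that has already been formatted.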
Wiktor Stribiżew