
I am creating a metric measurement converter. The user is expected to enter an expression such as 125km (a number followed by a unit abbreviation). For conversion, the numerical value must be split from the abbreviation, producing a result such as [125, 'km']. I have done this with a regular expression and re.split, but it produces an unwanted item in the resulting list:

import re
s = '125km'
print(re.split(r'(\d+)', s))  # raw string, so \d is not treated as a string escape

Output:

['', '125', 'km']

I neither need nor want the leading ''. How can I simply separate the numerical part of the string from the alphabetical part to produce a list using a regular expression?

Jacob

2 Answers


What's wrong with re.findall?

>>> import re
>>> s = '125km'
>>> re.findall(r'[A-Za-z]+|\d+', s)
['125', 'km']

[A-Za-z]+ matches one or more letters, | means "or", and \d+ matches one or more digits, so findall returns each run of letters or digits in the order it appears.
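If the input is always exactly one number followed by one unit, the same two sub-patterns can be used as capture groups to get the [125, 'km'] shape asked for in the question. This is only a sketch: the helper name split_measurement is hypothetical, and it assumes whole-number values and units made of ASCII letters.

```python
import re

def split_measurement(text):
    """Hypothetical helper: split '125km' into [125, 'km']."""
    # fullmatch requires the whole string to be digits followed by letters
    match = re.fullmatch(r'(\d+)([A-Za-z]+)', text.strip())
    if match is None:
        raise ValueError(f"expected a number followed by a unit, got {text!r}")
    number, unit = match.groups()
    return [int(number), unit]

print(split_measurement('125km'))  # [125, 'km']
```

Unlike findall, this rejects input such as 'km125' instead of returning the pieces in whatever order they occur.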

OR

Use a list comprehension to drop the empty strings that re.split returns.

>>> [i for i in re.split(r'([A-Za-z]+)', s) if i]
['125', 'km']
>>> [i for i in re.split(r'(\d+)', s) if i]
['125', 'km']
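The empty strings being filtered out come from how re.split behaves: when the pattern matches at the very start or end of the string, the piece before or after that match is an empty string, and it is still included in the result. A small illustration, using the same input as above:

```python
import re

s = '125km'

# The digits match at position 0, so the text "before" the match is ''
print(re.split(r'(\d+)', s))        # ['', '125', 'km']

# The letters match at the end, so the text "after" the match is ''
print(re.split(r'([A-Za-z]+)', s))  # ['125', 'km', '']

# Filtering on truthiness removes those empty strings in either case
parts = [part for part in re.split(r'(\d+)', s) if part]
print(parts)                        # ['125', 'km']
```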
Avinash Raj

Split a string into a list of substrings (runs of digits and runs of everything else)

Using a plain loop, without regular expressions:

s = "125km1234string"
sub = []
char = ""
num = ""
for letter in s:
    if letter.isdigit():
        if char:
            sub.append(char)
            char = ""
        num += letter
    else:
        if num:
            sub.append(num)
            num = ""
        char += letter
# Flush whichever buffer is still pending; append nothing if s was empty
if char:
    sub.append(char)
elif num:
    sub.append(num)
print(sub)

Output

['125', 'km', '1234', 'string']
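The same grouping of consecutive digit and non-digit characters can also be sketched with itertools.groupby from the standard library. This is an alternative to the loop above, not the answer's own method:

```python
from itertools import groupby

s = "125km1234string"

# groupby starts a new group each time str.isdigit flips between True and False
parts = [''.join(group) for _, group in groupby(s, key=str.isdigit)]
print(parts)  # ['125', 'km', '1234', 'string']
```

For an empty string this produces an empty list.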