Python - Splitting numbers and letters into sub-strings with regular expression

Question

I am creating a metric measurement converter. The user is expected to enter an expression such as 125km (a number followed by a unit abbreviation). For conversion, the numerical value must be split from the abbreviation, producing a result such as [125, 'km']. I have done this with a regular expression via re.split; however, it produces an unwanted item in the resulting list:

import re
s = '125km'
print(re.split(r'(\d+)', s))

Output:

['', '125', 'km']

I do not need nor want the beginning ''. How can I simply separate the numerical part of the string from the alphabetical part to produce a list using a regular expression?

@paxdiablo: Sure, but not as simple for `m/s^2` (acceleration). — nhahtdh, Feb 03 '15 at 02:57
@Jacob: Unit for energy, `J` or `kg*(m^2)/(s^2)`, or `N*m`. It is also equivalent to `W*h`, which is used to measure electrical consumption (usually `kW*h`, kilowatt hour). — nhahtdh, Feb 03 '15 at 02:58

score 12 · Accepted Answer · answered Feb 03 '15 at 02:46

What's wrong with re.findall?

>>> import re
>>> s = '125km'
>>> re.findall(r'[A-Za-z]+|\d+', s)
['125', 'km']

[A-Za-z]+ matches one or more letters, | means "or", and \d+ matches one or more digits.

Alternatively, filter the empty strings out of the re.split result with a list comprehension:

>>> [i for i in re.split(r'([A-Za-z]+)', s) if i]
['125', 'km']
>>> [i for i in re.split(r'(\d+)', s) if i]
['125', 'km']

Avinash Raj

What if the number has decimals? Is there a way to handle that case? Say, for 1.25km, how can I get ['1.25', 'km']? – Kikanye Oct 11 '21 at 16:35

`re.findall(r'[A-Za-z]+|\d+(?:\.\d+)?', s)` – Avinash Raj Oct 12 '21 at 11:54
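
A short sketch of that decimal-aware pattern in use. The name TOKEN and the sample inputs are assumptions chosen for illustration:

```python
import re

# Pattern from the comment above: a run of letters, or digits with an
# optional fractional part. The group is non-capturing, so findall
# returns whole matches rather than group contents.
TOKEN = re.compile(r'[A-Za-z]+|\d+(?:\.\d+)?')

for s in ('125km', '1.25km', '0.5m'):
    print(TOKEN.findall(s))
# ['125', 'km']
# ['1.25', 'km']
# ['0.5', 'm']
```

Note that the pattern needs at least one digit before the decimal point, so an input written as '.5km' would come back as ['5', 'km'] with the dot dropped.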

score 1 · Answer 2 · answered Sep 17 '20 at 11:21

Split a string into a list of sub-strings (runs of digits and runs of everything else) with a plain loop instead of a regular expression:

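s = "125km1234string"
sub = []   # finished sub-strings, in order
char = ""  # current run of non-digit characters
num = ""   # current run of digits
for letter in s:
    if letter.isdigit():
        # A digit ends any non-digit run in progress.
        if char:
            sub.append(char)
            char = ""
        num += letter
    else:
        # A non-digit ends any digit run in progress.
        if num:
            sub.append(num)
            num = ""
        char += letter
# Flush the final run; at most one of char/num is non-empty here,
# and nothing is appended when s is empty.
if char or num:
    sub.append(char or num)
print(sub)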

Output

['125', 'km', '1234', 'string']
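
The same digit/non-digit grouping can also be sketched with itertools.groupby from the standard library. This is a different technique from the loop in this answer, shown only for comparison:

```python
from itertools import groupby

s = "125km1234string"
# groupby yields one group per run of consecutive characters that share
# the same key; here the key is whether the character is a digit.
sub = [''.join(run) for _, run in groupby(s, key=str.isdigit)]
print(sub)  # ['125', 'km', '1234', 'string']
```

An empty input string yields an empty list here.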
