How to split a string into numbers and characters

Question

string = 'Hello, welcome to my world001'

This is the string I have been trying to split into two parts (number and text). The trailing digits need to be split off from the rest of the string. Is there a method in Python that can do this quickly?

Here's my attempt:

length_of_string = len(string) - 1
num = []
if string[-1].isdigit():
    while length_of_string > 0:
        if string[length_of_string].isdigit():
            num.insert(0, int(string[length_of_string]))
        length_of_string -= 1
    print(num)
else:
    string += '1'
    print(string)

score 6 · Accepted Answer · answered Jul 12 '21 at 11:45

A regex re.findall approach might be appropriate here. The pattern matches alternating runs of either all non-digit or all digit characters.

import re

string = 'Hello, welcome to my world001'
parts = re.findall(r'\D+|\d+', string)
print(parts)  # ['Hello, welcome to my world', '001']
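Note that this pattern returns every run in the string, so digits in the middle of the text produce parts of their own. If only the trailing number matters, one possible sketch is to match the whole string with a lazy head and a greedy digit tail. The helper name split_trailing_number is hypothetical and not part of the answer above.

```python
import re

def split_trailing_number(text):
    """Return (head, digits); digits is '' when text does not end in a digit."""
    # fullmatch forces (\d*) to reach the end of the string, and the lazy
    # (.*?) hands it as many trailing digits as possible.
    match = re.fullmatch(r'(.*?)(\d*)', text, flags=re.DOTALL)
    return match.group(1), match.group(2)

# findall reports every run, including digits in the middle:
print(re.findall(r'\D+|\d+', 'room 12, world001'))  # ['room ', '12', ', world', '001']

# the anchored version only separates the trailing digits:
print(split_trailing_number('room 12, world001'))   # ('room 12, world', '001')
print(split_trailing_number('no digits here'))      # ('no digits here', '')
```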

score 3 · Answer 2 · edited Jul 12 '21 at 14:35


You can use itertools.groupby with str.isdigit as the key function:

from itertools import groupby

parts = ["".join(g) for _, g in groupby(string, key=str.isdigit)]
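For reference, a self-contained version of the same idea with sample inputs. The wrapper name split_runs is only illustrative.

```python
from itertools import groupby

def split_runs(text):
    """Split text into alternating runs of digit and non-digit characters."""
    # groupby starts a new group whenever str.isdigit changes its answer.
    return ["".join(group) for _, group in groupby(text, key=str.isdigit)]

print(split_runs('Hello, welcome to my world001'))  # ['Hello, welcome to my world', '001']
print(split_runs('a1b22'))                          # ['a', '1', 'b', '22']
```

Like the findall approach, this returns every run, so the trailing number is the last element only when the string actually ends in a digit.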

answered Jul 12 '21 at 11:46 by user2390182 · edited Jul 12 '21 at 14:35 by Dalen

Zazaeil · Answer 3 · 2021-07-12T15:03:07.630

score 2

Not the quickest way, but still a fairly Pythonic one:

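def chars_and_nums(text):
    # Despite the name, the digits come first in the returned pair.
    if not text:
        return iter(()), iter(())  # iter() with no argument raises TypeError
    return filter(str.isdigit, text), filter(str.isalpha, text)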

More efficiently:

def chars_and_nums_efficient(text):
    if not text:
        return [], []
    digits, chars = [], []
    for c in text:
        if c.isdigit():
            digits.append(c)
        elif c.isalpha():
            chars.append(c)
    return digits, chars

answered Jul 12 '21 at 11:48 by Zazaeil · edited Jul 12 '21 at 15:03

+1 for creativity, but you have a bug. You need to use e.g. lambda x: (x.isalpha() or x.isspace()), or lambda x: (not x.isdigit()), otherwise whitespace will be removed from the result. – Dalen Jul 12 '21 at 13:19
@Dalen I am aware of that behavior. For some reason I thought only letters and digits should stay. – Zazaeil Jul 12 '21 at 13:44
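A minimal sketch of the variant suggested in the comment, where everything that is not a digit is kept, including spaces and punctuation. The function name digits_and_rest is hypothetical, and returning joined strings rather than iterators is a choice made here for readability.

```python
def digits_and_rest(text):
    """Return (digits, rest) as strings; rest keeps spaces and punctuation."""
    digits = "".join(filter(str.isdigit, text))
    rest = "".join(c for c in text if not c.isdigit())
    return digits, rest

print(digits_and_rest('Hello, welcome to my world001'))
# ('001', 'Hello, welcome to my world')
```

Note that this collects every digit in the string, not just the trailing ones, so it only answers the question when digits appear nowhere else.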

dawg · Answer 4 · 2021-07-12T12:01:25.073

You can use a regex:

import re
string = 'Hello, welcome to my world001'
m = re.search(r'^(.*?)(\d+)$', string)

>>> m.groups()
('Hello, welcome to my world', '001')

Or use a regex split (splitting on an empty match requires Python 3.7 or later):

>>> re.split(r'(?<=\D)(?=\d+$)', string)
['Hello, welcome to my world', '001']

Alternatively, you can loop over the string in pairs and break at the first digit to find the split point (this assumes the digits appear only at the end of the string):

for i, (c1, c2) in enumerate(zip(string, string[1:]), 1):
    if c2.isdigit():
        break

s1, s2 = string[:i], string[i:]

>>> (s1,s2)
('Hello, welcome to my world', '001')
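A different technique from the ones in this answer, offered only as a sketch: str.rstrip can strip the trailing digits, and the length of what remains gives the split point. The helper name split_trailing_digits is hypothetical. It handles only ASCII digits, whereas str.isdigit also accepts some other Unicode digit characters.

```python
def split_trailing_digits(text):
    """Return (head, digits) where digits is the run of 0-9 at the end of text."""
    head = text.rstrip('0123456789')  # drop ASCII digits from the right end only
    return head, text[len(head):]     # whatever was stripped is the number part

print(split_trailing_digits('Hello, welcome to my world001'))
# ('Hello, welcome to my world', '001')
print(split_trailing_digits('a1b22'))  # ('a1b', '22')
```

Unlike the pairwise loop, this leaves digits in the middle of the string alone and returns an empty second element when the string does not end in a digit.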

Dalen · Answer 5 · 2021-07-12T13:29:17.203


string = "Hello world001"
end = []
last_change = string[-1].isdigit()
temp = []
for x in range(len(string)-1, -1, -1):
    char = string[x]
    changed = char.isdigit()
    if changed==last_change:
        temp.append(char)
        continue
    temp.reverse()
    end.append("".join(temp))
    temp = [char]
    last_change = changed

if temp:
    temp.reverse()
    end.append("".join(temp))

end.reverse()
print(end)

This code will split your string into chunks of consecutive digits and consecutive non-digits. I made it walk backwards through the string so that, if you only need to separate the trailing digits from the rest, you can break out of the loop after the first detected change. Then your string part will be:

string_part = string[:x + 1]  # x is the index of the last non-digit character

and the number:

number = end[-1]
# or "".join(temp) if you do not overwrite it

You can also make it walk forwards through the string: remove all the .reverse() calls, drop range(), and use:

for char in string: ...
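One way the forward-walking variant could look, written as a function. This is a sketch of the idea rather than the answerer's own code, and the names split_forward, chunks and current are made up for illustration.

```python
def split_forward(text):
    """Group consecutive digit / non-digit characters, walking left to right."""
    chunks = []
    current = []
    current_is_digit = None  # unknown until the first character is seen
    for char in text:
        is_digit = char.isdigit()
        # A change of kind closes the chunk collected so far.
        if current and is_digit != current_is_digit:
            chunks.append("".join(current))
            current = []
        current.append(char)
        current_is_digit = is_digit
    if current:
        chunks.append("".join(current))
    return chunks

print(split_forward("Hello world001"))  # ['Hello world', '001']
```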

This code is fast, so no worries there, but if you want even less code, you can use regular expressions. The re module has a re.split() function that works like str.split() but splits on a regex pattern instead of a fixed substring. However, it is quite possible that my code will be faster for your purposes, since re.split() has to do much more checking along the way. Regexes are a terrific tool if you know how to use them, but they aren't always the fastest solution.

I very much like @schwobaseggl's answer. I would use the itertools solution if I were you. The groupby() function works essentially the same way as the code I posted.
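Which approach is actually faster depends on the input and the Python version, so it is worth measuring rather than guessing. A rough sketch with timeit follows. The function names and the repetition count are arbitrary choices, and no particular timings are claimed.

```python
import re
import timeit
from itertools import groupby

SAMPLE = 'Hello, welcome to my world001'
PATTERN = re.compile(r'\D+|\d+')  # compiled once so the timing excludes compilation

def with_regex(text):
    return PATTERN.findall(text)

def with_groupby(text):
    return ["".join(g) for _, g in groupby(text, key=str.isdigit)]

# Both must agree before their timings are worth comparing.
assert with_regex(SAMPLE) == with_groupby(SAMPLE)

for func in (with_regex, with_groupby):
    seconds = timeit.timeit(lambda: func(SAMPLE), number=10_000)
    print(f"{func.__name__}: {seconds:.4f} s for 10,000 calls")
```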
