Strip out numbers from a string

Question

We have a bunch of strings, for example: c1309, IF1306, v1309, p1209, a1309, mo1309.
In Python, what is the best way to strip out the numbers? All I need is c, IF, v, p, a, mo from the example above.

Why is this simple question upvoted so much o_O ? Also one could just search and use the "reverse" solution of this [one](http://stackoverflow.com/q/3062742). — HamZa, May 31 '13 at 06:03
@HamZa Simple questions are more likely to be upvoted because they can be easily and quickly observed by all users, including those not even familiar with the language. — jamylak, May 31 '13 at 06:56
@HamZa this is nothing... http://stackoverflow.com/questions/931092/reverse-a-string-in-python/931095#931095 — jamylak, May 31 '13 at 07:51

Ashwini Chaudhary · Answer 1 · 2013-05-31T09:25:24.190

27

You can use regex:

>>> import re
>>> strs = "c1309, IF1306, v1309, p1209, a1309, mo1309"
>>> re.sub(r'\d','',strs)
'c, IF, v, p, a, mo'

or a faster version:

>>> re.sub(r'\d+','',strs)
'c, IF, v, p, a, mo'
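
One plausible reason the `\d+` pattern is faster is that it removes each run of digits in a single substitution instead of one digit at a time. A minimal sketch using `re.subn`, which also reports how many substitutions were made (the names `per_digit` and `per_run` are only illustrative):

```python
import re

strs = "c1309, IF1306, v1309, p1209, a1309, mo1309"

# re.subn returns a tuple: (new_string, number_of_substitutions)
per_digit = re.subn(r'\d', '', strs)   # one substitution per digit
per_run = re.subn(r'\d+', '', strs)    # one substitution per run of digits

print(per_digit)  # ('c, IF, v, p, a, mo', 24)
print(per_run)    # ('c, IF, v, p, a, mo', 6)
```

Both patterns produce the same string; only the number of replacements differs.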

timeit comparisons:

>>> strs = "c1309, IF1306, v1309, p1209, a1309, mo1309"*10**5

>>> %timeit re.sub(r'\d','',strs)
1 loops, best of 3: 1.23 s per loop

>>> %timeit re.sub(r'\d+','',strs)
1 loops, best of 3: 480 ms per loop

>>> %timeit ''.join([c for c in strs if not c.isdigit()])
1 loops, best of 3: 1.07 s per loop

# winner (Python 2 str.translate signature; Python 3 needs str.maketrans)
>>> %timeit from string import digits;strs.translate(None, digits)
10 loops, best of 3: 20.4 ms per loop
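
The timings above come from an IPython session on Python 2. A rough sketch of how the same comparison could be repeated on Python 3 with the standard `timeit` module follows; the `candidates` dictionary, the shorter test string and the run count are assumptions made to keep the example quick, and absolute numbers will vary by machine and Python version.

```python
import re
import timeit
from string import digits

strs = "c1309, IF1306, v1309, p1209, a1309, mo1309" * 1000
# Python 3 replacement for the Python 2 call translate(None, digits)
table = str.maketrans('', '', digits)

# Hypothetical labels mapped to the four approaches being compared
candidates = {
    r"re.sub \d": lambda: re.sub(r'\d', '', strs),
    r"re.sub \d+": lambda: re.sub(r'\d+', '', strs),
    "join + isdigit": lambda: ''.join([c for c in strs if not c.isdigit()]),
    "str.translate": lambda: strs.translate(table),
}

# Run each approach once and keep its output for comparison
results = {name: fn() for name, fn in candidates.items()}

# Time each approach over 10 calls
for name, fn in candidates.items():
    seconds = timeit.timeit(fn, number=10)
    print(f"{name:15s} {seconds:.4f} s for 10 runs")
```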

edited May 31 '13 at 09:25

answered May 31 '13 at 03:02


Better use `re.sub(r'\d+','',strs)`, though, for increased efficiency. – Tim Pietzcker May 31 '13 at 09:09
@TimPietzcker thanks, never knew about that. – Ashwini Chaudhary May 31 '13 at 09:26
@TimPietzcker If numbers are only decimal then does `re.sub(r'[0-9]+','',strs)` improve speed **??** – Grijesh Chauhan May 31 '13 at 12:36
@GrijeshChauhan: Probably not or at least not significantly, unless you compile the regex using `re.UNICODE`. – Tim Pietzcker May 31 '13 at 12:40

score 22 · Answer 2 · edited May 23 '17 at 11:51

>>> text = 'mo1309'
>>> ''.join([c for c in text if not c.isdigit()])
'mo'

This is faster than the regex approach for a short string:

python -m timeit -s "import re; text = 'mo1309'" "re.sub(r'\d','',text)"
100000 loops, best of 3: 3.99 usec per loop
python -m timeit -s "import re; text = 'mo1309'" "''.join([c for c in text if not c.isdigit()])"
1000000 loops, best of 3: 1.42 usec per loop
python -m timeit -s "from string import digits; text = 'mo1309'" "text.translate(None, digits)"
1000000 loops, best of 3: 0.42 usec per loop

but str.translate, as suggested by @DavidSousa:

from string import digits
text.translate(None, digits)

is the fastest of the three in the timings above (Python 2 syntax; Python 3 needs a table built with str.maketrans).

itertools also supplies a little-known function called ifilterfalse (renamed filterfalse in Python 3):

>>> from itertools import ifilterfalse
>>> ''.join(ifilterfalse(str.isdigit, text))
'mo'
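
For readers on Python 3, where `ifilterfalse` is no longer available under that name, the same idea might look like this (a small sketch; the variable `stripped` is illustrative):

```python
from itertools import filterfalse

text = 'mo1309'
# Keep only the characters for which str.isdigit returns False
stripped = ''.join(filterfalse(str.isdigit, text))
print(stripped)  # mo
```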


answered May 31 '13 at 03:04

jamylak

Is `join` with a list comprehension faster than `join` with a generator expression? – Blender May 31 '13 at 03:09
For large strings they are almost equivalent. – Ashwini Chaudhary May 31 '13 at 03:09
@Blender http://stackoverflow.com/a/9061024/846892 – Ashwini Chaudhary May 31 '13 at 03:12

David Sousa · Answer 3 · 2013-06-01T14:19:59.270

13

I think the string method translate is more elegant than joining lists etc.

from string import digits # digits = '0123456789'
list1 = ['c1309', 'IF1306', 'v1309', 'p1209', 'a1309', 'mo1309']
list2 = [ i.translate(None, digits) for i in list1 ]
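
The snippet above relies on the Python 2 signature of `translate`. A sketch of the same list transformation on Python 3, assuming the table is built once with `str.maketrans` (the name `remove_digits` is hypothetical):

```python
from string import digits  # digits == '0123456789'

# Build the deletion table once and reuse it for every string
remove_digits = str.maketrans('', '', digits)

list1 = ['c1309', 'IF1306', 'v1309', 'p1209', 'a1309', 'mo1309']
list2 = [s.translate(remove_digits) for s in list1]
print(list2)  # ['c', 'IF', 'v', 'p', 'a', 'mo']
```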

edited Jun 01 '13 at 14:19

answered May 31 '13 at 03:08


`from string import digits` was better (not sure why you changed it). This is the fastest way and may be arguably more elegant in Python 2, but in Python 3 it looks like: `text.translate(str.maketrans('', '', digits))` – jamylak May 31 '13 at 03:17
+1 but you could use list comprehension to make it even more elegant. – Jan Wrobel May 31 '13 at 20:27
@jamylak I changed it just to look more clear. – David Sousa Jun 01 '13 at 14:19

wim · Answer 4 · 2013-05-31T03:32:50.447

I think this is the simplest, and will probably be the fastest too.

>>> import string
>>> s = 'c1309, IF1306, v1309, p1209, a1309, mo1309'
>>> s.translate(None, string.digits)
'c, IF, v, p, a, mo'

Note: the interface of str.translate changed in Python 3 to take a single mapping argument, so here is the Python 3 version:

s.translate({ord(n): None for n in string.digits})

Or a more explicit alternative:

m = str.maketrans('', '', string.digits)
s.translate(m)
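
As a quick check that the two Python 3 forms are interchangeable, the sketch below compares the hand-written mapping with the table from `str.maketrans` (the names `by_dict` and `by_maketrans` are illustrative):

```python
import string

s = 'c1309, IF1306, v1309, p1209, a1309, mo1309'

# Mapping from code point to None means "delete this character"
by_dict = {ord(n): None for n in string.digits}
by_maketrans = str.maketrans('', '', string.digits)

same_table = by_dict == by_maketrans
same_output = s.translate(by_dict) == s.translate(by_maketrans)
print(same_table, same_output)  # True True
```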

score 1 · Answer 5 · answered May 31 '13 at 03:10

strings = ['c1309', 'IF1306', 'v1309', 'p1209', 'a1309', 'mo1309']
stripped = [''.join(c for c in s if not c.isdigit()) for s in strings]
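
One detail that may matter with non-ASCII input: on Python 3, `str.isdigit` is also true for some characters outside `0123456789`, such as superscript digits, while `string.digits` covers only the ASCII ones. A small sketch of the difference, with hypothetical helper names and an invented sample string:

```python
from string import digits

def strip_isdigit(s):
    """Drop every character for which str.isdigit() is True."""
    return ''.join(c for c in s if not c.isdigit())

def strip_ascii_digits(s):
    """Drop only the ASCII digits 0-9."""
    return s.translate(str.maketrans('', '', digits))

sample = 'x\u00b2y1309'  # 'x²y1309', contains SUPERSCRIPT TWO
print(strip_isdigit(sample))       # xy
print(strip_ascii_digits(sample))  # x²y
```

For the ASCII strings in the question, both helpers give the same result.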

fahrbach


score 1 · Answer 6 · answered May 31 '13 at 10:37

If all the strings you are dealing with end with a number you can, literally, strip the number:

>>> strings = ['c1309', 'IF1306', 'v1309', 'p1209', 'a1309', 'mo1309']
>>> [s.strip("0123456789") for s in strings]
['c', 'IF', 'v', 'p', 'a', 'mo']

If you want to remove digits only at the end of the string, use rstrip instead; strip removes them from both ends. If digits may appear in the middle of the string, this method won't remove them.
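
A short sketch of how `strip` and `rstrip` differ on a few made-up inputs (the sample strings `2pac1309` and `a1b2` are not from the question):

```python
from string import digits

both_ends = '2pac1309'.strip(digits)    # digits removed from both ends
right_only = '2pac1309'.rstrip(digits)  # digits removed from the end only
inner_kept = 'a1b2'.strip(digits)       # the inner digit is not touched

print(both_ends)   # pac
print(right_only)  # 2pac
print(inner_kept)  # a1b
```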

Bakuriu


+1. This is probably all that the OP needs. You could also replace `0123456789` in your solution with `string.digits` – iruvar May 31 '13 at 17:54

score 0 · Answer 7 · answered May 31 '13 at 03:37

Use slice notation if the number has a fixed length and sits at the end of the string.

NUM_LEN = 4  # number of trailing digits to drop
stringsWithDigit = ["ab1234", "cde1234", "fgh5678"]
for i in stringsWithDigit:
    print(i[:-NUM_LEN])

For anything else, use a regex that collects the runs of non-digit characters:

import re
c = re.compile("[^0-9]+")  # one or more non-digit characters
print(c.findall("".join(stringsWithDigit)))

score 0 · Answer 8 · answered May 31 '13 at 04:56

You can try this regex:

^[a-zA-Z]+

It matches only the consecutive letters at the start of the string and ignores everything after them.

No replacement is required.
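
The answer gives only the pattern. One way it might be applied in Python is sketched below; the helper `leading_letters` and the choice to return an empty string when nothing matches are assumptions, not part of the original answer.

```python
import re

# Pattern from the answer: one or more ASCII letters at the start
LEADING_LETTERS = re.compile(r'^[a-zA-Z]+')

def leading_letters(s):
    """Return the leading run of letters, or '' if the string does not start with one."""
    m = LEADING_LETTERS.match(s)
    return m.group() if m else ''

strings = ['c1309', 'IF1306', 'v1309', 'p1209', 'a1309', 'mo1309']
prefixes = [leading_letters(s) for s in strings]
print(prefixes)  # ['c', 'IF', 'v', 'p', 'a', 'mo']
```

Unlike the substitution-based answers, this keeps only the prefix, so letters that follow the digits would be lost.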

NeverHopeless

