We have a bunch of strings, for example: c1309, IF1306, v1309, p1209, a1309, mo1309.
In Python, what is the best way to strip out the numbers? All I need is: c, IF, v, p, a, mo from the above example.

- Why is this simple question upvoted so much o_O ? Also one could just search and use the "reverse" solution of this [one](http://stackoverflow.com/q/3062742). – HamZa May 31 '13 at 06:03
- @HamZa Simple questions are more likely to be upvoted because they can be easily and quickly observed by all users, including those not even familiar with the language. – jamylak May 31 '13 at 06:56
- @jamylak sad enough, a little bit jealous to be honest ... – HamZa May 31 '13 at 07:49
- @HamZa this is nothing... http://stackoverflow.com/questions/931092/reverse-a-string-in-python/931095#931095 – jamylak May 31 '13 at 07:51
- @jamylak hahahaha I shall learn python ! – HamZa May 31 '13 at 07:52
- @HamZa it's the bikeshed problem. – Mike G May 31 '13 at 14:25
8 Answers
You can use regex:
>>> import re
>>> strs = "c1309, IF1306, v1309, p1209, a1309, mo1309"
>>> re.sub(r'\d','',strs)
'c, IF, v, p, a, mo'
or a faster version:
>>> re.sub(r'\d+','',strs)
'c, IF, v, p, a, mo'
timeit comparisons:
>>> strs = "c1309, IF1306, v1309, p1209, a1309, mo1309"*10**5
>>> %timeit re.sub(r'\d','',strs)
1 loops, best of 3: 1.23 s per loop
>>> %timeit re.sub(r'\d+','',strs)
1 loops, best of 3: 480 ms per loop
>>> %timeit ''.join([c for c in strs if not c.isdigit()])
1 loops, best of 3: 1.07 s per loop
#winner
>>> %timeit from string import digits;strs.translate(None, digits)
10 loops, best of 3: 20.4 ms per loop
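The `strs.translate(None, digits)` call above is the Python 2 form. Below is a rough Python 3 sketch of the same comparison, assuming a deletion table built with `str.maketrans('', '', digits)`. The `candidates` dict, the input size, and the repeat count are illustrative choices, not part of the original answer, and timings will vary by machine.

```python
import re
import timeit
from string import digits

# Smaller input than the original benchmark so the sketch runs quickly.
strs = "c1309, IF1306, v1309, p1209, a1309, mo1309" * 1000

# Python 3: translate() takes a mapping; maketrans builds one that deletes digits.
table = str.maketrans('', '', digits)

# Hypothetical names for the four approaches compared above.
candidates = {
    "sub_single_digit": lambda: re.sub(r'\d', '', strs),
    "sub_digit_run": lambda: re.sub(r'\d+', '', strs),
    "join_isdigit": lambda: ''.join([c for c in strs if not c.isdigit()]),
    "translate": lambda: strs.translate(table),
}

# Check that every approach produces the same result before timing it.
results = {name: fn() for name, fn in candidates.items()}

for name, fn in candidates.items():
    seconds = timeit.timeit(fn, number=10)
    print(name, seconds)
```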

- Better use `re.sub(r'\d+','',strs)`, though, for increased efficiency. – Tim Pietzcker May 31 '13 at 09:09
- @TimPietzcker If numbers are only decimal then does `re.sub(r'[0-9]+','',strs)` improve speed? – Grijesh Chauhan May 31 '13 at 12:36
- @GrijeshChauhan: Probably not or at least not significantly, unless you compile the regex using `re.UNICODE`. – Tim Pietzcker May 31 '13 at 12:40
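The difference between `\d` and `[0-9]` discussed in these comments is about matching as well as speed. A small sketch, assuming Python 3, where `\d` on a `str` pattern also matches non-ASCII decimal digits unless `re.ASCII` is given; the sample string is made up for illustration.

```python
import re

# 'a1b' + ARABIC-INDIC DIGIT THREE + 'c' (illustrative input)
mixed = 'a1b\u0663c'

# \d is Unicode-aware for str patterns in Python 3, so both digits go.
unicode_aware = re.sub(r'\d+', '', mixed)

# [0-9] only covers the ASCII digits, so the Arabic-Indic digit stays.
ascii_class = re.sub(r'[0-9]+', '', mixed)

# re.ASCII restricts \d to [0-9] as well.
ascii_flag = re.sub(r'\d+', '', mixed, flags=re.ASCII)

print(unicode_aware, ascii_class, ascii_flag)
```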
>>> text = 'mo1309'
>>> ''.join([c for c in text if not c.isdigit()])
'mo'
For a short string like this one, it is faster than regex:
python -m timeit -s "import re; text = 'mo1309'" "re.sub(r'\d','',text)"
100000 loops, best of 3: 3.99 usec per loop
python -m timeit -s "import re; text = 'mo1309'" "''.join([c for c in text if not c.isdigit()])"
1000000 loops, best of 3: 1.42 usec per loop
python -m timeit -s "from string import digits; text = 'mo1309'" "text.translate(None, digits)"
1000000 loops, best of 3: 0.42 usec per loop
but str.translate, as suggested by @DavidSousa:
from string import digits
text.translate(None, digits)
was the fastest of the three at stripping characters in the timings above.
Also, itertools supplies a little-known function called ifilterfalse (renamed filterfalse in Python 3):
>>> from itertools import ifilterfalse
>>> ''.join(ifilterfalse(str.isdigit, text))
'mo'
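`ifilterfalse` exists only in Python 2. A minimal sketch of the same idea for Python 3, where the function is named `itertools.filterfalse`; the variable name `letters` is just for illustration.

```python
from itertools import filterfalse  # Python 3 name for ifilterfalse

text = 'mo1309'

# Keep every character for which str.isdigit returns False.
letters = ''.join(filterfalse(str.isdigit, text))
print(letters)  # mo
```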
- Is `join` with a list comprehension faster than `join` with a generator expression? – Blender May 31 '13 at 03:09
I think the string method translate is more elegant than joining lists etc.
from string import digits  # digits = '0123456789'
list1 = ['c1309', 'IF1306', 'v1309', 'p1209', 'a1309', 'mo1309']
list2 = [i.translate(None, digits) for i in list1]

- `from string import digits` was better (not sure why you changed it). This is the fastest way and may be arguably more elegant in Python 2, but in Python 3 it looks like: `text.translate(str.maketrans('', '', digits))` – jamylak May 31 '13 at 03:17
- +1 but you could use list comprehension to make it even more elegant. – Jan Wrobel May 31 '13 at 20:27
I think this is the simplest, and will probably be the fastest too.
>>> import string
>>> s = 'c1309, IF1306, v1309, p1209, a1309, mo1309'
>>> s.translate(None, string.digits)
'c, IF, v, p, a, mo'
Note: the interface of str.translate was changed to use a mapping in Python 3, so here is the Python 3 version:
s.translate({ord(n): None for n in string.digits})
Or a more explicit alternative:
m = str.maketrans('', '', string.digits)
s.translate(m)

strings = ['c1309', 'IF1306', 'v1309', 'p1209', 'a1309', 'mo1309']
stripped = [''.join(c for c in s if not c.isdigit()) for s in strings]

If all the strings you are dealing with end with a number you can, literally, strip the number:
>>> strings = ['c1309', 'IF1306', 'v1309', 'p1209', 'a1309', 'mo1309']
>>> [s.strip("0123456789") for s in strings]
['c', 'IF', 'v', 'p', 'a', 'mo']
Note that strip removes digits from both ends of the string. If you want to remove the digits only at the end of the string, use rstrip. If the digits may appear inside the string, this method won't work at all.
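To make the difference concrete, here is a small sketch comparing `strip` and `rstrip` with `string.digits`. The `samples` list is invented for illustration and includes a string with an interior digit to show the limitation mentioned above.

```python
from string import digits  # '0123456789'

# Illustrative inputs: trailing digits, digits at both ends, interior digit.
samples = ['c1309', '12ab34', 'a1b2']

# strip() removes digits from both ends of each string.
both_ends = [s.strip(digits) for s in samples]

# rstrip() removes digits only from the right-hand end.
right_only = [s.rstrip(digits) for s in samples]

print(both_ends)   # ['c', 'ab', 'a1b']
print(right_only)  # ['c', '12ab', 'a1b']
```

The interior `1` in `'a1b2'` survives in both cases, which is why this approach only fits strings whose digits sit at the ends.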

- +1. This is probably all that the OP needs. You could also replace `0123456789` in your solution with `string.digits` – iruvar May 31 '13 at 17:54
Use slice notation if the number's length is fixed and it is not in the middle of the string:
NUM_LEN = 4
stringsWithDigit = ["ab1234", "cde1234", "fgh5678"]
for i in stringsWithDigit:
    print(i[:-NUM_LEN])  # drop the last NUM_LEN characters
For anything else:
import re
c = re.compile("[^0-9]+")  # matches runs of non-digit characters
print(c.findall("".join(stringsWithDigit)))

You can try this regex:
^[a-zA-Z]+
It matches only the consecutive letters at the start of the string and ignores everything after them, so no replacement is required.
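One possible way to apply this pattern in Python, sketched with `re.match`. The names `leading_letters` and `prefixes` are illustrative, and falling back to an empty string when nothing matches is an assumption rather than part of the answer.

```python
import re

# Compile once; the pattern matches a run of ASCII letters at the start.
leading_letters = re.compile(r'^[a-zA-Z]+')

strings = ['c1309', 'IF1306', 'v1309', 'p1209', 'a1309', 'mo1309']

prefixes = []
for s in strings:
    m = leading_letters.match(s)
    # Assumption: use an empty string when a value does not start with a letter.
    prefixes.append(m.group(0) if m else '')

print(prefixes)  # ['c', 'IF', 'v', 'p', 'a', 'mo']
```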
