16

We have a bunch of strings, for example: c1309, IF1306, v1309, p1209, a1309, mo1309.
In Python, what is the best way to strip out the numbers? From the example above, all I need is: c, IF, v, p, a, mo.

Ashwini Chaudhary
Can Lu
  • 14
    Why is this simple question upvoted so much o_O ? Also one could just search and use the "reverse" solution of this [one](http://stackoverflow.com/q/3062742). – HamZa May 31 '13 at 06:03
  • 4
    @HamZa Simple questions are more likely to be upvoted because they can be easily and quickly observed by all users, including those not even familiar with the language. – jamylak May 31 '13 at 06:56
  • @jamylak sad enough, a little bit jealous to be honest ... – HamZa May 31 '13 at 07:49
  • 5
    @HamZa this is nothing... http://stackoverflow.com/questions/931092/reverse-a-string-in-python/931095#931095 – jamylak May 31 '13 at 07:51
  • @jamylak hahahaha I shall learn python ! – HamZa May 31 '13 at 07:52
  • 2
    @HamZa it's the bikeshed problem. – Mike G May 31 '13 at 14:25

8 Answers

27

You can use regex:

>>> import re
>>> strs = "c1309, IF1306, v1309, p1209, a1309, mo1309"
>>> re.sub(r'\d','',strs)
'c, IF, v, p, a, mo'

or a faster version:

>>> re.sub(r'\d+','',strs)
'c, IF, v, p, a, mo'

timeit comparisons:

>>> strs = "c1309, IF1306, v1309, p1209, a1309, mo1309"*10**5

>>> %timeit re.sub(r'\d','',strs)
1 loops, best of 3: 1.23 s per loop

>>> %timeit re.sub(r'\d+','',strs)
1 loops, best of 3: 480 ms per loop

>>> %timeit ''.join([c for c in strs if not c.isdigit()])
1 loops, best of 3: 1.07 s per loop

# winner (Python 2 syntax; in Python 3 use strs.translate(str.maketrans('', '', digits)))
>>> %timeit from string import digits;strs.translate(None, digits)
10 loops, best of 3: 20.4 ms per loop
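
The `translate(None, digits)` call above only works in Python 2. As a rough sketch, assuming Python 3, the four approaches can be written as below and checked to agree on the sample input. The helper names are made up for illustration, and no timings are claimed here.

```python
import re
from string import digits

strs = "c1309, IF1306, v1309, p1209, a1309, mo1309"

# Python 3: build the deletion table once with str.maketrans.
REMOVE_DIGITS = str.maketrans('', '', digits)

def by_regex_single(s):
    # Substitute one digit at a time.
    return re.sub(r'\d', '', s)

def by_regex_run(s):
    # Substitute whole runs of digits per match.
    return re.sub(r'\d+', '', s)

def by_join(s):
    # Keep every character that is not a digit.
    return ''.join([c for c in s if not c.isdigit()])

def by_translate(s):
    # Delete the ASCII digits 0-9 via the translation table.
    return s.translate(REMOVE_DIGITS)

approaches = (by_regex_single, by_regex_run, by_join, by_translate)
results = {f.__name__: f(strs) for f in approaches}
print(results)
```

For ASCII input like this, all four return 'c, IF, v, p, a, mo'. Note that in Python 3 `\d` and `str.isdigit` also match non-ASCII digits, while the table built from `string.digits` only removes 0-9.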
Ashwini Chaudhary
22
>>> text = 'mo1309'
>>> ''.join([c for c in text if not c.isdigit()])
'mo'

For a short string, this is faster than the regex `\d` substitution:

python -m timeit -s "import re; text = 'mo1309'" "re.sub(r'\d','',text)"
100000 loops, best of 3: 3.99 usec per loop
python -m timeit -s "import re; text = 'mo1309'" "''.join([c for c in text if not c.isdigit()])"
1000000 loops, best of 3: 1.42 usec per loop
python -m timeit -s "from string import digits; text = 'mo1309'" "text.translate(None, digits)"
1000000 loops, best of 3: 0.42 usec per loop

but str.translate as suggested by @DavidSousa:

from string import digits
text.translate(None, digits)

is the fastest of the three in the timings above. This two-argument form of translate is Python 2 only.

itertools also provides a little-known function called ifilterfalse (renamed filterfalse in Python 3):

>>> from itertools import ifilterfalse
>>> ''.join(ifilterfalse(str.isdigit, text))
'mo'
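
A minimal sketch of the same idea, assuming Python 3, where the function is named `filterfalse`:

```python
from itertools import filterfalse  # Python 3 name for ifilterfalse

text = 'mo1309'
# filterfalse yields the characters for which str.isdigit returns False.
letters = ''.join(filterfalse(str.isdigit, text))
print(letters)  # mo
```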
Community
jamylak
13

I think the string method translate is more elegant than joining lists etc.

from string import digits # digits = '0123456789'
list1 = ['c1309', 'IF1306', 'v1309', 'p1209', 'a1309', 'mo1309']
list2 = [i.translate(None, digits) for i in list1]  # Python 2 only
David Sousa
  • 2
    `from string import digits` was better (not sure why you changed it). This is the fastest way and may be arguably more elegant in Python 2, but in Python 3 it looks like: `text.translate(str.maketrans('', '', digits))` – jamylak May 31 '13 at 03:17
  • 1
    +1 but you could use list comprehension to make it even more elegant. – Jan Wrobel May 31 '13 at 20:27
  • @jamylak I changed it just to look more clear. – David Sousa Jun 01 '13 at 14:19
3

I think this is the simplest, and will probably be the fastest too.

>>> import string
>>> s = 'c1309, IF1306, v1309, p1209, a1309, mo1309'
>>> s.translate(None, string.digits)
'c, IF, v, p, a, mo'

Note: the interface of str.translate changed in Python 3 to take a single mapping argument, so here is the Python 3 version:

s.translate({ord(n): None for n in string.digits})

Or a more explicit alternative:

m = str.maketrans('', '', string.digits)
s.translate(m)
wim
1
strings = ['c1309', 'IF1306', 'v1309', 'p1209', 'a1309', 'mo1309']
stripped = [''.join(c for c in s if not c.isdigit()) for s in strings]
fahrbach
1

If all the strings you are dealing with end with a number you can, literally, strip the number:

>>> strings = ['c1309', 'IF1306', 'v1309', 'p1209', 'a1309', 'mo1309']
>>> [s.strip("0123456789") for s in strings]
['c', 'IF', 'v', 'p', 'a', 'mo']

Note that strip removes digits from both ends of the string. If you want to remove the digits only at the end, use rstrip. If digits may appear inside the string, neither method will remove them.
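
A short sketch of how `strip` and `rstrip` behave here. The inputs other than 'c1309' are made up for illustration:

```python
from string import digits

trailing = 'c1309'.strip(digits)       # 'c'
both_ends = '2013c1309'.strip(digits)  # 'c': leading digits are removed too
end_only = '2013c1309'.rstrip(digits)  # '2013c': only trailing digits go
inside = 'a1b2'.strip(digits)          # 'a1b': the inner 1 is left alone
print(trailing, both_ends, end_only, inside)
```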

Bakuriu
  • +1. This is probably all that the OP needs. You could also replace `0123456789` in your solution with `string.digits` – iruvar May 31 '13 at 17:54
0

Use slice notation if the number has a fixed length and is not in the middle of the string (the example below assumes it is at the end).

NUM_LEN = 4
stringsWithDigit = ["ab1234", "cde1234", "fgh5678"]
for i in stringsWithDigit:
    print(i[:-NUM_LEN])  # drop the last NUM_LEN characters

For anything else, use a regex:

import re
c = re.compile("[^0-9]+")  # one or more non-digit characters
print(c.findall("".join(stringsWithDigit)))
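
Joining all the strings first relies on the digits to keep neighbouring strings apart. As a hedged alternative, here is a sketch that applies the same pattern to each string on its own; the extra 'xyz' entry and the variable names are assumptions added for illustration:

```python
import re

NON_DIGITS = re.compile(r'[^0-9]+')
strings_with_digit = ["ab1234", "cde1234", "fgh5678", "xyz"]

# Rebuild each string from its own non-digit pieces,
# so a digit-free entry such as 'xyz' never merges with a neighbour.
cleaned = [''.join(NON_DIGITS.findall(s)) for s in strings_with_digit]
print(cleaned)  # ['ab', 'cde', 'fgh', 'xyz']
```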
BAlfa
0

You can try this regex:

^[a-zA-Z]+

It matches only the consecutive letters at the start of the string and ignores everything after them, so no replacement is needed.
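
One way this pattern might be used in Python, as a sketch; the helper name `leading_letters` is hypothetical:

```python
import re

LEADING_LETTERS = re.compile(r'^[a-zA-Z]+')

def leading_letters(s):
    # Return the run of ASCII letters at the start of s, or '' if there is none.
    m = LEADING_LETTERS.match(s)
    return m.group(0) if m else ''

strings = ['c1309', 'IF1306', 'v1309', 'p1209', 'a1309', 'mo1309']
prefixes = [leading_letters(s) for s in strings]
print(prefixes)  # ['c', 'IF', 'v', 'p', 'a', 'mo']
```

Unlike the substitution approaches, this drops any letters that come after the first digit.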

NeverHopeless