
I have one long string: "M. tuberculosis H37Rv|Rv0153c|ptbB"

The output should look like this (either only the digits, or the string with the digits removed):

"370153"

or

"M.tuberculosisHRv|Rvc|ptbB"

thanks

Erik
user2935002
  • You could use a [regular expression](http://docs.python.org/3/library/re.html). How did you try it? – Matthias Dec 17 '13 at 10:34

2 Answers


You can use re.sub:

>>> import re
>>> re.sub(r'[0-9]', '', 'M. tuberculosis H37Rv|Rv0153c|ptbB')
'M. tuberculosis HRv|Rvc|ptbB'
>>> re.sub(r'[^0-9]', '', 'M. tuberculosis H37Rv|Rv0153c|ptbB')
'370153'
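
If the same substitution is applied to many strings, the two patterns can be compiled once and wrapped in small helpers. This is only a sketch: the names `strip_digits` and `keep_digits` are made up for illustration and are not part of the `re` module.

```python
import re

# Compile each pattern once so it can be reused across many calls.
_DIGIT = re.compile(r'[0-9]')
_NON_DIGIT = re.compile(r'[^0-9]')


def strip_digits(text):
    """Return text with every ASCII digit removed (hypothetical helper)."""
    return _DIGIT.sub('', text)


def keep_digits(text):
    """Return only the ASCII digits of text, in order (hypothetical helper)."""
    return _NON_DIGIT.sub('', text)


s = 'M. tuberculosis H37Rv|Rv0153c|ptbB'
print(strip_digits(s))
print(keep_digits(s))
```

The character class `[0-9]` matches ASCII digits only. In Python 3, `\d` on a `str` pattern also matches other Unicode decimal digits, so the two are not interchangeable for non-ASCII input.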
phihag

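Try this:

In [3]: s = 'M. tuberculosis H37Rv|Rv0153c|ptbB'

In [4]: ''.join([x for x in s if x.isdigit()])
Out[4]: '370153'

A generator expression gives the same result without building an intermediate list. Which of the two is faster depends on the Python version, so time it if it matters:

In [5]: ''.join(x for x in s if x.isdigit())
Out[5]: '370153'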
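
The comments below disagree about which variant is faster, so it may be easiest to measure on your own interpreter. The following is a rough sketch using the standard `timeit` module; the function names are invented for this example, and the absolute numbers will vary by machine and Python version.

```python
import timeit

s = 'M. tuberculosis H37Rv|Rv0153c|ptbB'


def digits_listcomp(text):
    # Builds a temporary list, then joins it.
    return ''.join([ch for ch in text if ch.isdigit()])


def digits_genexp(text):
    # Feeds join() from a generator expression instead of a list.
    return ''.join(ch for ch in text if ch.isdigit())


def digits_filter(text):
    # filter() with the unbound str.isdigit method (Python 3).
    return ''.join(filter(str.isdigit, text))


if __name__ == '__main__':
    for func in (digits_listcomp, digits_genexp, digits_filter):
        seconds = timeit.timeit(lambda: func(s), number=100000)
        print('%-16s %.3f s' % (func.__name__, seconds))
```

All three functions return the same string, so the choice between them is about readability and measured speed rather than behavior.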
greg
  • Why are you using a list comprehension instead of a generator expression? I think the latter is much more appropriate here, almost certainly faster, and shorter to boot. – phihag Dec 17 '13 at 10:39
  • No, I mean [generator expressions](http://docs.python.org/dev/reference/expressions.html#generator-expressions). You can simply drop the brackets to make your program use them. This means that Python won't senselessly allocate a list for your results, iterate once over it, and then throw it away. – phihag Dec 17 '13 at 10:44
  • @phihag A generator expression is slower using Python 2.7 according to `python -m timeit "''.join(ch for ch in 'M. tuberculosis H37Rv|Rv0153c|ptbB' if ch.isdigit())"` – Frerich Raabe Dec 17 '13 at 10:47
  • You are right, the generator is much faster. Thank you! – greg Dec 17 '13 at 10:50
  • I found that using list comprehension took 80% of the time that a generator expression took, at least in this example. I'm not sure how the timing would change for strings of increasing length though? – Ffisegydd Dec 17 '13 at 10:50
  • @FrerichRaabe You're right, they are not faster. In my tests, both variants seem to have about the same runtime. In any case, the main argument for generator expressions is simplicity and showing the intent of the code. – phihag Dec 17 '13 at 10:56
  • @phihag You may find that something based on `filter` is even faster, e.g. maybe `python -m timeit -s "import operator; f = operator.methodcaller('isdigit'); s = 'M. tuberculosis H37Rv|Rv0153c|ptbB'" "filter(f, s)"` is even more efficient for you. – Frerich Raabe Dec 17 '13 at 10:58
  • Sadly enough, in Python 2.x a solution based on `translate` seems to be by far the fastest for me. Try `python -m timeit -s "import string; allChars = string.maketrans('', ''); allButDigits = allChars.translate(allChars, string.digits)" "'M. tuberculosis H37Rv|Rv0153c|ptbB'.translate(allChars, allButDigits)"`. – Frerich Raabe Dec 17 '13 at 11:09
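
The `translate` command in the last comment is Python 2 only, because `string.maketrans` and the two-argument form of `str.translate` do not exist in Python 3. A possible Python 3 counterpart, offered as a sketch rather than a benchmarked replacement, uses `str.maketrans` with its third argument (the characters to delete). The variable names are illustrative.

```python
import string

s = 'M. tuberculosis H37Rv|Rv0153c|ptbB'

# Remove digits: the third argument of str.maketrans lists characters
# that translate() should delete.
drop_digits = str.maketrans('', '', string.digits)
without_digits = s.translate(drop_digits)

# Keep only digits: delete every character occurring in s that is not
# an ASCII digit. The table is built from this particular string, so it
# must be rebuilt for input containing other characters.
non_digits = ''.join(set(s) - set(string.digits))
drop_others = str.maketrans('', '', non_digits)
only_digits = s.translate(drop_others)

print(without_digits)
print(only_digits)
```

Whether this is still the fastest option in Python 3 is not something the thread establishes; it would need to be timed.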