Python strip() multiple characters?

Question

I want to remove any brackets from a string. Why doesn't this work properly?

>>> name = "Barack (of Washington)"
>>> name = name.strip("(){}<>")
>>> print name
Barack (of Washington

score 121 · Answer 1 · answered Oct 10 '10 at 11:17

Because that's not what strip() does. It removes leading and trailing characters that are present in the argument, but not those characters in the middle of the string.

You could do:

name= name.replace('(', '').replace(')', '').replace ...

or:

name= ''.join(c for c in name if c not in '(){}<>')

or maybe use a regex:

import re
name= re.sub('[(){}<>]', '', name)
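A minimal Python 3 sketch of the behaviour described above, using the `name` string from the question. `strip()` peels matching characters off each end and stops at the first character that is not in the set, so the inner `(` survives. The helper `remove_chars` is a hypothetical name for illustration, not a standard-library function.

```python
name = "Barack (of Washington)"

# strip() only trims the ends: the trailing ")" goes, the inner "(" stays.
stripped = name.strip("(){}<>")
print(stripped)  # Barack (of Washington

def remove_chars(text, unwanted):
    """Return text with every character found in `unwanted` removed."""
    for ch in unwanted:
        text = text.replace(ch, "")
    return text

cleaned = remove_chars(name, "(){}<>")
print(cleaned)  # Barack of Washington
```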

JasonFruit · Accepted Answer · 2010-10-11T03:12:37.540

I ran a timing test, calling each method 100,000 times in a loop. The results surprised me. (They still surprise me after I edited them in response to valid criticism in the comments.)

Here's the script:

import timeit

bad_chars = '(){}<>'

setup = """import re
import string
s = 'Barack (of Washington)'
bad_chars = '(){}<>'
rgx = re.compile('[%s]' % bad_chars)"""

timer = timeit.Timer('o = "".join(c for c in s if c not in bad_chars)', setup=setup)
print "List comprehension: ",  timer.timeit(100000)


timer = timeit.Timer("o= rgx.sub('', s)", setup=setup)
print "Regular expression: ", timer.timeit(100000)

timer = timeit.Timer('for c in bad_chars: s = s.replace(c, "")', setup=setup)
print "Replace in loop: ", timer.timeit(100000)

timer = timeit.Timer('s.translate(string.maketrans("", ""), bad_chars)', setup=setup)
print "string.translate: ", timer.timeit(100000)

Here are the results:

List comprehension:  0.631745100021
Regular expression:  0.155561923981
Replace in loop:  0.235936164856
string.translate:  0.0965719223022

Results on other runs follow a similar pattern. If speed is not the primary concern, however, I think string.translate is the least readable of the four; the other three are more obvious, though slower to varying degrees.
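The script above is Python 2. A rough Python 3 port might look like the sketch below; the function names are hypothetical, the iteration count is reduced, and no timings are claimed, since they vary by machine and interpreter version.

```python
import re
import timeit

s = "Barack (of Washington)"
bad_chars = "(){}<>"
rgx = re.compile("[%s]" % re.escape(bad_chars))
table = str.maketrans("", "", bad_chars)  # maps each bad character to None

def with_join(text):
    return "".join(c for c in text if c not in bad_chars)

def with_regex(text):
    return rgx.sub("", text)

def with_replace(text):
    for c in bad_chars:
        text = text.replace(c, "")
    return text

def with_translate(text):
    return text.translate(table)

candidates = [with_join, with_regex, with_replace, with_translate]
results = {f.__name__: f(s) for f in candidates}

for f in candidates:
    # Each call receives the original string, so no run sees pre-cleaned input.
    seconds = timeit.timeit(lambda: f(s), number=10000)
    print("%-15s %.4f s" % (f.__name__, seconds))
```

Passing the string as an argument is a deliberate choice here: a timed statement that rebinds `s` would operate on the already cleaned string after its first pass.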

thanks for this - educative question, not only did I learn that strip() doesn't do what I thought, I also learned three other ways to achieve what I wanted, and which was the fastest! — AP257, Oct 11 '10 at 15:53
wont work for unicode: translate() only takes one argument (table) with unicode. — Rich Tier, Jul 16 '13 at 18:32
minus 1: For making something about speed which should be about code clarity and robustness. — jwg, Jan 05 '16 at 10:52
@jwg , you're totally right; looking back at this, it's an exercise in high-visibility missing of the point. Somehow, though, it keeps getting upvotes. (I did learn some interesting things about using timeit, though.) — JasonFruit, Aug 21 '17 at 11:57

score 17 · Answer 3 · answered May 07 '14 at 09:52

string.translate with table=None works fine.

>>> name = "Barack (of Washington)"
>>> name = name.translate(None, "(){}<>")
>>> print name
Barack of Washington

cherish

This doesn't work in Python 3 for strings, only for bytes and bytearray. – Mark Lawrence Feb 26 '18 at 20:40
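For Python 3 text strings, a sketch of the equivalent: `str.translate` takes a single table, and `str.maketrans` accepts a third argument listing characters to delete. The variable names are illustrative.

```python
name = "Barack (of Washington)"

# Python 3: build a table whose third argument lists characters to delete.
delete_brackets = str.maketrans("", "", "(){}<>")
cleaned = name.translate(delete_brackets)
print(cleaned)  # Barack of Washington

# bytes still accept the two-argument form with table=None.
raw = b"Barack (of Washington)"
cleaned_bytes = raw.translate(None, b"(){}<>")
print(cleaned_bytes)  # b'Barack of Washington'
```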

score 16 · Answer 4 · answered Oct 10 '10 at 11:20

Because strip() only strips trailing and leading characters, based on what you provided. I suggest:

>>> import re
>>> name = "Barack (of Washington)"
>>> name = re.sub(r'[\(\)\{\}<>]', '', name)
>>> print(name)
Barack of Washington

Ruel

In a regex character class you don't need to escape anything, so '[(){}<>]' is fine – Mike Axiak Oct 10 '10 at 17:16
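A short sketch of the point about escaping, with one caveat: the brackets in question are literal inside a character class, but `]`, a backslash, `^` and `-` can still be special there. When the set is built from arbitrary characters, `re.escape` is one way to stay safe. The sample strings are made up for illustration.

```python
import re

name = "Barack (of Washington)"

# These brackets need no escaping inside a character class.
plain = re.sub(r"[(){}<>]", "", name)
print(plain)  # Barack of Washington

# "]", "^" and "-" can be special inside a class, so escape the set
# when it is assembled from arbitrary characters.
unwanted = "[](){}<>^-"
pattern = re.compile("[" + re.escape(unwanted) + "]")
cleaned = pattern.sub("", "a[b](c){d}<e>^f-g")
print(cleaned)  # abcdefg
```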

score 10 · Answer 5 · answered Oct 10 '10 at 11:15

strip only strips characters from the very front and back of the string.

To delete a list of characters, you could use the string's translate method:

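import string  # Python 2 only: string.maketrans and the two-argument translate()
name = "Barack (of Washington)"
table = string.maketrans('', '')  # identity table; nothing is remapped
print name.translate(table, "(){}<>")  # second argument lists characters to delete
# Barack of Washington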

score 0 · Answer 6 · answered Feb 07 '23 at 17:26

Since strip only removes characters from the start and end, one idea is to split the string into a list of words, strip the characters from each word, and then join the words again:

s = 'Barack (of Washington)'
# Strip the brackets from both ends of every whitespace-separated word.
x = [j.strip('(){}<>') for j in s.split()]
ans = ' '.join(x)
print(ans)  # Barack of Washington
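A brief sketch of where this approach stops working, wrapped in a hypothetical helper `strip_each_word`: brackets in the middle of a word are left alone, and runs of whitespace collapse to a single space because of `split()`.

```python
def strip_each_word(text, unwanted="(){}<>"):
    # Hypothetical helper wrapping the split/strip/join idea above.
    return " ".join(word.strip(unwanted) for word in text.split())

print(strip_each_word("Barack (of Washington)"))  # Barack of Washington

# Brackets inside a word survive, and repeated spaces become one space.
print(strip_each_word("f(x)  =  <a b>"))  # f(x = a b
```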

score -4 · Answer 7 · edited Oct 07 '17 at 12:53

For example, take the string s = "(U+007c)".

To remove only the parentheses from s, try the following:

import re
s = "(U+007c)"
# Raw strings keep the backslash escapes readable.
a = re.sub(r"\(", "", s)
b = re.sub(r"\)", "", a)
print(b)  # U+007c

answered Aug 06 '17 at 11:35 by Siva Kumar · edited Oct 07 '17 at 12:53 by alexander.polomodov

How does that remove parenthesis? By removing anything that isn't alphanumeric? – Jeff Schaller Aug 06 '17 at 11:52
When the question says "remove parenthesis" but your answer says "removing anything that's not alphanumeric", I don't think you're addressing the question. – Jeff Schaller Aug 06 '17 at 11:57
