Possible Duplicate:
Python replace multiple strings

I am looking to replace “ “, “\r”, “\n”, “<”, “>”, “’” (single quote), and ‘”’ (double quote) with “” (empty). I’m also looking to replace “;” and “|” with “,”.

Would this be handled by re.search, since I want to be able to search anywhere in the text, or should I use re.sub?

What would be the best way to handle this? I have found bits and pieces, but not where multiple regexes are handled.

kamal

3 Answers

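If you want to remove all occurrences of those characters, put them all in a character class and call re.sub():

import re

# Delete spaces, CR, LF, angle brackets, and both quote characters.
your_str = re.sub(r'[ \r\n<>\'"]+', '', your_str)
# Replace each ';' or '|' with a comma.
your_str = re.sub(r'[;|]', ',', your_str)

With a plain replacement string, you have to call re.sub() once for every replacement rule.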
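To make the two-step approach concrete, and to address the re.search part of the question: re.search only reports the first match and never changes the string, whereas re.sub returns a new string with every match replaced. Here is a minimal sketch that wraps both rules in one helper. The function name `clean` and the sample input are made up for illustration.

```python
import re

def clean(text):
    # Rule 1: delete spaces, CR, LF, angle brackets, and both quote characters.
    text = re.sub(r"[ \r\n<>'\"]", "", text)
    # Rule 2: turn every ';' or '|' into a comma.
    return re.sub(r"[;|]", ",", text)

print(clean("a <b>;'c'|\"d\"\r\n"))  # -> ab,c,d
```

The order of the two calls does not matter here because the two character sets do not overlap.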

NullUserException
  • So your_str would be an instance of the character class? I'm not sure what you mean by "put them all in a character class". – kamal Sep 26 '11 at 17:40
  • Read [this tutorial](http://www.regular-expressions.info/charclass.html) for more info. This is elementary regex stuff. – Tim Pietzcker Sep 26 '11 at 17:53
  • +1 for having thought to place '+' after the brackets (I had forgotten) – eyquem Sep 26 '11 at 17:56

If you only need to replace single characters, you could use str.translate() (shown here in Python 2 syntax):

import string

table = string.maketrans(';|', ',,')
deletechars = ' \r\n<>\'"'

print "ex'a;m|ple\n".translate(table, deletechars)
# -> exa,m,ple
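
The snippet above relies on Python 2's string.maketrans and the two-argument form of translate(). On Python 3 the same idea could be sketched with str.maketrans, whose optional third argument lists the characters to delete. The helper name `clean` is only illustrative.

```python
# Python 3: str.maketrans(from_chars, to_chars, delete_chars) builds one table
# that both maps ';' and '|' to ',' and drops the unwanted characters.
table = str.maketrans(";|", ",,", " \r\n<>'\"")

def clean(text):
    # translate() applies the whole table in a single pass over the string.
    return text.translate(table)

print(clean("ex'a;m|ple\n"))  # -> exa,m,ple
```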
jfs
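import re

# Group 1: characters to delete; group 2: characters to turn into a comma.
reg = re.compile('([ \r\n<>\'"]+)|([;|]+)')

ss = 'bo ba\rbu\nbe\'bi"by-ja;ju|jo'

def repl(mat, di={1: '', 2: ','}):
    # mat.lastindex is the number of the group that matched.
    return di[mat.lastindex]

# The parenthesized call works in both Python 2 and Python 3.
print(reg.sub(repl, ss))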

Note: '|' loses its special meaning inside square brackets.
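
A quick way to see that note in action is shown below. The variable names and the sample string are just for illustration.

```python
import re

sample = "a;b|c"

# Inside [...] the pipe is an ordinary character, so this matches ';' and '|'.
in_class = re.findall(r"[;|]", sample)
# Outside brackets the pipe separates alternatives, here 'a' or 'b'.
alternation = re.findall(r"a|b", sample)

print(in_class)     # [';', '|']
print(alternation)  # ['a', 'b']
```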

eyquem