0

I am trying to use re.split in Python. I want to remove all of these characters: " , ; < > { } [ ] / \ ? ! . I am trying to do something like this:

re.split("[, \_!?,;:-]+", word)

How can I add characters like " ( ) or < > ' so that they can also be removed?

Edit

re.split('\W+',word)

This works fine, but it does not remove the underscore. How can I also remove underscores?

DilithiumMatrix
Noober

2 Answers

2

Check out the str.translate function. For example, in Python 2.6+:

line = line.translate(None, " ?.!/;:")

or in Python 3+, where translate takes a mapping table built with str.maketrans:

line = line.translate(str.maketrans("", "", " ?.!/;:"))

See "Remove specific characters from a string in Python".
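As a minimal sketch of the Python 3 approach, assuming the characters listed in the question (plus the underscore) are the ones to delete. The names UNWANTED, DELETE_TABLE, and strip_chars are made up for illustration:

```python
# Characters to delete; this exact set is an assumption based on the question.
UNWANTED = " \"',;<>{}[]()/\\?!._"

# With three arguments, str.maketrans maps every character in the third
# argument to None, and str.translate deletes characters mapped to None.
DELETE_TABLE = str.maketrans("", "", UNWANTED)


def strip_chars(text):
    """Return text with every character in UNWANTED removed (hypothetical helper)."""
    return text.translate(DELETE_TABLE)


print(strip_chars('he_llo, <wor/ld>!'))  # prints: helloworld
```

Building the table once and reusing it avoids rebuilding the mapping on every call.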

Community
pwilmot
2

Try:

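re.split(r'[\W_]+', word)

Putting \W and _ in one character class treats a run of punctuation and underscores as a single separator. Or just remove them:

re.sub(r'[\W_]+', '', word)

Take a look at the re module documentation for more details.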
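A short sketch of both operations, assuming a character class that combines \W with the underscore. The helper names split_words and remove_separators are hypothetical. Note that re.split returns empty strings when the text starts or ends with a separator, so this version filters them out:

```python
import re

# \W matches anything that is not a letter, digit, or underscore;
# adding _ to the class makes the underscore a separator as well.
SEPARATORS = re.compile(r"[\W_]+")


def split_words(text):
    """Split on runs of separators and drop empty pieces (hypothetical helper)."""
    return [piece for piece in SEPARATORS.split(text) if piece]


def remove_separators(text):
    """Delete every separator character from text (hypothetical helper)."""
    return SEPARATORS.sub("", text)


print(split_words('<foo_bar>, "baz"!'))        # prints: ['foo', 'bar', 'baz']
print(remove_separators('<foo_bar>, "baz"!'))  # prints: foobarbaz
```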

Remi Guan