How can I write a regular expression in Python that inserts a space in front of the special characters , ? and ! when they are attached directly to a word?

Here is the input string:

myString= 'I like him, but is he good? Maybe he is good , smart, and strong.'

Desired output (if the special character doesn't stick to a word, it is not modified):

modifiedString= 'I like him , but is he good ? Maybe he is good , smart , and strong.'

I have tried this re:

modifiedString= re.sub('\w,' , ' ,' ,myString)

But it gives the wrong result: it removes the last character before each comma. Here is the output:

modifiedString = 'I like hi , but is he good? Maybe he is good , smar , and strong.'

Any suggestions on how to solve this problem?

SalacceoVanz

3 Answers

5

You can use re.sub:

>>> import re
>>> myString= 'I like him, but is he good? Maybe he is good , smart, and strong.'
>>> re.sub(r'(?<=\w)([!?,])', r' \1', myString)
'I like him , but is he good ? Maybe he is good , smart , and strong.'
>>>

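(?<=\w) is a positive lookbehind assertion: it requires the preceding character to be a word character, but does not include that character in the match, so it is not replaced.

([!?,]) is a capture group that matches one character from the set [!?,] (you can add any other characters you want to match inside the square brackets).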

\1 refers to the text captured by ([!?,]).
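
The same pattern can be wrapped in a small helper so that the set of characters is configurable. This is only a minimal sketch: the function name add_space_before and its chars parameter are hypothetical and not part of the answer above. re.escape is used so that characters with a special meaning inside a character class (such as ] or ^) are treated literally.

```python
import re

def add_space_before(text, chars=",?!"):
    """Insert a space before each character in `chars` that directly follows a word character.

    Hypothetical helper for illustration; `chars` is an assumed parameter.
    """
    # Build the character class from `chars`, escaping anything special.
    pattern = r"(?<=\w)([" + re.escape(chars) + r"])"
    # The lookbehind leaves the word character untouched; \1 re-inserts the punctuation.
    return re.sub(pattern, r" \1", text)

my_string = 'I like him, but is he good? Maybe he is good , smart, and strong.'
print(add_space_before(my_string))
```

Punctuation that is already preceded by a space (the second comma in the sample) is left alone, because the lookbehind fails there.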

  • Great explanation. It works, thanks! I wonder how the \1 works. Does it return the whole matched pattern, like the $ symbol in C# regex in http://stackoverflow.com/questions/24722423/c-sharp-regex-replace-with-itself ? – SalacceoVanz Nov 07 '14 at 16:39
  • @SalacceoVanz - Somewhat like that. The numbers `\1`, `\2`, etc. refer to the text matched by the various capture groups in the Regex pattern. My pattern only has one capture group, `([!?,])`, so whatever is matched by this will be accessible through `\1`. –  Nov 07 '14 at 16:42
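
To make the point about group references concrete, here is a short sketch of how they behave in the replacement string of re.sub. The sample string is made up for illustration. Besides numbered groups, Python also accepts \g<0>, which stands for the entire match, so a capture group is not strictly required.

```python
import re

text = "good? smart,"

# \1 is the text captured by the first (and only) group.
one_group = re.sub(r"(?<=\w)([!?,])", r" \1", text)

# With two groups, \1 and \2 refer to the groups in order of their opening parenthesis.
two_groups = re.sub(r"(\w)([!?,])", r"\1 \2", text)

# \g<0> refers to the whole match, so no capture group is needed here.
whole_match = re.sub(r"(?<=\w)[!?,]", r" \g<0>", text)

print(one_group)
print(two_groups)
print(whole_match)
```

All three calls produce the same string for this input; they differ only in how the matched text is referenced.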
0

The problem is that your pattern also matches the word character before the comma, so that character gets replaced too. You need to preserve it by capturing it in a group and then referring to the group number in the replacement string.

>>> import re
>>> myString = 'I like him, but is he good? Maybe he is good , smart, and strong.'
>>> re.sub(r'(\w)([,?!])' , r'\1 \2' ,myString)
'I like him , but is he good ? Maybe he is good , smart , and strong.'
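
To see why the pattern from the question drops a letter, it can help to look at what each pattern actually matches. The following is a small illustrative sketch; the variable names are chosen for the example only.

```python
import re

my_string = 'I like him, but is he good? Maybe he is good , smart, and strong.'

# The pattern from the question matches the letter AND the comma,
# so re.sub replaces both of them with ' ,'.
consumed = re.findall(r'\w,', my_string)
print(consumed)   # ['m,', 't,']

# With two groups, the letter and the comma are captured separately,
# so the letter can be written back via \1.
kept = re.findall(r'(\w)(,)', my_string)
print(kept)       # [('m', ','), ('t', ',')]

# Reproducing the faulty result from the question:
broken = re.sub(r'\w,', ' ,', my_string)
print(broken)
```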
Irshad Bhat
0

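As an alternative, you can do this without a regex by using str.replace. Note that this adds a space before every occurrence of the character, including ones that are already preceded by a space (see the double space before the second comma in the output):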

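>>> myString = 'I like him, but is he good? Maybe he is good , smart, and strong.'
>>> for ch in ['?', ',', '!']:
...     # Prepend a space to every occurrence of this character.
...     myString = myString.replace(ch, ' ' + ch)
...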
>>> myString
'I like him , but is he good ? Maybe he is good  , smart , and strong.'
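
If the double space is a problem and a regex is still not wanted, one possible variation is to walk through the string and insert a space only when the punctuation directly follows a word character. This is a sketch under assumptions: the function name add_space_no_regex and the chars parameter are hypothetical, and str.isalnum() plus an underscore check is used as an approximation of what \w matches.

```python
def add_space_no_regex(text, chars=",?!"):
    """Insert a space before punctuation that directly follows a word character.

    Hypothetical helper; approximates \\w with isalnum() or an underscore.
    """
    out = []
    for ch in text:
        # Only add a space when the previous character is a letter, digit or underscore.
        if ch in chars and out and (out[-1].isalnum() or out[-1] == "_"):
            out.append(" ")
        out.append(ch)
    return "".join(out)

my_string = 'I like him, but is he good? Maybe he is good , smart, and strong.'
print(add_space_no_regex(my_string))
```

Unlike the plain str.replace loop, this leaves the comma in 'good ,' untouched because it is preceded by a space.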
Mazdak