Remove whitespace after certain set of characters

Question

For example, I want to change the following string

strr = 'Hello, this is a test to remove whitespace.'

To

'Hello,this is a testto removewhitespace.'

So the whitespace directly after a comma, 't' or 'e' character should be removed. I tried something like:

re.sub(', |t |e ', ' ', strr)

However, this removes the comma, t and e as well. Afterwards, I am trying to split the string on the remaining whitespaces. My first approach was to split like this

re.split(' is |a |test|remove', strr)

However, this removes the delimiters as well, which is not what I want to achieve. So basically, I want to provide a list of characters followed by whitespace, such that the whitespace in that substring is removed.

Use `re.sub(r'([,te]) ', r'\1', strr)` or `re.sub(r'([,te])\s+', r'\1', strr)` — Wiktor Stribiżew, Oct 15 '18 at 20:13

score 2 · Accepted Answer · answered Oct 15 '18 at 20:14

Something like:

import re

str1 = 'Hello, this is a test to remove whitespace.'

str2 = re.sub(r'([te,])\s+', r'\1', str1)

print(str2)

This should work: you match (and capture) one of the known characters, followed by any amount of whitespace, and replace the whole match with just the character you captured.
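Since the question asks for a way to supply the characters as a list, the same idea can be wrapped in a small helper. This is only a sketch: the function name `strip_space_after` and its parameters are made up for illustration, and it assumes every entry is a single character.

```python
import re

def strip_space_after(text, chars):
    """Remove whitespace that directly follows any character in `chars`.

    `strip_space_after` is a hypothetical helper, not part of the re module.
    """
    if not chars:
        # An empty character class ('[]') is not a valid pattern,
        # so there is nothing to do in that case.
        return text
    # re.escape keeps characters such as ']' or '^' literal inside the class.
    char_class = '[' + ''.join(re.escape(c) for c in chars) + ']'
    # Capture the character, match the whitespace after it,
    # and put back only the captured character.
    return re.sub('(' + char_class + r')\s+', r'\1', text)

str1 = 'Hello, this is a test to remove whitespace.'
print(strip_space_after(str1, [',', 't', 'e']))
# Hello,this is a testto removewhitespace.
```

Passing a plain string such as ',te' instead of a list works the same way, because the helper only iterates over the characters.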

score 0 · Answer 2 · answered Oct 15 '18 at 20:13

You can use a positive lookbehind [regex-tutorial] for this:

re.sub('(?<=[,te]) ', '', strr)

The positive lookbehind (?<=...) block checks that one of the listed characters comes right before the space, but that character is not part of the match, so you do not "eat" it when you replace the space.

Note that the second parameter should be the empty string (so '', not ' '); otherwise you "reintroduce" the space.

This then yields:

>>> re.sub('(?<=[,te]) ', '', strr)
'Hello,this is a testto removewhitespace.'

In case you want to remove an arbitrary number (so one or more) of whitespace characters (spaces, new lines, etc.), you can use \s+ instead:

>>> re.sub(r'(?<=[,te])\s+', '', strr)
'Hello,this is a testto removewhitespace.'
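For the follow-up step mentioned in the question (splitting on the whitespace that remains), one possible sketch is to run the substitution first and then call plain `str.split()`. The variable names `joined` and `parts` are illustrative only, and this assumes the goal is a list of the merged tokens.

```python
import re

strr = 'Hello, this is a test to remove whitespace.'

# Step 1: drop whitespace that directly follows ',', 't' or 'e'.
joined = re.sub(r'(?<=[,te])\s+', '', strr)

# Step 2: split on whatever whitespace is left; str.split() with no
# arguments splits on any run of whitespace and keeps all other characters.
parts = joined.split()

print(parts)
# ['Hello,this', 'is', 'a', 'testto', 'removewhitespace.']
```

Because nothing is used as a delimiter except the remaining whitespace, no letters or punctuation are lost, unlike the re.split attempt in the question.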
