I am trying to put together a regular expression that removes consecutive repeated words.
>>> import re
>>> re.sub(r'\b([a-zA-Z0-9_]{1,})(\s{1,}\1\b)+', '\\1', "removing redundant redundant redundant")
'removing redundant'
Now I am stuck on the case where the repeated words start with a special character. A # inside the word works, but a leading # does not:
>>> re.sub(r'\b([#a-zA-Z0-9_]{1,})(\s{1,}\1\b)+', '\\1', "removing red#undant red#undant red#undant")
'removing red#undant'
>>> re.sub(r'\b([#a-zA-Z0-9_]{1,})(\s{1,}\1\b)+', '\\1', "removing #redundant #redundant #redundant")
'removing #redundant #redundant #redundant'
I tried adding backslashes and many other combinations, but nothing worked. I am sure there is something fundamental that I am not understanding or am missing.
Thanks
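
A likely explanation, offered as a sketch rather than a definitive fix: `\b` only matches between a word character (`[a-zA-Z0-9_]`) and a non-word character. A space followed by `#` is two non-word characters in a row, so `\b` cannot match right before a leading `#`, and the pattern never gets to capture `#redundant` as a whole. One possible workaround is to replace `\b` with whitespace lookarounds, so that a "word" is any run of non-whitespace characters. The names `DUPLICATE_TOKENS` and `remove_repeats` below are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical names, not from the original post.
# (?<!\S)  : the token must start at the string start or after whitespace
# (\S+)    : the token itself, any run of non-whitespace characters
# (?:\s+\1(?!\S))+ : one or more whitespace-separated repeats of the same
#                    token, each ending at whitespace or the string end
DUPLICATE_TOKENS = re.compile(r'(?<!\S)(\S+)(?:\s+\1(?!\S))+')


def remove_repeats(text):
    """Collapse consecutive repeated whitespace-delimited tokens into one."""
    return DUPLICATE_TOKENS.sub(r'\1', text)


print(remove_repeats("removing redundant redundant redundant"))
print(remove_repeats("removing red#undant red#undant red#undant"))
print(remove_repeats("removing #redundant #redundant #redundant"))
```

Note that this sketch treats attached punctuation as part of the token, so `"word word."` would not be collapsed, because `word` and `word.` are different tokens. Whether that is acceptable depends on the input.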