I want to create a regex pattern which finds whitespaces and ignore hyphen seperated words.
The basic rule is to find any subsequent whitespaces([\s]+
), and do not find whitespaces where the pattern is:
[\S]+-[\s]+[\S]+
(The pattern of which i don't want to match the whitespaces)
Any other whitespaces should match.
Matched intervals should include whitespaces only, not other characters.
For example:
abc abc
should match at position 3-4.
abc
def
should match from the end of abc to start of def.
abc-
def
should not match.
abc -
def
should match at 3-4, 5-6.
The searched string is multiline and has many occurences of whitespaces, and i want to find them all in a single search.
Tried many different patterns (with negative lookahead and lookbehind) but none was able to apply for all cases.
Using python builtin re
module.
It is possible to do in two searches:
search for all occurences of
[\s]+
search for all occurences of
[\S]+-([\s]+)[\S]+
remove matches of the group in (2) from matches in (1)
Is it possible to do in a single search?