Every topic I've read combining Python's regex support (the re library) and inverse/negative matching has focused on multiline strings as opposed to SINGLE-line strings.
I know that http://www.regextester.com/15 uses a JavaScript regex engine, displays every match in the string (the /g flag), and behaves differently from Python's re library. (According to https://rexegg.com/ there is apparently another regex library for Python, which I don't wish to use just yet.) Beyond that, I wanted to know if there is a way to use "re.findall" (and yes, re.search too, although I'm partial to re.findall) to do 2 things:
1. Return all individual strings that do not contain the string "hede" in qw below.
2. Return all individual strings that do not contain the string "hede", and break strings that do contain "hede" into the pieces on either side of it.
>>> qw = "hoho hihi haha hede rara a rere titi so whdhdskhdshede wekjewhkwqjhededjfjfj so kjkfdjkdnekjdhide b hede kdjkdld"
Scenario 1 Desired Output (exclude all strings that contain "hede"):
>>> qw ='hoho hihi haha hede rara a rere titi so whdhdskhdshede wekjewhkwqjhededjfjfj so kjkfdjkdnekjdhide b hede kdjkdld'
>>> re.findall('{SOMETHING_THAT_EXCLUDES_ALL_STRINGS_CONTAINING_hede}', qw)
['hoho', 'hihi', 'haha', 'rara', 'a', 'rere', 'titi', 'so', 'so', 'kjkfdjkdnekjdhide', 'b', 'kdjkdld']
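For reference, one pattern that might produce this, offered only as a minimal sketch: it assumes "strings" means whitespace-separated terms, and the name no_hede is just an illustrative choice. The idea is to start a match only at the beginning of a term and to reject the term if "hede" shows up before the next whitespace.

```python
import re

qw = ('hoho hihi haha hede rara a rere titi so whdhdskhdshede '
      'wekjewhkwqjhededjfjfj so kjkfdjkdnekjdhide b hede kdjkdld')

# (?<!\S)      only start where no non-space character precedes (start of a term)
# (?!\S*hede)  give up on this term if "hede" occurs before the next whitespace
# \S+          otherwise take the whole term
no_hede = re.findall(r'(?<!\S)(?!\S*hede)\S+', qw)
print(no_hede)
```

The lookbehind matters: without it, the search could restart in the middle of a rejected term and return a fragment of it.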
Scenario 2 Desired Output (include everything that doesn't contain "hede" and break strings containing "hede" at "hede"):
>>> qw ='hoho hihi haha hede rara a rere titi so whdhdskhdshede wekjewhkwqjhededjfjfj so kjkfdjkdnekjdhide b hede kdjkdld'
>>> re.findall('{SOMETHING_THAT_INCLUDES_ALL_STRINGS_NOT_CONTAINING_hede_AND_BREAKS_THEM_IF_THEY_DO}', qw)
['hoho', 'hihi', 'haha', 'rara', 'a', 'rere', 'titi', 'so', 'whdhdskhds', 'wekjewhkwqj', 'djfjfj', 'so', 'kjkfdjkdnekjdhide', 'b', 'kdjkdld']
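A possible sketch for this second scenario, again assuming whitespace-separated terms (the names pieces and split_pieces are illustrative, not from any library): allow a match to begin either at the start of a term or immediately after a "hede", then consume non-space characters one at a time, stopping just before the next "hede". A re.split version is included as a cross-check.

```python
import re

qw = ('hoho hihi haha hede rara a rere titi so whdhdskhdshede '
      'wekjewhkwqjhededjfjfj so kjkfdjkdnekjdhide b hede kdjkdld')

# (?:(?<!\S)|(?<=hede))  begin at the start of a term, or right after a "hede"
# (?:(?!hede)\S)+        take non-space characters until "hede" or whitespace is next
pieces = re.findall(r'(?:(?<!\S)|(?<=hede))(?:(?!hede)\S)+', qw)
print(pieces)

# Cross-check: treat both whitespace and "hede" as separators, drop empty strings.
split_pieces = [p for p in re.split(r'\s+|hede', qw) if p]
print(split_pieces == pieces)
```

If re.findall is not a hard requirement, the re.split form is probably the easier one to read and maintain.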
The closest I've come is very inefficient and still misses most of the terms:
>>> qw ='hoho hihi haha hede rara a rere titi so whdhdskhdshede wekjewhkwqjhededjfjfj so kjkfdjkdnekjdhide b hede kdjkdld'
>>> re.findall(r'[\S]+(?=hede)|(?<=hede )[\S]+|(?<=hede)[\S]+|[\S]+(?= hede)|[\S]+(?=hede )|(?<= hede)[\S]+', qw)
['haha', 'rara', 'whdhdskhds', 'wekjewhkwqj', 'djfjfj', 'b', 'kdjkdld']
Keep in mind that qw features a single space between the terms. I couldn't help but wonder whether a solution would still be possible if the spacing varied, i.e. if qw had equaled the below:
>>> qw = "hoho hihi haha hede rara a rere titi so whdhdskhdshede wekjewhkwqjhededjfjfj so kjkfdjkdnekjdhide b hede kdjkdld"
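On the spacing question, patterns built only from \S and lookarounds should not care how much whitespace separates the terms. A small sketch under that assumption follows; qw_spaced is a made-up, shortened variant of qw with runs of spaces and a tab, not the exact string from this post.

```python
import re

# Hypothetical variant of qw with irregular spacing (multiple spaces and a tab).
qw_spaced = 'hoho   hihi haha  hede \t rara a whdhdskhdshede   wekjewhkwqjhededjfjfj  b hede   kdjkdld'

# Scenario 1: whole terms that do not contain "hede".
scenario_1 = re.findall(r'(?<!\S)(?!\S*hede)\S+', qw_spaced)

# Scenario 2: same terms, plus the pieces around "hede" inside longer terms.
scenario_2 = re.findall(r'(?:(?<!\S)|(?<=hede))(?:(?!hede)\S)+', qw_spaced)

print(scenario_1)
print(scenario_2)
```

Because nothing in either pattern matches a literal space, the amount and kind of whitespace between terms should make no difference to the results.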
Thank you guys for all of the help.
Also, in every thread I've read, a variation on "^(?!hede).*$" or "^(?!.foo)." has come up for multiline input. Those patterns test a whole line at once rather than individual space-separated terms, so they don't work well for a single-line string like qw. I've tried fooling around with them, but to no avail.
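To illustrate what goes wrong with the whole-line patterns on a single-line string, here is a small sketch. The names whole_line, starts_with_hede and tempered are illustrative, and the third pattern is an assumed variant of the same idea rather than one quoted from a specific thread.

```python
import re

qw = ('hoho hihi haha hede rara a rere titi so whdhdskhdshede '
      'wekjewhkwqjhededjfjfj so kjkfdjkdnekjdhide b hede kdjkdld')

# The lookahead is checked once, at the start of the string. qw does not START
# with "hede", so the entire string comes back as a single match.
whole_line = re.findall(r'^(?!hede).*$', qw)

# If the string does start with "hede", the same pattern returns nothing at all.
starts_with_hede = re.findall(r'^(?!hede).*$', 'hede rara')

# Assumed variant that rejects any line CONTAINING "hede": also all-or-nothing.
tempered = re.findall(r'^(?:(?!hede).)*$', qw)

print(whole_line == [qw], starts_with_hede, tempered)
```

In each case the answer is all-or-nothing for the whole string, which is why these patterns never yield the individual terms.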
Thank you guys so much for the help!