I'm filtering a column with a regex that checks whether any phrase from a list appears in the text field:
phrase = ["email was deleted", "click on link"]  # etc.
df['text'].str.contains(r'\b(?:{})\b'.format('|'.join(sorted(phrase, key=len, reverse=True))), case=False, regex=True)
However, I'd now like to exclude any match that is immediately preceded by one of a list of negating phrases/words:
neg_phrases = ["did not", "not", "no"]
So I would expect a row containing "Steve told Mary the email was deleted" anywhere in the text to be in the output, but a row containing "Steve told Mary no email was deleted" should not be. I'm just having trouble working out how to add the negative lookbehind.
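
For reference, here is a minimal sketch of one way the lookbehind could be worked in. Python's re module only accepts fixed-width lookbehinds, so a single lookbehind containing "did not|not|no" would be rejected; the sketch instead chains one negative lookbehind per negating phrase. The names lookbehinds, alternatives, pattern and is_match are hypothetical, and the sketch assumes exactly one whitespace character separates the negating word from the phrase.

```python
import re

phrase = ["email was deleted", "click on link"]
neg_phrases = ["did not", "not", "no"]

# One fixed-width negative lookbehind per negating phrase. Each one asserts
# that the text just before the match is NOT "<neg phrase><one whitespace>".
lookbehinds = "".join(r"(?<!\b{}\s)".format(re.escape(n)) for n in neg_phrases)

# Longest phrases first, as in the original pattern; re.escape guards against
# regex metacharacters appearing in a phrase.
alternatives = "|".join(
    re.escape(p) for p in sorted(phrase, key=len, reverse=True)
)

pattern = r"{}\b(?:{})\b".format(lookbehinds, alternatives)


def is_match(text):
    """Return True if a phrase occurs that is not directly preceded by a negation."""
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


print(is_match("Steve told Mary the email was deleted"))
print(is_match("Steve told Mary no email was deleted"))
```

The same pattern string should be usable in place of the original one, i.e. df['text'].str.contains(pattern, case=False, regex=True). Note that a row is still kept if it contains at least one non-negated occurrence of a phrase, and a negation separated by more than one whitespace character (or by other words) is not caught.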