I have a text that looks like an email body, as follows.
To: Abc Cohen <abc.cohen@email.com> Cc: <braggis.mathew@nomail.com>,<samanth.castillo@email.com> Hi
Abc, I happened to see your report. I have not seen any abnormalities and thus I don't think we
should proceed to Braggis. I am open to your thought as well. Regards, Abc On Tue 23 Jul 2017 07:22
PM Tony Stark wrote:
I also have two lists of keywords, as follows.
no_wds = ["No","don't","Can't","Not"]
yes_wds = ["Proceed","Approve","May go ahead"]
Objective:
I want to search the text string given above and, if any of the keywords listed above are present, extract the strings between those keywords. In this case, the keywords Not and don't are matched from no_wds, and the keyword Proceed is matched from the yes_wds list. So the text I want extracted is the following list:
txt = ["seen any abnormalities and thus I don't think we should", "think we should"]
My approach:
I have tried
re.findall(r'{}(.*){}'.format(re.escape('|'.join(no_wds)),re.escape('|'.join(yes_wds))),text,re.I)
Or
text_f = []
for i in no_wds:
    for j in yes_wds:
        t = re.findall(r'{}(.*){}'.format(re.escape(i),re.escape(j)),text, re.I)
        text_f.append(t)
Neither attempt gave a suitable result. I then tried the str.find() method, also without success.
I tried to get a clue from here.
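A likely reason the first regex attempt returns nothing is that re.escape is applied to the already joined string, so the "|" separators are escaped too and the pattern looks for the literal text "No|don't|Can't|Not". In the loop version, the keywords have no word boundaries, so "No" can match inside words such as "nomail", and the greedy (.*) runs to the last possible end keyword. The following is a minimal sketch of one possible fix, not a definitive solution. It assumes that keywords should match only as whole words, ignoring case, and that each extracted string should stop at the nearest following yes_wds keyword. The helper name alternation is hypothetical.

```python
import re

text = (
    "To: Abc Cohen <abc.cohen@email.com> Cc: <braggis.mathew@nomail.com>,"
    "<samanth.castillo@email.com> Hi Abc, I happened to see your report. "
    "I have not seen any abnormalities and thus I don't think we should "
    "proceed to Braggis. I am open to your thought as well. Regards, Abc "
    "On Tue 23 Jul 2017 07:22 PM Tony Stark wrote:"
)
no_wds = ["No", "don't", "Can't", "Not"]
yes_wds = ["Proceed", "Approve", "May go ahead"]

# The original first attempt: the escaped "|" is treated as literal text,
# so nothing in the sample text can match.
broken = re.findall(
    r'{}(.*){}'.format(re.escape('|'.join(no_wds)), re.escape('|'.join(yes_wds))),
    text,
    re.I,
)


def alternation(words):
    # Hypothetical helper: escape each keyword on its own, then join the
    # escaped keywords with an unescaped "|" so it acts as alternation.
    return "|".join(re.escape(w) for w in words)


# The lookahead (?=...) makes every match zero-width, so a later start keyword
# ("don't") can still be found inside the span of an earlier one ("not").
# \b restricts keywords to whole words; .*? stops at the nearest end keyword.
pattern = re.compile(
    r"(?=\b(?:{})\b(.*?)\b(?:{})\b)".format(alternation(no_wds), alternation(yes_wds)),
    re.I | re.S,
)

txt = [m.group(1).strip() for m in pattern.finditer(text)]
print(broken)
print(txt)
```

The lookahead is a design choice made so that overlapping spans are reported; a plain re.findall without it would consume the text after "not" and never report the shorter span that starts at "don't".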
Can anybody help me solve this? I am keen to see a non-regex solution, as regex is not always a good fit. That said, a regex-based solution that iterates over the lists is also welcome.
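For the non-regex route, one option is to build on str.find with a manual whole-word check. The sketch below is only an illustration under some assumptions: matching is case-insensitive, a keyword counts only when it is not surrounded by letters or digits, the text is plain ASCII (so lowercasing does not shift character positions), and each span ends at the nearest following yes_wds keyword. The function names find_word and extract_between are hypothetical.

```python
def find_word(haystack, word, start=0):
    """Return the index of `word` as a whole word at or after `start`, or -1."""
    pos = haystack.find(word, start)
    while pos != -1:
        end = pos + len(word)
        # A hit counts only if the neighbouring characters are not alphanumeric.
        before_ok = pos == 0 or not haystack[pos - 1].isalnum()
        after_ok = end == len(haystack) or not haystack[end].isalnum()
        if before_ok and after_ok:
            return pos
        pos = haystack.find(word, pos + 1)
    return -1


def extract_between(text, start_words, end_words):
    """Collect the text between each start keyword and the nearest end keyword."""
    lowered = text.lower()  # assumes lowercasing keeps character positions

    # Record (start, end) positions of every whole-word start keyword.
    hits = []
    for word in start_words:
        w = word.lower()
        pos = find_word(lowered, w)
        while pos != -1:
            hits.append((pos, pos + len(w)))
            pos = find_word(lowered, w, pos + 1)

    results = []
    for _, after_start in sorted(hits):
        # Find the closest end keyword that follows this start keyword.
        ends = [find_word(lowered, e.lower(), after_start) for e in end_words]
        ends = [e for e in ends if e != -1]
        if ends:
            results.append(text[after_start:min(ends)].strip())
    return results


text = (
    "To: Abc Cohen <abc.cohen@email.com> Cc: <braggis.mathew@nomail.com>,"
    "<samanth.castillo@email.com> Hi Abc, I happened to see your report. "
    "I have not seen any abnormalities and thus I don't think we should "
    "proceed to Braggis. I am open to your thought as well. Regards, Abc "
    "On Tue 23 Jul 2017 07:22 PM Tony Stark wrote:"
)
no_wds = ["No", "don't", "Can't", "Not"]
yes_wds = ["Proceed", "Approve", "May go ahead"]

txt = extract_between(text, no_wds, yes_wds)
print(txt)
```

Without the whole-word check, "No" would also be found inside "nomail" and "abnormalities", which is one plausible reason a plain str.find() attempt gives unexpected spans.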