I am using Python and regex to extract, from a list of tweets, every sentence that contains any of the words stored in a Series of a pandas DataFrame.
My DataFrame stocks_df contains stock symbols, e.g.:
Symbol
0 $GSX
1 $NVDA
2 $MBRX
5 $BBBY
6 $DIS
I want all sentences in the tweets that contain these strings. My attempted solution follows another regex question I had: Key error when using regex quantifier python
However, my solution mostly grabs sentences with the symbol at the start of the sentence and doesn't grab the whole sentence if the symbol is in the middle. It also sometimes seems to match only the symbols without the rest of the sentence. My code is as follows:
pattern2 = r'(?:{}) (?:[^.]*[^.]*\.)'.format("|".join(map(re.escape, stocks_df['Symbol'])))
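A minimal reproduction of the behaviour, as a sketch only: the two tweets are made up for illustration, and a plain list named symbols (hypothetical) stands in for stocks_df['Symbol'] so the snippet runs without pandas.

```python
import re

# Assumption: a plain list standing in for stocks_df['Symbol'].
symbols = ["$GSX", "$NVDA", "$MBRX", "$BBBY", "$DIS"]

# Assumption: made-up tweets, not the real data.
tweets = [
    "$NVDA is up today. Nothing else to report.",
    "I think $DIS will recover soon. Time will tell.",
]

# Same pattern as in the question, built from the list instead of the Series.
pattern2 = r'(?:{}) (?:[^.]*[^.]*\.)'.format("|".join(map(re.escape, symbols)))

# Collect every match per tweet; with no capturing groups,
# re.findall returns the full matched text.
matches = [re.findall(pattern2, tweet) for tweet in tweets]
print(matches)
# [['$NVDA is up today.'], ['$DIS will recover soon.']]
```

The second tweet shows the problem: the match starts at the symbol, so the words before it ("I think") are missing from the result.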
Does anyone understand why full sentences are not being matched?