I have a dataframe as follows:

import pandas as pd
df = pd.DataFrame({'category': ['Fishing', 'Refrigeration', 'store'], 'synonyms_text': ['seafood', 'foodlocker', ' food']})

And a list as follows:

list_desc=['FOOD', 'GROWERS', 'INTERNATIONAL']

How can I iterate over the list_desc to create a dynamic regular expression to be used in the dataframe?

for word in list_desc:
    print(word.lower())
    df_tmp= df.loc[df['synonyms_text'].str.contains(r'\bfood\b')]

Here, food has to be replaced by the value of the word variable.

Thanks

John Barton

1 Answer

You can construct the regex dynamically using format(), as in r'\b{0}\b'.format(word). Wrapping the word in re.escape() keeps any regex metacharacters it contains literal.

Example:

import re
for word in list_desc:
    pattern = r'\b{0}\b'.format(re.escape(word.lower()))  # escape regex metacharacters
    df_tmp = df.loc[df['synonyms_text'].str.contains(pattern)]
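
Note that the loop overwrites df_tmp on every pass, so only the result for the last word survives. Here is a self-contained sketch of one way around that, using the data from the question. The names matches, combined and df_any are my own, not from the question, and the combined-pattern variant is an alternative I am suggesting, assuming you only need the rows that match any of the words:

```python
import re

import pandas as pd

df = pd.DataFrame({
    'category': ['Fishing', 'Refrigeration', 'store'],
    'synonyms_text': ['seafood', 'foodlocker', ' food'],
})
list_desc = ['FOOD', 'GROWERS', 'INTERNATIONAL']

# Keep one filtered dataframe per word instead of overwriting df_tmp.
# "matches" is a hypothetical name introduced for this example.
matches = {}
for word in list_desc:
    # re.escape keeps any regex metacharacters in the word literal
    pattern = r'\b{0}\b'.format(re.escape(word.lower()))
    matches[word] = df.loc[df['synonyms_text'].str.contains(pattern)]

# Alternative: a single pattern that matches any of the words.
# The non-capturing group (?:...) applies \b to every alternative.
combined = r'\b(?:{0})\b'.format(
    '|'.join(re.escape(w.lower()) for w in list_desc)
)
df_any = df.loc[df['synonyms_text'].str.contains(combined)]

print(matches['FOOD'])
print(df_any)
```

With this data only the ' food' row matches, because \b requires a word boundary on both sides and 'seafood' and 'foodlocker' contain food only as part of a longer word.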

More info: How to use a variable inside a regular expression?

Rithin Chalumuri