I have a list of strings:

lista = ['Word', 'Apple', 'Banana']

How can I convert this into a regex of the form:

reg = r"\b(word1|word2|word3)\b"

My goal is to speed up my function (applied to a dataframe):

df['A'] = df['A'].replace(reg, r'\XXX', regex=True)
PV8
  • `regex = r'\b(' + "|".join(list) + r')\b'`. But it's not a good idea to name your list `list`, as that overwrites Python's built-in `list` – Chris Doyle Oct 29 '19 at 12:34
  • Thanks. With that I get the error: multiple repeat at position , so do I have to use `regex = re.escape(..)`? – PV8 Oct 29 '19 at 12:37
  • Sorry, I don't get what you mean. When I test with `print(re.findall(regex, "This is an Apple tree and not a Banana tree"))` I get the output `['Apple', 'Banana']`, so the regex works – Chris Doyle Oct 29 '19 at 12:42
  • Yeah, but this is a sample dataset. If I have different data in it... – PV8 Oct 29 '19 at 12:44
  • I can only work off the sample you provided – Chris Doyle Oct 29 '19 at 12:45
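
A "multiple repeat" error like the one mentioned in the comments usually means that some word in the list contains a regex metacharacter such as `+` or `*`. The following is a minimal sketch of one way to handle that, not a definitive answer: it escapes each word separately with `re.escape` before joining, rather than escaping the whole pattern. The helper name `build_pattern`, the sample sentence, and the plain `'XXX'` replacement are assumptions made for illustration.

```python
import re

def build_pattern(words):
    """Join words into one alternation; build_pattern is a hypothetical helper."""
    # Escape each word on its own so that characters such as + or * are
    # matched literally instead of being read as regex operators.
    escaped = (re.escape(word) for word in words)
    # (?:...) groups the alternatives without creating a capture group.
    return r'\b(?:' + '|'.join(escaped) + r')\b'

lista = ['Word', 'Apple', 'Banana']
pattern = build_pattern(lista)

# Sample text taken from the comment thread; 'XXX' is an assumed replacement.
text = "This is an Apple tree and not a Banana tree"
result = re.sub(pattern, 'XXX', text)
print(pattern)
print(result)
```

The same `pattern` string could then be passed to the call from the question, for example `df['A'].replace(pattern, 'XXX', regex=True)`. Escaping the whole joined string instead would also escape the `|` separators, so the alternation would no longer work.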
