
Hey guys, I am searching for a fast, efficient way to extract keywords (defined in a list) from a string (in a DataFrame column) without being case-sensitive or dependent on " " characters:

keys = ['I', 'love', 'Cookies']

String from df= "xxxxxxxxIxx xx cookies"

The result should be either ['I'] or ['I', 'Cookies'].

I am currently using f"({'|'.join(keys)})", which is case-sensitive. What would you recommend for long strings in even longer DataFrames? :)

Thanks in advance

J CB
  • `"xxxxxxxxIxx xx cookies"` What do the x's represent here? – John Gordon Dec 03 '22 at 18:02
  • more or less random chars – J CB Dec 03 '22 at 18:10
  • So if they enter "calamity", you want to (somehow) know that the "i" is significant, and not all the other letters? I'm not sure that's possible... – John Gordon Dec 03 '22 at 18:18
  • Okay, my bad: let's say that the keywords are well chosen and much longer than 1 char! The result should be either the first detected key from the list or all detected keys. – J CB Dec 03 '22 at 18:26

1 Answer


Working code as per your inputs:

my_str = "xxxxxxxixxx xx cookhes"
my_list = ["I", "love", "Cookies"]
# casefold() both sides so the substring test ignores case
if any(substring.casefold() in my_str.casefold() for substring in my_list):
    print('Contains element')
else:
    print('Does not contain any element.')

More info in the following StackOverflow answer: Case insensitive 'in'
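The check above only reports whether any key is present. If the goal is to get the matching keys back, as the question asks, one possible sketch uses a compiled case-insensitive regex. The helper name `find_keywords` is hypothetical, and the sketch assumes a non-empty list of keys; each key is escaped so it is matched literally, and longer keys are tried first so a short key does not shadow a longer one.

```python
import re

def find_keywords(text, keys):
    """Return the keys found in text, ignoring case, in order of first appearance.

    Hypothetical helper for illustration; assumes keys is non-empty.
    """
    # Longest keys first, so a longer key wins over a shorter one at the same position.
    ordered = sorted(keys, key=len, reverse=True)
    # One capturing group per key; re.escape makes each key match literally.
    pattern = re.compile(
        "|".join(f"({re.escape(k)})" for k in ordered),
        re.IGNORECASE,
    )
    found = []
    for match in pattern.finditer(text):
        # lastindex is the number of the group that matched, which maps back to the key.
        key = ordered[match.lastindex - 1]
        if key not in found:
            found.append(key)
    return found

keys = ["I", "love", "Cookies"]
print(find_keywords("xxxxxxxxIxx xx cookies", keys))  # ['I', 'Cookies']
```

Taking the first element of the returned list gives the "first detected key" variant. For a DataFrame column, the same kind of pattern could probably be passed to pandas' `Series.str.findall(pat, flags=re.IGNORECASE)`, which avoids a Python-level loop over rows; whether that is fast enough for very long frames would need to be measured.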

mago