I've got a dataframe that consists of two columns: one for the topic and the other for an utterance. The utterances are something like "play music", "play Madonna" or "listen Michael Jackson". I have a list of artist names and want to check whether a cell of the dataframe contains one of them.
For one-word names I have this solution (I used spaCy for the NLP processing):
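for row in range(nrows):
    text = df.loc[row, 'utt']
    words = nlp(text)  # tokenize the utterance with spaCy
    for word in words:
        # Accumulate replacements in `text` so a later match does not undo an earlier one
        if word.text in artists:
            text = text.replace(word.text, format_artist(word.text))
        if word.text in albums:
            text = text.replace(word.text, format_album(word.text))
    # Assign with a single .loc call; df.loc[row]['utt'] = ... may write to a copy
    df.loc[row, 'utt'] = text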
If a word is an artist name or album title, it is replaced with a differently formatted version.
The problem is that this doesn't recognise multi-word names such as "michael jackson", because the check is done word by word.
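To make the problem reproducible without the full dataframe, here is a minimal sketch. The sample list and the helper name `token_matches` are made up for illustration, and `str.split()` stands in for spaCy's tokenizer, so the details may differ from the real pipeline:

```python
# Hypothetical sample data; the real artist list is much longer.
artists = ["madonna", "michael jackson"]


def token_matches(utterance, names):
    """Return the tokens of `utterance` that appear in `names`.

    Mirrors the word-by-word check in the loop above. str.split() is an
    assumed stand-in for spaCy tokenization.
    """
    return [token for token in utterance.split() if token in names]


# One-word name: the single token equals a list entry, so it is found.
print(token_matches("play madonna", artists))

# Two-word name: neither "michael" nor "jackson" alone is in the list,
# so nothing is found.
print(token_matches("listen michael jackson", artists))
```

The first call finds the one-word name, while the second returns an empty list even though the full name is in the artist list.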
Thanks for the help!