Something is wrong with the code I am using to select all the rows that contain at least one of the strings in the following list:
search_query = ['great game', 'gran game']
filtered_query = df[(df['Text'].str.lower().str.contains("|", search_query)) | (df['Low_Content'].str.contains("|", search_query))]
filtered_query.drop_duplicates(subset=["User", "Low_Content"], keep=False, inplace=True)
The code above should keep every row that contains at least one of the two strings in the list. For example, given this data:
User  Text                              Low_Content
432   Great game!I liked it             We played yesterday
34    Good game, man.                   I like this sport
412   We played a GREAT GAME yesterday  Gran game!!!
The code should select only these rows:
User  Text                              Low_Content
432   Great game!I liked it             We played yesterday   # Text contains "great game"
412   We played a GREAT GAME yesterday  Gran game!!!          # Text contains "great game" and Low_Content contains "gran game"
I am not interested in matching great or game on their own: I want to match the whole phrase great game (and likewise gran game).
The code above seems to select rows if they contain one of the two words, rather than one of the two full strings.
I would appreciate your help. Thanks.
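For reference, here is a minimal sketch of the behaviour I am after. It is not a confirmed fix. The DataFrame is rebuilt by hand from the sample rows above, so the way I split each row into Text and Low_Content is an assumption. My understanding is that str.contains takes a single pattern as its first argument, so in my code "|" ends up as the pattern and search_query is passed as the next positional parameter. Joining the list into one alternation pattern and using case=False seems to give the rows I expect:

```python
import re

import pandas as pd

# Sample data rebuilt from the rows above (the column split is an assumption).
df = pd.DataFrame({
    "User": [432, 34, 412],
    "Text": [
        "Great game!I liked it",
        "Good game, man.",
        "We played a GREAT GAME yesterday",
    ],
    "Low_Content": [
        "We played yesterday",
        "I like this sport",
        "Gran game!!!",
    ],
})

search_query = ["great game", "gran game"]

# Build one regex that matches any of the full phrases.
# re.escape guards against regex metacharacters inside a phrase.
pattern = "|".join(re.escape(q) for q in search_query)

# case=False makes the match case-insensitive; na=False treats missing text as "no match".
mask = (
    df["Text"].str.contains(pattern, case=False, na=False)
    | df["Low_Content"].str.contains(pattern, case=False, na=False)
)

# Assign the result instead of using inplace=True on a filtered slice.
# keep=False mirrors the original call: every duplicated row is dropped.
filtered_query = df[mask].drop_duplicates(subset=["User", "Low_Content"], keep=False)

print(filtered_query)
```

With the sample data, this keeps users 432 and 412 and leaves out user 34, whose text contains game but neither full phrase.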