I have a dataframe and want to select all rows where one of the columns contains a certain string:
tmp = data_frame[data_frame["DESC"].str.contains(tag, na=False)]
Now suppose that tag is a list, and I want the rows where the column contains any of the strings in the list, for example:
tmp = data_frame[(data_frame["DESC"].str.contains(tag[0], na=False)) | (data_frame["DESC"].str.contains(tag[1], na=False))]
Next, suppose that I have a list of lists, tag_list, where each element tag is itself a list, and I loop over it:
for tag in tag_list:
    tmp = data_frame[(data_frame["DESC"].str.contains(tag[0], na=False)) | (data_frame["DESC"].str.contains(tag[1], na=False))]
    # do something with tmp
Finally, suppose that the elements of tag_list can have different lengths, so tag sometimes has 1 element, sometimes 2, sometimes 4, and so on. How can I define tmp so that it does not depend on tag having a fixed length?
Example:
import pandas
tmp = pandas.DataFrame(columns=["DESC"])
tmp.loc[0] = ["Hello"]
tmp.loc[1] = ["Hello"]
tmp.loc[2] = ["Hi"]
tmp.loc[3] = ["Good Morning"]
tag = ["Hi", "Hello"]
tmp2 = tmp[(tmp["DESC"].str.contains(tag[0], na=False)) | (tmp["DESC"].str.contains(tag[1], na=False))]
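For reference, here is a minimal sketch of one way the filter could be made independent of the length of tag: join the strings into a single regular expression with "|" and pass that to str.contains. The helper name rows_matching_any and the sample tag_list are hypothetical, made up for illustration, and the sketch assumes the tags should be matched literally (hence re.escape) rather than as regex patterns.

```python
import re

import pandas as pd

# Same sample data as in the example above.
tmp = pd.DataFrame({"DESC": ["Hello", "Hello", "Hi", "Good Morning"]})

# Hypothetical tag_list whose elements have different lengths.
tag_list = [["Hi", "Hello"], ["Good"], ["Hi", "Morning", "Bye", "Night"]]


def rows_matching_any(frame, tags, column="DESC"):
    """Return the rows of `frame` whose `column` contains any string in `tags`."""
    if not tags:
        # An empty pattern would match every row, so return no rows instead.
        return frame.iloc[0:0]
    # Escape each tag so it is matched literally, then OR them together.
    pattern = "|".join(re.escape(t) for t in tags)
    return frame[frame[column].str.contains(pattern, na=False)]


results = []
for tag in tag_list:
    tmp2 = rows_matching_any(tmp, tag)
    results.append(tmp2)  # do something with tmp2
```

Because the pattern is built from however many strings tag holds, the same line works for 1, 2, or 4 elements. If the tags are meant to be regular expressions themselves, the re.escape call would need to be dropped.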