I am looking for a way to check whether any column in a subset of DataFrame columns contains any string from a list of strings. Is there a better way to do this than using apply?
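import pandas as pd

df = pd.DataFrame({'col1': ['cat', 'dog', 'mouse'],
                   'col2': ['car', 'bike', '']})

def check_data(row, cols, strings):
    # With axis=1, apply passes one row at a time.
    # Return 1 as soon as any selected column matches one of the strings.
    for j in cols:
        if row[j] in strings:
            return 1
    # No selected column matched.
    return 0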
df['answer'] = df.apply(check_data, cols=['col1'], strings=['dog', 'cat'], axis=1)
Output:
col1   col2  answer
cat    car   1
dog    bike  1
mouse        0
df['answer'] = df.apply(check_data, cols=['col1', 'col2'], strings=['bike', 'mouse'], axis=1)
Output2:
col1   col2  answer
cat    car   0
dog    bike  1
mouse        1
This gives the desired output, but is there a better, more Pythonic way to do this without applying the function to each row of the data? Thanks!
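For comparison, here is a minimal sketch of one possible vectorized direction. It assumes exact cell matches are wanted (as the `in strings` test implies) rather than substring matches. The helper name `check_data_vectorized` is hypothetical; it combines `DataFrame.isin` on the column subset with `any(axis=1)`.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['cat', 'dog', 'mouse'],
                   'col2': ['car', 'bike', '']})

def check_data_vectorized(df, cols, strings):
    # Hypothetical helper, not part of pandas.
    # Boolean frame: True where a cell in the selected columns equals one of the strings.
    matches = df[cols].isin(strings)
    # Collapse across columns: 1 if any selected column matched in that row, else 0.
    return matches.any(axis=1).astype(int)

answer1 = check_data_vectorized(df, ['col1'], ['dog', 'cat'])
answer2 = check_data_vectorized(df, ['col1', 'col2'], ['bike', 'mouse'])
```

If substring matching were intended instead, `isin` would not be enough and a per-column string method such as `Series.str.contains` would likely be needed; that variant is not shown here.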