So basically I want to write a function that takes a list of strings, checks whether a particular column contains every one of them, and returns a boolean mask. I can easily do this with a single string, but I'm stumped on how to do it with a list of strings.
# Single String Example
def mask(x, df):
    return df.description.str.contains(x)
df[mask('sql', df)]
# Some kind of example of what I want
def mask(x, df):
    return df.description.str.contains(x[0]) & df.description.str.contains(x[1]) & df.description.str.contains(x[2]) & ...
df[mask(['sql'], df)]
Any help would be appreciated :)
So it looks like I figured out a way to do it. It's a little unorthodox, but it seems to be working anyway. Solution below:
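import numpy as np

def mask(x, df):
    # Multiply the per-string boolean masks element-wise; the product is 1 only where every string matches
    X = np.prod([df.description.str.contains(i) for i in x], axis=0)
    return [True if i == 1 else False for i in X]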
my_selection = df[mask(['sql', 'python'], df)]
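
For what it's worth, a possibly more conventional way to get the same result is to AND the per-string masks together with functools.reduce instead of multiplying them. This is only a sketch: the name mask_all, the column parameter, and the sample data are made up here for illustration, and regex=False / na=False are assumptions that you want literal substring matches and want missing descriptions treated as non-matches.

```python
from functools import reduce
import operator

import pandas as pd


def mask_all(terms, df, column="description"):
    """Return a boolean Series that is True where `column` contains every term."""
    # regex=False treats each term as a literal substring;
    # na=False makes missing values count as "no match" instead of NaN.
    masks = [df[column].str.contains(t, regex=False, na=False) for t in terms]
    # AND the per-term masks together, starting from an all-True Series so
    # that an empty list of terms selects every row.
    all_true = pd.Series(True, index=df.index)
    return reduce(operator.and_, masks, all_true)


# Hypothetical sample data, only for illustration.
df = pd.DataFrame(
    {"description": ["sql and python", "python only", None, "SQL, python"]}
)
my_selection = df[mask_all(["sql", "python"], df)]
```

Note that the match is case-sensitive as written, so "SQL" does not count as "sql". If that matters, lowercasing the column first (df[column].str.lower()) or passing case=False to str.contains should handle it.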