How to take a list of items and create a condition using them all

Question

I want to write a function that takes a list of strings, checks whether a particular column contains every one of them, and returns a boolean mask. I can easily do this with a single string, but I'm stumped on how to do it for a list of strings.

# Single String Example
def mask(x, df):
    return df.description.str.contains(x)
df[mask('sql', df)]

# Some kind of example of what I want
def mask(x, df):
    return df.description.str.contains(x[0]) & df.description.str.contains(x[1]) & df.description.str.contains(x[2]) & ...
df[mask(['sql'], df)]

Any help would be appreciated :)

It looks like I figured out a way to do it. It's a little unorthodox, but it seems to work. Solution below:

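import numpy as np

def mask(x, df):
    # Multiply the boolean masks elementwise: the product is 1 only where every string matches
    X = np.prod([df.description.str.contains(i) for i in x], axis = 0)
    return [True if i == 1 else False for i in X]
my_selection = df[mask(['sql', 'python'], df)]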

Possible duplicate of [pandas dataframe str.contains() AND operation](https://stackoverflow.com/questions/37011734/pandas-dataframe-str-contains-and-operation) — Chris, Jul 29 '19 at 02:06

score 1 · Answer 1 · answered Jul 29 '19 at 02:02

Try using:

def mask(x, df):
    return df.description.str.contains(''.join(map('(?=.*%s)'.__mod__, x)))
df[mask(['a', 'b'], df)]

Chaining (?=.*<word>) lookaheads one after another acts as an AND operator: the regex matches only if every lookahead finds its word.
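As a rough sketch of how the lookahead pattern is assembled, the version below builds it with a plain generator expression and passes each term through `re.escape` so that characters such as `+` are matched literally. The helper name `build_pattern` and the sample DataFrame are made up for illustration; escaping is an addition that the answer above does not do.

```python
import re
import pandas as pd

def build_pattern(words):
    # One lookahead per word; re.escape makes regex metacharacters literal
    return ''.join('(?=.*{})'.format(re.escape(w)) for w in words)

def mask(words, df):
    # True only for rows where every lookahead finds its word
    return df.description.str.contains(build_pattern(words))

# Hypothetical sample data
df = pd.DataFrame({'description': ['c++ and sql', 'sql only', 'c and sql']})
selection = df[mask(['c++', 'sql'], df)]
print(build_pattern(['a', 'b']))  # (?=.*a)(?=.*b)
```

Without the escaping step, a term like `c++` would be read as a regex rather than as literal text.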

answered Jul 29 '19 at 02:02

U13-Forward

Not sure why, but it's giving a slightly different answer than when I do it manually – H K Jul 29 '19 at 02:14
@HK Strange, how come? – U13-Forward Jul 29 '19 at 02:14
@HK Can you try it again? – U13-Forward Jul 29 '19 at 02:15
X = df.description.str.contains('sql'), Y = df.description.str.contains('python'), (X&Y).sum() --- That gives me 31 rows, whereas the function for some reason gives 27 rows – H K Jul 29 '19 at 02:18
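
One possible source of a mismatch like this, offered only as a guess since the actual data isn't shown: in Python regexes `.` does not match a newline by default, so a `(?=.*word)` lookahead cannot see a word that sits on a later line of a multi-line description. The toy DataFrame below is hypothetical and just reproduces that effect; passing `flags=re.DOTALL` is one way to make the two approaches agree on it.

```python
import re
import pandas as pd

# Toy data (hypothetical): the third row has its keywords on separate lines
df = pd.DataFrame({'description': ['sql and python', 'python only', 'sql\npython']})

# Manual AND of two separate masks
manual = df.description.str.contains('sql') & df.description.str.contains('python')

# Chained lookaheads: '.' stops at a newline by default
pattern = '(?=.*sql)(?=.*python)'
lookahead = df.description.str.contains(pattern)

# re.DOTALL lets '.' match newlines as well
lookahead_dotall = df.description.str.contains(pattern, flags=re.DOTALL)

print(manual.sum(), lookahead.sum(), lookahead_dotall.sum())  # 2 1 2
```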

score 0 · Accepted Answer · answered Jul 29 '19 at 04:13

Managed to work out a solution here:

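import numpy as np

def mask(x, df):
    # Multiply the boolean masks elementwise: the product is 1 only where every string matches
    X = np.prod([df.description.str.contains(i) for i in x], axis = 0)
    return [True if i == 1 else False for i in X]
mine = df[mask(['sql', 'python'], df)]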

It's a little unorthodox, so if anyone has anything better, it would be appreciated.
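
For comparison, here is a minimal sketch of the same idea that folds the per-string masks together with `&` instead of multiplying them. It is not from the thread; the sample DataFrame is hypothetical, and `na=False` is an added assumption that rows with a missing description should count as non-matches.

```python
from functools import reduce
import operator
import pandas as pd

def mask(words, df):
    # One boolean Series per search term; na=False treats missing text as no match
    masks = [df.description.str.contains(w, na=False) for w in words]
    # Fold the Series together with elementwise AND
    return reduce(operator.and_, masks)

# Hypothetical sample data
df = pd.DataFrame({'description': ['sql and python', 'python only', None, 'sql\npython']})
selection = df[mask(['sql', 'python'], df)]
```

This returns a boolean Series rather than a list, so it can be combined with other conditions directly. Note that `reduce` raises a `TypeError` if `words` is empty, so an empty list would need separate handling.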
