
So I'm used to combining dataframe masks like so: final_mask = mask1 & mask2

But what if I want to combine many masks? For example, the list: [mask1, mask2, mask3, mask4, ..., mask20]

Matt Takao

2 Answers


You can use the solution from the pandas cookbook (last paragraph), which combines the masks with functools.reduce:

import functools

import numpy as np
import pandas as pd

df = pd.DataFrame({'AAA': [4, 5, 6, 7], 'BBB': [10, 20, 30, 40], 'CCC': [100, 50, -30, -50]})
print(df)
   AAA  BBB  CCC
0    4   10  100
1    5   20   50
2    6   30  -30
3    7   40  -50

mask1 = df.AAA <= 5.5
mask2 = df.BBB == 10.0
mask3 = df.CCC > -40.0

masks = [mask1, mask2, mask3]
mask = functools.reduce(lambda x,y: x & y, masks)

print (df[mask])
   AAA  BBB  CCC
0    4   10  100
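
The same reduction can be written without a lambda. This is a minimal sketch, assuming the same example frame as above; operator.and_ is the standard-library function form of the & operator, so the result should match the lambda version.

```python
import functools
import operator

import pandas as pd

df = pd.DataFrame({"AAA": [4, 5, 6, 7],
                   "BBB": [10, 20, 30, 40],
                   "CCC": [100, 50, -30, -50]})

masks = [df.AAA <= 5.5, df.BBB == 10.0, df.CCC > -40.0]

# operator.and_(x, y) is equivalent to x & y, so no lambda is needed.
mask = functools.reduce(operator.and_, masks)

result = df[mask]
print(result)
```

To combine the masks with "or" instead, operator.or_ can be swapped in the same way.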

Another solution, from ayhan's comment, works with 1D masks (each mask is a Series):

mask = np.logical_and.reduce(masks)

print (df[mask])
   AAA  BBB  CCC
0    4   10  100
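
A pandas-only alternative for 1D masks, not taken from the answer itself, is to place the masks side by side and require every column to be True. This sketch assumes all masks are boolean Series sharing the frame's index.

```python
import pandas as pd

df = pd.DataFrame({"AAA": [4, 5, 6, 7],
                   "BBB": [10, 20, 30, 40],
                   "CCC": [100, 50, -30, -50]})

masks = [df.AAA <= 5.5, df.BBB == 10.0, df.CCC > -40.0]

# Stack the masks as columns (aligned on the index), then keep only
# the rows where every column is True.
mask = pd.concat(masks, axis=1).all(axis=1)

print(df[mask])
```

Unlike the NumPy version, the result here stays a pandas Series with the original index.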

As ayhan pointed out, the first solution also works with 2D masks (each mask is a DataFrame), whereas np.logical_and.reduce raises an error:

mask1 = df <= 5.5
mask2 = df < 1.0
mask3 = df > -40.0

masks = [mask1, mask2, mask3]
mask = functools.reduce(lambda x,y: x & y, masks)
print (mask)
     AAA    BBB    CCC
0  False  False  False
1  False  False  False
2  False  False   True
3  False  False  False

mask = np.logical_and.reduce(masks)
print (mask)

ValueError: cannot copy sequence with size 4 to array axis with dimension 3
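
If a NumPy reduction is still wanted for 2D masks, one possible workaround is to convert each mask to a plain array first and re-attach the labels afterwards. This is a sketch under the assumption that all masks have the same shape and the same row and column order, since converting to arrays drops label alignment.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"AAA": [4, 5, 6, 7],
                   "BBB": [10, 20, 30, 40],
                   "CCC": [100, 50, -30, -50]})

masks = [df <= 5.5, df < 1.0, df > -40.0]

# Each DataFrame mask becomes a 2D boolean array; the list stacks into
# a 3D array and reduce() ANDs along the first axis.
combined = np.logical_and.reduce([m.to_numpy() for m in masks])

# Re-attach the original row and column labels.
mask = pd.DataFrame(combined, index=df.index, columns=df.columns)
print(mask)
```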

jezrael

Probably the easiest way is to use a for loop:

final_mask = mask1
for mask in [mask2, mask3, mask4]:
    final_mask = final_mask & mask

Note that, although this approach is easy to understand, it may not be considered the most "pythonic". Using reduce, as others have pointed out, makes the code shorter but may be harder for novices to read.
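
The loop can also be wrapped in a small helper so it accepts a list of any length. This is only a sketch: combine_masks is a hypothetical name, and the example frame is borrowed from the other answer.

```python
import pandas as pd


def combine_masks(masks):
    """AND together a non-empty sequence of boolean masks (hypothetical helper)."""
    masks = list(masks)
    if not masks:
        raise ValueError("need at least one mask")
    final_mask = masks[0]
    for mask in masks[1:]:
        final_mask = final_mask & mask
    return final_mask


df = pd.DataFrame({"AAA": [4, 5, 6, 7],
                   "BBB": [10, 20, 30, 40],
                   "CCC": [100, 50, -30, -50]})

final_mask = combine_masks([df.AAA <= 5.5, df.BBB == 10.0, df.CCC > -40.0])
print(df[final_mask])
```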

Matheus Portela