So I'm used to combining dataframe masks like so:
final_mask = mask1 & mask2
But what if I want to combine many masks? For example, the list:
[mask1, mask2, mask3, mask4, ..., mask20]
You can use the solution from the pandas cookbook (the last paragraph, which uses reduce):
import functools
import numpy as np
import pandas as pd

df = pd.DataFrame({'AAA': [4, 5, 6, 7], 'BBB': [10, 20, 30, 40], 'CCC': [100, 50, -30, -50]})
print(df)
   AAA  BBB  CCC
0    4   10  100
1    5   20   50
2    6   30  -30
3    7   40  -50
mask1 = df.AAA <= 5.5
mask2 = df.BBB == 10.0
mask3 = df.CCC > -40.0
masks = [mask1, mask2, mask3]
mask = functools.reduce(lambda x, y: x & y, masks)
print(df[mask])
   AAA  BBB  CCC
0    4   10  100
Another solution, from ayhan's comment, works with 1D masks (the masks are Series):
mask = np.logical_and.reduce(masks)
print(df[mask])
   AAA  BBB  CCC
0    4   10  100
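One detail worth knowing about the NumPy version: as far as I can tell, np.logical_and.reduce on a list of Series gives back a plain NumPy array, so the index labels are dropped. The sketch below re-attaches them; it assumes every mask has the same index in the same order, because NumPy combines the values by position only. The name `combined` is just for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'AAA': [4, 5, 6, 7],
                   'BBB': [10, 20, 30, 40],
                   'CCC': [100, 50, -30, -50]})

masks = [df.AAA <= 5.5, df.BBB == 10.0, df.CCC > -40.0]

# Reduce the list of 1D masks with NumPy (values are combined by position).
combined = np.logical_and.reduce(masks)

# Re-attach the original index so the result is a labelled boolean Series.
mask = pd.Series(np.asarray(combined), index=df.index, name='mask')

print(df[mask])
```

If the masks could have different indexes, the reduce with & from the first solution is probably the safer choice.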
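As ayhan pointed out, the first solution also works with 2D masks (boolean DataFrames), while np.logical_and.reduce fails on them, as the error at the end shows: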
mask1 = df <= 5.5
mask2 = df < 1.0
mask3 = df > -40.0
masks = [mask1, mask2, mask3]
mask = functools.reduce(lambda x, y: x & y, masks)
print(mask)
     AAA    BBB    CCC
0  False  False  False
1  False  False  False
2  False  False   True
3  False  False  False
mask = np.logical_and.reduce(masks)
print (mask)
ValueError: cannot copy sequence with size 4 to array axis with dimension 3
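If you still want the NumPy route for 2D masks, one possible workaround (not from the cookbook, just a sketch) is to convert each DataFrame mask to a plain array first, so NumPy never has to interpret a DataFrame as a sequence. This assumes all masks share the same index and columns in the same order; the names `stacked` and `mask_2d` are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'AAA': [4, 5, 6, 7],
                   'BBB': [10, 20, 30, 40],
                   'CCC': [100, 50, -30, -50]})

masks = [df <= 5.5, df < 1.0, df > -40.0]

# Convert each DataFrame mask to a 2D boolean array and stack them,
# giving an array of shape (n_masks, n_rows, n_cols).
stacked = np.array([m.to_numpy() for m in masks])

# AND along axis 0 (the mask axis), then restore the row and column labels.
mask_2d = pd.DataFrame(np.logical_and.reduce(stacked, axis=0),
                       index=df.index, columns=df.columns)

print(mask_2d)
```

The result should match what functools.reduce with & produces for the same masks.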
Probably the easiest way is to use a for loop:
final_mask = mask1
for mask in [mask2, mask3, mask4]:
    final_mask = final_mask & mask
Note that although this is easy to understand, it may not be considered the most "pythonic" way. Using reduce, as others have pointed out, makes your code shorter but may be harder for novices to read.
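One way to get the brevity of reduce while keeping the call site readable might be to hide it behind a small named function. The helper name `combine_masks` below is hypothetical (it is not a pandas function), and operator.and_ is simply the function form of the & operator, used here instead of the lambda.

```python
import functools
import operator

import pandas as pd


def combine_masks(masks):
    """AND together a non-empty sequence of boolean masks (hypothetical helper)."""
    # operator.and_(x, y) is equivalent to x & y, so this folds the list
    # into mask1 & mask2 & ... & maskN.
    return functools.reduce(operator.and_, masks)


df = pd.DataFrame({'AAA': [4, 5, 6, 7],
                   'BBB': [10, 20, 30, 40],
                   'CCC': [100, 50, -30, -50]})

masks = [df.AAA <= 5.5, df.BBB == 10.0, df.CCC > -40.0]
final_mask = combine_masks(masks)

print(df[final_mask])
```

Note that reduce raises a TypeError on an empty list when no initial value is given, so this sketch assumes at least one mask is passed in.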