Compare Boolean Row values across multiple Columns in Pandas using & / np.where() / np.any()

Question

I have a dataframe that looks like:

   a A  a B  a C  a D  a E  a F  p A  p B  p C  p D  p E  p F
0    0    0    0    0    0    0    0    0    0    0    0    0
1    1    0    0    0    0    0    0    0    0    0    0    0
2    0    1    0    0    0    0    0    0    1    0    0    0
3    0    0    1    0    0    1    0    0    0    0    0    0
4    0    0    0    1    0    1    0    0    0    0    0    0
5    0    0    0    0    1    0    0    0    0    0    0    0
6    0    0    0    0    0    0    1    0    0    0    0    0

import pandas as pd
df = pd.DataFrame({'a A':[0,1,0,0,0,0,0],'a B':[0,0,1,0,0,0,0],'a C':[0,0,0,1,0,0,0],'a D':[0,0,0,0,1,0,0],'a E':[0,0,0,0,0,1,0],'a F':[0,0,0,1,1,0,0],'p A':[0,0,0,0,0,0,1],'p B':[0,0,0,0,0,0,0],'p C':[0,0,1,0,0,0,0],'p D':[0,0,0,0,0,0,0],'p E':[0,0,0,0,0,0,0],'p F':[0,0,0,0,0,0,0]})

Note: This is a much simplified version of my actual data.

a stands for Actual; p stands for Predicted; A - F represent a series of labels

I want to write a query that, for each row in my dataframe, returns True when: (all row values in "p columns" = 0 ) and (at least one row value in "a columns" = 1) i.e. for each row, p columns are fixed at 0 and at least 1 a column = 1.

Using the answers to "Pandas Dataframe Find Rows Where all Columns Equal" and "Compare two columns using pandas", I currently achieve this with & and any():

((df.iloc[:,6] == 0) & (df.iloc[:,7] == 0) & (df.iloc[:,8] == 0) & (df.iloc[:,9] == 0) & (df.iloc[:,10] == 0) & (df.iloc[:,11] == 0) & df.iloc[:,0:6].any(axis = 1) )

>>
0    False
1     True
2    False
3     True
4     True
5     True
6    False
dtype: bool

Is there a more succinct, readable way I can achieve this?

jezrael · Accepted Answer · 2017-03-07T12:08:45.920

You can use ~ to invert a boolean mask, with iloc to select columns by position:

print (~df.iloc[:,6:12].any(axis=1) & df.iloc[:,0:6].any(axis=1))
0    False
1     True
2    False
3     True
4     True
5     True
6    False
dtype: bool
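
As a sanity check, the shorter positional mask can be compared with the chained expression from the question. The following is a minimal sketch, assuming the "a" columns sit at positions 0-5 and the "p" columns at positions 6-11, as in the printed table; the names `verbose` and `short` are illustrative only.

```python
import pandas as pd

# Data from the question, with the "a" columns first so that positions
# 0-5 hold the actual labels and 6-11 hold the predicted labels.
df = pd.DataFrame({
    'a A': [0, 1, 0, 0, 0, 0, 0], 'a B': [0, 0, 1, 0, 0, 0, 0],
    'a C': [0, 0, 0, 1, 0, 0, 0], 'a D': [0, 0, 0, 0, 1, 0, 0],
    'a E': [0, 0, 0, 0, 0, 1, 0], 'a F': [0, 0, 0, 1, 1, 0, 0],
    'p A': [0, 0, 0, 0, 0, 0, 1], 'p B': [0, 0, 0, 0, 0, 0, 0],
    'p C': [0, 0, 1, 0, 0, 0, 0], 'p D': [0, 0, 0, 0, 0, 0, 0],
    'p E': [0, 0, 0, 0, 0, 0, 0], 'p F': [0, 0, 0, 0, 0, 0, 0],
})

# Verbose form from the question: one comparison per "p" column.
verbose = (
    (df.iloc[:, 6] == 0) & (df.iloc[:, 7] == 0) & (df.iloc[:, 8] == 0)
    & (df.iloc[:, 9] == 0) & (df.iloc[:, 10] == 0) & (df.iloc[:, 11] == 0)
    & df.iloc[:, 0:6].any(axis=1)
)

# Short form: iloc slices exclude the end position, so 6:12 covers
# positions 6 through 11 (all six "p" columns).
short = ~df.iloc[:, 6:12].any(axis=1) & df.iloc[:, 0:6].any(axis=1)

print(verbose.equals(short))
print(short.tolist())
```

Because the slice end is exclusive, a slice such as 6:11 would silently leave out the last "p" column, which is one reason position-based selection needs care when columns are added or reordered.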

Or use filter to select columns by name, with any to check for at least one True per row or all to check that all values per row are True.

The eq function compares each value with 0.

print (~df.filter(like='p').any(axis=1) & df.filter(like='a').any(axis=1))
0    False
1     True
2    False
3     True
4     True
5     True
6    False
dtype: bool

print (df.filter(like='p').eq(0).all(axis=1) & df.filter(like='a').any(axis=1))
0    False
1     True
2    False
3     True
4     True
5     True
6    False
dtype: bool
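
One caveat: filter(like=...) keeps every column whose name contains the given substring anywhere, so on the real (non-simplified) data a label that happens to contain "a" or "p" could be picked up by mistake. A possible variation, not part of the answer above, is to match on the prefix only. This sketch assumes every column name starts with "a " or "p " as in the question; `a_cols`, `p_cols` and `mask` are hypothetical names.

```python
import pandas as pd

# Data from the question.
df = pd.DataFrame({
    'a A': [0, 1, 0, 0, 0, 0, 0], 'a B': [0, 0, 1, 0, 0, 0, 0],
    'a C': [0, 0, 0, 1, 0, 0, 0], 'a D': [0, 0, 0, 0, 1, 0, 0],
    'a E': [0, 0, 0, 0, 0, 1, 0], 'a F': [0, 0, 0, 1, 1, 0, 0],
    'p A': [0, 0, 0, 0, 0, 0, 1], 'p B': [0, 0, 0, 0, 0, 0, 0],
    'p C': [0, 0, 1, 0, 0, 0, 0], 'p D': [0, 0, 0, 0, 0, 0, 0],
    'p E': [0, 0, 0, 0, 0, 0, 0], 'p F': [0, 0, 0, 0, 0, 0, 0],
})

# Select each group by its prefix (assumed naming scheme: "a <label>"
# for actual values, "p <label>" for predicted values).
a_cols = df.columns[df.columns.str.startswith('a ')]
p_cols = df.columns[df.columns.str.startswith('p ')]

# True where every predicted value is 0 and at least one actual value is 1.
mask = df[p_cols].eq(0).all(axis=1) & df[a_cols].eq(1).any(axis=1)

print(mask.tolist())
# [False, True, False, True, True, True, False]
```

The mask can then be used directly for row selection, for example df[mask], and it does not depend on the order of the columns.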

edited Mar 07 '17 at 12:08

answered Mar 07 '17 at 12:00

Which one would be the most "Pythonic" / most used in Pandas? – Chuck Mar 07 '17 at 12:14
It is up to you, but I prefer the `filter` solutions because they are more dynamic: if columns are added, the solution still works. – jezrael Mar 07 '17 at 12:15
Ok great. Thanks! – Chuck Mar 07 '17 at 12:15
