
I have a dataframe that looks like:

   a A  a B  a C  a D  a E  a F  p A  p B  p C  p D  p E  p F
0    0    0    0    0    0    0    0    0    0    0    0    0
1    1    0    0    0    0    0    0    0    0    0    0    0
2    0    1    0    0    0    0    0    0    1    0    0    0
3    0    0    1    0    0    1    0    0    0    0    0    0
4    0    0    0    1    0    1    0    0    0    0    0    0
5    0    0    0    0    1    0    0    0    0    0    0    0
6    0    0    0    0    0    0    1    0    0    0    0    0

import pandas as pd

# 'a' columns are listed first so the column order matches the frame shown above
df = pd.DataFrame({'a A':[0,1,0,0,0,0,0],'a B':[0,0,1,0,0,0,0],'a C':[0,0,0,1,0,0,0],'a D':[0,0,0,0,1,0,0],'a E':[0,0,0,0,0,1,0],'a F':[0,0,0,1,1,0,0],
                   'p A':[0,0,0,0,0,0,1],'p B':[0,0,0,0,0,0,0],'p C':[0,0,1,0,0,0,0],'p D':[0,0,0,0,0,0,0],'p E':[0,0,0,0,0,0,0],'p F':[0,0,0,0,0,0,0]})

Note: This is a much simplified version of my actual data.

a stands for Actual; p stands for Predicted; A - F represent a series of labels

I want to write a query that, for each row in my dataframe, returns True when all values in the "p" columns are 0 and at least one value in the "a" columns is 1.

Using the answers to "Pandas Dataframe Find Rows Where all Columns Equal" and "Compare two columns using pandas", I currently achieve this with & and .any():

((df.iloc[:,6] == 0) & (df.iloc[:,7] == 0) & (df.iloc[:,8] == 0) & (df.iloc[:,9] == 0) & (df.iloc[:,10] == 0) & (df.iloc[:,11] == 0) & df.iloc[:,0:6].any(axis = 1) )

>>
0    False
1     True
2    False
3     True
4     True
5     True
6    False
dtype: bool

Is there a more succinct, readable way I can achieve this?

Chuck

1 Answer


You can use ~ to invert a boolean mask, with iloc to select columns by position:

print (~df.iloc[:,6:12].any(axis=1) & df.iloc[:,0:6].any(axis=1))
0    False
1     True
2    False
3     True
4     True
5     True
6    False
dtype: bool
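
Because iloc selects by position, the result depends on the column order of the frame. The following is a minimal sketch, assuming the data from the question and assuming that sorting the column names puts the six "a" columns before the six "p" columns; the names n_a and mask are only illustrative.

```python
import pandas as pd

# Data from the question; the dict order here puts the 'p' columns first.
df = pd.DataFrame({
    'p A': [0, 0, 0, 0, 0, 0, 1], 'p B': [0, 0, 0, 0, 0, 0, 0],
    'p C': [0, 0, 1, 0, 0, 0, 0], 'p D': [0, 0, 0, 0, 0, 0, 0],
    'p E': [0, 0, 0, 0, 0, 0, 0], 'p F': [0, 0, 0, 0, 0, 0, 0],
    'a A': [0, 1, 0, 0, 0, 0, 0], 'a B': [0, 0, 1, 0, 0, 0, 0],
    'a C': [0, 0, 0, 1, 0, 0, 0], 'a D': [0, 0, 0, 0, 1, 0, 0],
    'a E': [0, 0, 0, 0, 0, 1, 0], 'a F': [0, 0, 0, 1, 1, 0, 0],
})

# Sort columns by name so 'a ...' occupies positions 0-5 and 'p ...' 6-11.
df = df.sort_index(axis=1)

n_a = 6  # assumed number of 'a' columns
# No 'p' value set in the row, and at least one 'a' value set.
mask = ~df.iloc[:, n_a:].any(axis=1) & df.iloc[:, :n_a].any(axis=1)
print(mask)
```

Passing axis=1 by keyword keeps the call working on recent pandas versions, where any no longer accepts the axis positionally.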

Or use filter to select columns by name, with any to check whether at least one value per row is True, or all to check whether every value per row is True.

The eq function compares each value with 0.

print (~df.filter(like='p').any(axis=1) & df.filter(like='a').any(axis=1))
0    False
1     True
2    False
3     True
4     True
5     True
6    False
dtype: bool

print (df.filter(like='p').eq(0).all(axis=1) & df.filter(like='a').any(axis=1))
0    False
1     True
2    False
3     True
4     True
5     True
6    False
dtype: bool
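
Note that filter(like='p') keeps every column whose name contains 'p' anywhere, so with real label names it may pick up unintended columns. A possible variation, assuming every column name starts with the prefix 'a ' or 'p ' as in the question, selects by prefix instead; the names actual, predicted and mask are only illustrative.

```python
import pandas as pd

# Data from the question.
df = pd.DataFrame({
    'p A': [0, 0, 0, 0, 0, 0, 1], 'p B': [0, 0, 0, 0, 0, 0, 0],
    'p C': [0, 0, 1, 0, 0, 0, 0], 'p D': [0, 0, 0, 0, 0, 0, 0],
    'p E': [0, 0, 0, 0, 0, 0, 0], 'p F': [0, 0, 0, 0, 0, 0, 0],
    'a A': [0, 1, 0, 0, 0, 0, 0], 'a B': [0, 0, 1, 0, 0, 0, 0],
    'a C': [0, 0, 0, 1, 0, 0, 0], 'a D': [0, 0, 0, 0, 1, 0, 0],
    'a E': [0, 0, 0, 0, 0, 1, 0], 'a F': [0, 0, 0, 1, 1, 0, 0],
})

# Select columns by name prefix (assumed prefixes: 'a ' and 'p ').
actual = df.loc[:, df.columns.str.startswith('a ')]
predicted = df.loc[:, df.columns.str.startswith('p ')]

# All predicted values are 0 and at least one actual value is 1.
mask = predicted.eq(0).all(axis=1) & actual.eq(1).any(axis=1)
print(mask)
```

Since the columns are selected by name, this does not depend on the column order of the frame.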
jezrael