I have a dataframe like this:

import numpy as np
import pandas as pd

cols = ['a', 'b']
df = pd.DataFrame(data=[[np.nan, -1, np.nan, 34], [-32, 1, -4, np.nan], [4, 5, 41, 14], [3, np.nan, 1, np.nan]], columns=['a', 'b', 'c', 'd'])

I want to retrieve all rows where the columns 'a' and 'b' are non-negative, but if one or both of them are missing, I still want to keep the row.

The result should be

   a   b   c   d
2  4   5  41  14
3  3 NaN   1 NaN

I've tried this, but it doesn't give the expected result:

df[(df[cols]>0).all(axis=1) | df[cols].isnull().any(axis=1)]
dooms

1 Answer

IIUC, you actually want

>>> df[((df[cols] > 0) | df[cols].isnull()).all(axis=1)]
   a   b   c   d
2  4   5  41  14
3  3 NaN   1 NaN

Right now you're getting the rows where "they're all positive" or "any are null". You want the rows where "they're all (positive or null)". (Replace > 0 with >= 0 for nonnegativity.)
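To see the difference concretely, here is a small sketch that builds both masks on the question's frame and lists which row labels each one keeps. The names `attempt`, `fixed`, `attempt_rows` and `fixed_rows` are only illustrative, and I'm assuming the NaN in the question means `np.nan`.

```python
import numpy as np
import pandas as pd

cols = ['a', 'b']
df = pd.DataFrame(
    data=[[np.nan, -1, np.nan, 34],
          [-32, 1, -4, np.nan],
          [4, 5, 41, 14],
          [3, np.nan, 1, np.nan]],
    columns=['a', 'b', 'c', 'd'],
)

# Original attempt: (every value positive) OR (any value missing),
# where each half is reduced to one boolean per row before combining.
attempt = (df[cols] > 0).all(axis=1) | df[cols].isnull().any(axis=1)

# Corrected: combine per cell first (positive OR missing), then require
# that every cell in the row passes.
fixed = ((df[cols] > 0) | df[cols].isnull()).all(axis=1)

attempt_rows = df.index[attempt].tolist()
fixed_rows = df.index[fixed].tolist()
print(attempt_rows)  # [0, 2, 3]
print(fixed_rows)    # [2, 3]
```

Row 0 slips through the original attempt because 'a' is missing, even though 'b' is -1.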

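And since any comparison with NaN evaluates to False (so NaN is never <= 0), we could simplify by flipping the condition, and use something like the following (use < 0 instead of <= 0 if zeros should be kept)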

>>> df[~(df[cols] <= 0).any(axis=1)]
   a   b   c   d
2  4   5  41  14
3  3 NaN   1 NaN
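As a quick sanity check, this sketch compares the explicit "non-negative or null" mask with the flipped one on the question's frame, and then shows how `<= 0` and `< 0` differ once a zero shows up. The `zeros` frame is a made-up example that is not part of the question, and the variable names are just for illustration.

```python
import numpy as np
import pandas as pd

cols = ['a', 'b']
df = pd.DataFrame(
    data=[[np.nan, -1, np.nan, 34],
          [-32, 1, -4, np.nan],
          [4, 5, 41, 14],
          [3, np.nan, 1, np.nan]],
    columns=['a', 'b', 'c', 'd'],
)

# Explicit form: every cell is non-negative or missing.
explicit = ((df[cols] >= 0) | df[cols].isnull()).all(axis=1)

# Flipped form: no cell is negative. NaN < 0 is False, so missing
# values never disqualify a row.
flipped = ~(df[cols] < 0).any(axis=1)

same_mask = explicit.equals(flipped)
kept_rows = df.index[flipped].tolist()

# Hypothetical one-row frame with a zero, to show the strict/non-strict difference.
zeros = pd.DataFrame({'a': [0.0], 'b': [np.nan]})
kept_with_le = bool((~(zeros[cols] <= 0).any(axis=1)).iloc[0])  # zero counts as a failure
kept_with_lt = bool((~(zeros[cols] < 0).any(axis=1)).iloc[0])   # zero is allowed

print(same_mask, kept_rows)        # True [2, 3]
print(kept_with_le, kept_with_lt)  # False True
```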
DSM