Drop rows in pandas dataframe based on columns value

Question

I have a dataframe like this:

import numpy as np
import pandas as pd

cols = ['a', 'b']
df = pd.DataFrame(data=[[np.nan, -1, np.nan, 34], [-32, 1, -4, np.nan], [4, 5, 41, 14], [3, np.nan, 1, np.nan]], columns=['a', 'b', 'c', 'd'])

I want to retrieve all rows where the columns 'a' and 'b' are non-negative, treating a missing value in either or both columns as acceptable rather than as a reason to drop the row.

The result should be

   a   b   c   d
2  4   5  41  14
3  3 NaN   1 NaN

I've tried this but it doesn't give the expected result.

df[(df[cols]>0).all(axis=1) | df[cols].isnull().any(axis=1)]

score 5 · Accepted Answer · answered Dec 13 '15 at 21:29

IIUC, you actually want

>>> df[((df[cols] > 0) | df[cols].isnull()).all(axis=1)]
   a   b   c   d
2  4   5  41  14
3  3 NaN   1 NaN

Right now you're getting "if they're all positive" or "any are null". You want "if they're all (positive or null)". (Replace > 0 with >=0 for nonnegativity.)
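To make the difference concrete, here is a minimal sketch that rebuilds the question's frame (assuming `NaN` there means `np.nan`) and compares the two row masks. The names `positive`, `missing`, `attempt` and `wanted` are only illustrative.

```python
import numpy as np
import pandas as pd

cols = ["a", "b"]
df = pd.DataFrame(
    data=[[np.nan, -1, np.nan, 34],
          [-32, 1, -4, np.nan],
          [4, 5, 41, 14],
          [3, np.nan, 1, np.nan]],
    columns=["a", "b", "c", "d"],
)

positive = df[cols] > 0      # element-wise; NaN > 0 evaluates to False
missing = df[cols].isnull()  # element-wise; True where the cell is NaN

# Original attempt: "every cell positive" OR "some cell missing", decided per row.
attempt = positive.all(axis=1) | missing.any(axis=1)

# Corrected: each cell is "positive or missing", and that must hold for every cell.
wanted = (positive | missing).all(axis=1)

print(attempt.tolist())  # [True, False, True, True]
print(wanted.tolist())   # [False, False, True, True]
```

Row 0 is where the two disagree: 'a' is missing, so the original attempt keeps the row even though 'b' is -1.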

And since any comparison with NaN is False (so NaN <= 0 is False as well), we could simplify by flipping the condition and use something like

>>> df[~(df[cols] <= 0).any(axis=1)]
   a   b   c   d
2  4   5  41  14
3  3 NaN   1 NaN
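For the non-negative case the question asks about, the flipped test would presumably use `< 0` instead of `<= 0`, so that zeros are kept. A small sketch on a made-up frame `demo` (hypothetical data, not from the question) that includes a zero:

```python
import numpy as np
import pandas as pd

cols = ["a", "b"]
# Hypothetical data, chosen to include a zero and some missing values.
demo = pd.DataFrame({
    "a": [0, -1, 2, np.nan],
    "b": [3, 4, np.nan, -5],
    "c": [1, 1, 1, 1],
})

# Comparisons against NaN are False, so a missing cell never counts as negative.
has_negative = (demo[cols] < 0).any(axis=1)
kept = demo[~has_negative]

# The explicit "non-negative or missing" form selects the same rows.
explicit = demo[((demo[cols] >= 0) | demo[cols].isnull()).all(axis=1)]

print(kept.index.tolist())  # [0, 2]
```

The flipped form is shorter, but it relies on NaN comparisons being False; the explicit form states the intent directly.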

Ha! I was going to answer with exactly the flipped version! – Andy Hayden, Dec 13 '15 at 21:32
