
I have a pandas dataframe like this

import numpy as np
import pandas as pd

df = pd.DataFrame(data=[[21, 1], [32, -4], [-4, 14], [3, 17], [-7, np.nan]], columns=['a', 'b'])
df

I want to remove all rows that have a negative value in any of a list of columns, while keeping rows that contain NaN.

In my example there are only 2 columns, but my dataset has more, so I can't do it one by one.
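For reference, a minimal sketch of the intended filter (assuming the goal is: drop a row only when some listed column is strictly negative, so NaN rows survive the check):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(data=[[21, 1], [32, -4], [-4, 14], [3, 17], [-7, np.nan]],
                  columns=['a', 'b'])

cols = ['a', 'b']  # the columns to check
# (df[cols] < 0) evaluates to False for NaN, so NaN alone never drops a row
mask = ~(df[cols] < 0).any(axis=1)
result = df[mask]
```

Here only rows 0 and 3 survive: every other row has a genuine negative in one of the checked columns.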

dooms

3 Answers


If you want to apply it to all columns, combine df[df > 0] with dropna():

>>> df[df > 0].dropna()
    a   b
0  21   1
3   3  17

If you know which columns to apply it to, restrict it to those columns with df[df[cols] > 0]:

>>> cols = ['b']
>>> df[cols] = df[df[cols] > 0][cols]
>>> df.dropna()
    a   b
0  21   1
2  -4  14
3   3  17
ComputerFellow

I've found you can simplify the answer by just doing this:

>>> cols = ['b']
>>> df = df[df[cols] > 0]

dropna() is not an in-place method, so you have to store the result.

>>> df = df.dropna()
rgahan
  • This code results in an empty dataframe: column 'a' is replaced by all NaNs because the filter doesn't include it. I could modify this answer to fix that, but then it would be pretty much the same as the other answer. – Zev Apr 20 '20 at 18:48
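A small repro of the failure mode described in the comment (a two-row frame is assumed just for illustration):

```python
import pandas as pd

df = pd.DataFrame(data=[[21, 1], [32, -4]], columns=['a', 'b'])
cols = ['b']

# The boolean frame df[cols] > 0 only covers column 'b'; indexing with it
# masks every column it doesn't mention, so 'a' becomes all NaN
df = df[df[cols] > 0]

# Every row now has NaN in 'a', so dropna() removes everything
df = df.dropna()
```

After these two steps the dataframe is empty, which is exactly the problem Zev points out.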

I was looking for a solution that doesn't change the dtype (which will happen if NaNs are mixed in with ints, as in the answers that use dropna). Since the questioner already had a NaN in their data, that may not be an issue for them. I went with this solution, which preserves the int64 dtype. Here it is with my sample data:

df = pd.DataFrame(data={'a':[0, 1, 2], 'b': [-1,0,1], 'c': [-2, -1, 0]})
columns = ['b', 'c']
filter_ = (df[columns] >= 0).all(axis=1)
df[filter_]


   a  b  c
2  2  1  0
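To see the dtype point concretely, here is a sketch contrasting the two approaches on the same sample data (variable names are mine, for illustration):

```python
import pandas as pd

df = pd.DataFrame(data={'a': [0, 1, 2], 'b': [-1, 0, 1], 'c': [-2, -1, 0]})

# Mask-based row selection never introduces NaN, so int64 is preserved
filtered = df[(df[['b', 'c']] >= 0).all(axis=1)]

# By contrast, df[df >= 0] inserts NaN where the condition fails, which
# coerces the affected int columns to float64 before dropna() even runs
coerced = df[df >= 0]
```

`filtered['b']` stays int64, while `coerced['b']` has been upcast to float64 by the inserted NaNs.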
Zev