Remove all rows having a specific value in dataframe

Question

This might be a duplicate, but I couldn't find an existing question with an explanation.

I have a dataframe and I want to remove all rows containing the value 999.9, but I couldn't get it to work.

import pandas as pd

df1 = pd.DataFrame({'a':[999.9,999.9,1,3,2], 'b':[999.9,1,999.9,1, 2]})
df1 = df1.loc[(df1 != 999.9).any(axis=1)]  # attempt that does not give the desired result
print(df1)

In this case, I would like to drop the 1st and 2nd rows.

Your condition currently says keep rows (`axis=1`) where _any_ value does not equal (`!=`) 999.9. Row 1 is kept because b != 999.9, and row 2 is kept because a != 999.9. Only row 0, where _both_ a and b are 999.9, is removed. `df1.loc[(df1 != 999.9).all(axis=1)]` (keep rows where all values are not equal to 999.9) or the equivalent `df1.loc[~(df1 == 999.9).any(axis=1)]` (keep rows where there are __not__ any values which equal 999.9) will work. – Henry Ecker, Oct 09 '21 at 19:51
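As a minimal sketch of the equivalence described in the comment above, the two masks can be built side by side and compared. The variable names `keep_all` and `keep_not_any` are made up for illustration; the data is the dataframe from the question.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [999.9, 999.9, 1, 3, 2], 'b': [999.9, 1, 999.9, 1, 2]})

# Keep a row only if every value in it differs from 999.9 ...
keep_all = (df1 != 999.9).all(axis=1)
# ... which is the same as: no value in the row equals 999.9.
keep_not_any = ~(df1 == 999.9).any(axis=1)

# Both masks select exactly the same rows.
assert keep_all.equals(keep_not_any)

result = df1.loc[keep_all]
print(result)
```

Either form should work; which one reads better is mostly a matter of taste.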

score 2 · Answer 1 · answered Oct 09 '21 at 13:24

Try:

df1 = pd.DataFrame({'a':[999.9,999.9,1,3,2], 'b':[999.9,1,999.9,1, 2]})
print(df1[~(df1==999.9).any(axis=1)])

Output:

     a    b
3  3.0  1.0
4  2.0  2.0

answered Oct 09 '21 at 13:24

Muhammad Hassan

Thanks. It worked. Could you explain axis=1 in the solution? – aman-aman Oct 09 '21 at 13:25
Read this: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.any.html – Muhammad Hassan Oct 09 '21 at 13:29
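For the `axis=1` question above, a small sketch (using the question's data, with illustrative variable names `hits`, `per_row` and `per_column`) may help: `axis=1` reduces across the columns and yields one boolean per row, while `axis=0` reduces down the rows and yields one boolean per column.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [999.9, 999.9, 1, 3, 2], 'b': [999.9, 1, 999.9, 1, 2]})
hits = df1 == 999.9  # boolean dataframe, True where a cell equals 999.9

# axis=1: combine the values of each row, giving one boolean per row.
per_row = hits.any(axis=1)
# axis=0 (the default): combine the values of each column, giving one boolean per column.
per_column = hits.any(axis=0)

print(per_row)     # indexed by row labels 0..4
print(per_column)  # indexed by column names 'a' and 'b'
```

Because rows are being filtered, the mask needs one entry per row, which is why the answer uses `axis=1`.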

I'mahdi · Accepted Answer · 2021-10-09T13:31:43.777

Your condition checks `df1 != 999.9` combined with `any`. With that check only the first row is deleted, because every other row has at least one column that is != 999.9.

Try this:

>>> mask = (df1 == 999.9).any(axis=1)
>>> df1[~mask]


# for more explanation

>>> df1 == 999.9
       a      b
0   True   True
1   True  False
2  False   True
3  False  False
4  False  False

in your solution:

>>> (df1 != 999.9)
       a      b
0  False  False
1  False   True
2   True  False
3   True   True
4   True   True

>>> (df1 != 999.9).any(axis=1)  # checks across each row
0    False
1     True
2     True
3     True
4     True
dtype: bool

>>> (df1 != 999.9).any(axis=0)  # checks down each column
a    True
b    True
dtype: bool
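The mask approach from this answer could be wrapped in a small reusable function. This is only a sketch: the name `drop_rows_with_value` and its optional `columns` parameter are hypothetical and not part of pandas.

```python
import pandas as pd

def drop_rows_with_value(df, value, columns=None):
    """Return a copy of df without the rows where any checked column equals value.

    Hypothetical helper; `columns` (an assumption for illustration)
    restricts the check to a subset of columns.
    """
    checked = df if columns is None else df[columns]
    mask = (checked == value).any(axis=1)  # True for rows that contain the value
    return df.loc[~mask].copy()

df1 = pd.DataFrame({'a': [999.9, 999.9, 1, 3, 2], 'b': [999.9, 1, 999.9, 1, 2]})

cleaned = drop_rows_with_value(df1, 999.9)                 # check all columns
only_a = drop_rows_with_value(df1, 999.9, columns=['a'])   # check column 'a' only
print(cleaned)
print(only_a)
```

Exact `==` comparison works here because 999.9 is a sentinel entered as the same literal everywhere. For floats that come out of a computation, a tolerance-based comparison such as `numpy.isclose` may be a safer choice.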
