This may be a duplicate, but I couldn't find an existing question that comes with an explanation.

I have a DataFrame and I want to remove every row that contains the value 999.9, but my attempt doesn't do that:

import pandas as pd

df1 = pd.DataFrame({'a': [999.9, 999.9, 1, 3, 2], 'b': [999.9, 1, 999.9, 1, 2]})
df1 = df1.loc[(df1 != 999.9).any(axis=1)]
print(df1)

In this case, I would like to drop the 1st and 2nd rows.

I'mahdi
aman-aman
  • Your condition currently says keep rows (`axis=1`) where _any_ value does not equal (`!=`) 999.9. The first row is dropped because both a and b equal 999.9, but the second row is kept because b != 999.9, and the third row is kept because a != 999.9. Only a row where _both_ a and b are 999.9 is removed. `df1.loc[(df1 != 999.9).all(axis=1)]` (keep rows where all values are not equal to 999.9) or the equivalent `df1.loc[~(df1 == 999.9).any(axis=1)]` (keep rows where there are __not__ any values which equal 999.9) will work. – Henry Ecker Oct 09 '21 at 19:51
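
To make the comment above concrete, here is a minimal sketch that compares the three conditions on the DataFrame from the question. The variable names `kept_any`, `kept_all`, and `kept_not_any` are made up for illustration and do not appear in the original post.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [999.9, 999.9, 1, 3, 2], 'b': [999.9, 1, 999.9, 1, 2]})

# Condition from the question: keep a row if ANY cell differs from 999.9.
kept_any = df1.loc[(df1 != 999.9).any(axis=1)]

# Suggested fix: keep a row only if ALL cells differ from 999.9 ...
kept_all = df1.loc[(df1 != 999.9).all(axis=1)]
# ... or, equivalently, if NOT ANY cell equals 999.9.
kept_not_any = df1.loc[~(df1 == 999.9).any(axis=1)]

print(list(kept_any.index))            # [1, 2, 3, 4]
print(list(kept_all.index))            # [3, 4]
print(kept_all.equals(kept_not_any))   # True
```

The last two forms should always agree, since "every cell is not 999.9" and "no cell is 999.9" are the same statement (De Morgan's law).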

2 Answers

Try:

df1 = pd.DataFrame({'a':[999.9,999.9,1,3,2], 'b':[999.9,1,999.9,1, 2]})
print(df1[~(df1==999.9).any(axis=1)])

Output:

     a    b
3  3.0  1.0
4  2.0  2.0
Muhammad Hassan

You test `df1 != 999.9` and then call `.any(axis=1)`, so only the first row is dropped: every other row has at least one column that is not 999.9.

Try this:

>>> mask = (df1 == 999.9).any(axis=1)
>>> df1[~mask]


# for more explanation

>>> df1 == 999.9
       a      b
0   True   True
1   True  False
2  False   True
3  False  False
4  False  False

In your solution:

>>> (df1 != 999.9)
       a      b
0  False  False
1  False   True
2   True  False
3   True   True
4   True   True

>>> (df1 != 999.9).any(axis=1)  # one result per row: is any column != 999.9?
0    False
1     True
2     True
3     True
4     True
dtype: bool

>>> (df1 != 999.9).any(axis=0)  # one result per column: is any row != 999.9?
a    True
b    True
dtype: bool
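
If you need this filter in more than one place, the mask steps above could be wrapped in a small helper. This is only a sketch: `drop_rows_with_value` is a hypothetical name, not a pandas function, and it relies on exact equality, which works here because the 999.9 values are stored exactly as written.

```python
import pandas as pd


def drop_rows_with_value(df: pd.DataFrame, value) -> pd.DataFrame:
    """Return a copy of df without the rows in which any column equals value."""
    # True for every row that contains `value` in at least one column.
    mask = df.eq(value).any(axis=1)
    # Keep only the rows where the mask is False.
    return df[~mask].copy()


df1 = pd.DataFrame({'a': [999.9, 999.9, 1, 3, 2], 'b': [999.9, 1, 999.9, 1, 2]})
result = drop_rows_with_value(df1, 999.9)
print(result)
```

On the DataFrame from the question this keeps only the rows with index 3 and 4.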
I'mahdi