I ran into behaviour I did not expect while working with a pandas DataFrame. I compare the values of a 2D DataFrame against a threshold to check whether they are smaller than 14.4. If I check the entire DataFrame at once, the result is wrong for some values. But if I check the affected cell explicitly, the result is correct.

print(df.loc[5,397])
out: 14.4  #--> it's actually 14.3999996185

print(df.loc[5,397] < 14.4)
out: True

print(df.loc[4:6,396:398] < 14.4)
out:
     396    397    398
4  False  False  False
5  False  False  False   #[5,397] should be True!
6  False  False  False

However, when I try to reproduce the error with a small example, I get the correct result:

import numpy
import pandas as pd

data = numpy.array([[15,15,15], [15,14.3999996185,15], [15,15,15]])
df = pd.DataFrame(data)

print(df.loc[1,1])
out: 14.3999996185
print(df.loc[1,1] < 14.4)
out: True
print(df < 14.4)
out:
       0      1      2
0  False  False  False
1  False   True  False
2  False  False  False

Thank you

Martin
  • I think the problem is your index... check `df.index` – ansev Feb 28 '22 at 18:31
  • You cannot use exact comparisons with floating point numbers. They are approximations. The number is probably not 14.3999996185, but 14.3999961850001 or something. You will have to use a "close to" function. – Tim Roberts Feb 28 '22 at 18:31
  • What is the output of `df.loc[4:6,396:398].to_dict()`? – mozway Feb 28 '22 at 18:36
  • @Tim but OP is using inequality here (ie. **not** an exact comparison), so the exact value shouldn't matter – mozway Feb 28 '22 at 18:40
  • The output as dict is: `{396: {4: 16.066667556762695, 5: 16.200000762939453, 6: 16.600000381469727}, 397: {4: 16.33333396911621, 5: 14.399999618530273, 6: 16.600000381469727}, 398: {4: 16.399999618530273, 5: 16.933332443237305, 6: 16.866666793823242}}` – Martin Feb 28 '22 at 18:45
  • If `print(df.loc[1,1])` in the second set prints 14.3999996185, why does `print(df.loc[5,397])` in the first set print 14.4? Something doesn't add up. On a whim, if you use `<= 14.4`, does it show up? – Tim Roberts Feb 28 '22 at 18:47
  • Can't replicate the problem. – Scott Boston Feb 28 '22 at 18:49
  • I guess it's something related to https://stackoverflow.com/questions/43217916/pandas-data-precision. But I think this is just an option for display... – Martin Feb 28 '22 at 18:50
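
One possible explanation, consistent with the `to_dict()` output in the comments: the original DataFrame appears to hold float32 values (14.399999618530273 is what 14.4 becomes when stored as float32), whereas the reproduction builds a float64 array from Python floats. When a whole float32 frame is compared with the Python float 14.4, the threshold may itself be rounded to float32, so both sides are equal and `<` yields False. The sketch below is an assumption about the cause, not a confirmed diagnosis. `df32` and `mask64` are illustrative names, and the result of the direct float32 comparison can depend on the NumPy and pandas versions, so no output is claimed for it.

```python
import numpy as np
import pandas as pd

# Assumption: the real data was loaded as float32 (e.g. from a file format
# that stores single-precision values). We mimic that here.
values = [[15.0, 15.0, 15.0], [15.0, 14.4, 15.0], [15.0, 15.0, 15.0]]
df32 = pd.DataFrame(np.array(values, dtype=np.float32))

# Check which dtype the frame actually holds.
print(df32.dtypes.unique())

# Widen the single cell to a Python float (float64): this exposes the value
# that float32 really stores for 14.4.
stored = float(df32.loc[1, 1])
print(stored)          # 14.399999618530273
print(stored < 14.4)   # True, because both operands are now float64

# Direct comparison on the float32 frame. The scalar may be rounded to
# float32 first, in which case the cell equals the threshold and is not "<".
print(df32 < 14.4)

# Upcasting the whole frame to float64 makes the comparison unambiguous.
mask64 = df32.astype(np.float64) < 14.4
print(mask64)
```

If `dtypes` reports float32 for the original frame, upcasting with `astype` before the comparison, or comparing against a threshold with a small tolerance as suggested in the comments, should make the cell-wise and frame-wise checks agree.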
