Pandas DataFrame Floating Point Comparison Issue

Question

I have a dataset that looks like this, where the third column is derived by dividing the first column by the second:

    A_CLOSE_PRICE   B_CLOSE_PRICE   A_CLOSE_PRICE/B_CLOSE_PRICE
0          113.55            0.00                           inf
1           97.85           80.00                      1.223125
2           60.00           70.00                      0.857143
3           51.65           51.65                      1.000000
4           53.50             NaN                           NaN
5             NaN         1649.60                           NaN
6           40.00           40.50                      0.987654
7            1.10            1.00                      1.100000

As I want to display the rows containing more than a 10% difference, I run this:

(df['A_CLOSE_PRICE/B_CLOSE_PRICE'] - 1 ).abs() > 0.1

But the last row, as shown below, returns "True" instead of "False", which looks to me like a floating-point issue. Does anyone know the proper way to handle this so I get the correct results?

0     True
1     True
2     True
3    False
4    False
5    False
6    False
7     True

`1.100000-1` is `0.10000000000000009` instead of `0.1` – Davinder Singh Mar 28 '21 at 09:03
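
The comment above can be checked in plain Python, without pandas. This is only a minimal sketch: the variable names are illustrative, and `math.isclose` is used with its default relative tolerance, which is an assumption about what counts as "equal to 10%" for this data.

```python
import math

# 1.1 has no exact binary representation, so the subtraction overshoots 0.1.
diff = 1.1 - 1
print(repr(diff))  # 0.10000000000000009

# Naive strict comparison: the rounding error pushes the value over the threshold.
exceeds_naive = diff > 0.1

# Tolerance-aware comparison: values that are "almost equal" to 0.1 do not count.
exceeds_tolerant = diff > 0.1 and not math.isclose(diff, 0.1)

print(exceeds_naive, exceeds_tolerant)  # True False
```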

anky · Accepted Answer · 2021-03-28T10:55:48.723


Yes, you have a floating-point issue. I think you can use the built-in df.pct_change directly with axis=1, together with np.isclose to handle the floating-point comparison:

import numpy as np
s = df[['B_CLOSE_PRICE','A_CLOSE_PRICE']].pct_change(axis=1).iloc[:,-1].abs()
# strictly above 10%, but not when s only differs from 0.1 by rounding error
s.gt(0.1) & ~np.isclose(s-0.1,0)

0     True
1     True
2     True
3    False
4    False
5    False
6    False
7    False
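
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the question's data and compares the naive check with a tolerance-aware one. It computes the ratio by direct division rather than through pct_change, which is a simplification on my part and not the exact code from the answer. The names `ratio`, `diff`, `naive` and `tolerant` are illustrative, and the tolerances are simply the `np.isclose` defaults, which may need tuning for other data.

```python
import numpy as np
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "A_CLOSE_PRICE": [113.55, 97.85, 60.00, 51.65, 53.50, np.nan, 40.00, 1.10],
    "B_CLOSE_PRICE": [0.00, 80.00, 70.00, 51.65, np.nan, 1649.60, 40.50, 1.00],
})

ratio = df["A_CLOSE_PRICE"] / df["B_CLOSE_PRICE"]
diff = (ratio - 1).abs()

# Naive check: row 7 is flagged because 1.1 - 1 is slightly above 0.1.
naive = diff > 0.1

# Tolerance-aware check: drop rows whose difference is numerically equal to 0.1.
# NaN rows stay False because any comparison with NaN is False.
tolerant = (diff > 0.1) & ~np.isclose(diff, 0.1)

print(pd.DataFrame({"naive": naive, "tolerant": tolerant}))
```

Only row 7 differs between the two columns; the inf and NaN rows behave the same way in both checks.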

edited Mar 28 '21 at 10:55

answered Mar 28 '21 at 09:04


Running this command computes the percentage difference using B_CLOSE_PRICE/A_CLOSE_PRICE as opposed to A_CLOSE_PRICE/B_CLOSE_PRICE, which explains the different outcome. The same precision issue still persists when I run "df[['A_CLOSE_PRICE','B_CLOSE_PRICE']].pct_change(periods=-1,axis=1).iloc[:,0].abs() > 0.1" – louis xie Mar 28 '21 at 10:46
@louisxie Ahh yes, sorry, you can check now. I updated the answer based on an existing solution for handling floating-point comparison with an almost-equality check; the explanation is here: https://stackoverflow.com/questions/5595425/what-is-the-best-way-to-compare-floats-for-almost-equality-in-python – anky Mar 28 '21 at 10:55
