
When converting a pd.DataFrame to a nested list, some values come out imprecise.

Example row of the pd.DataFrame:

1.0 -3.0 -3.0 0.01 -3.0 -1.0

pd.DataFrame.values.tolist() of this row:

[1.0, -3.0, -3.0, 0.010000000000000009, -3.0, -1.0]

How can this be explained and avoided?

Max J.
  • See [Is floating point math broken](https://stackoverflow.com/questions/588004/is-floating-point-math-broken) – G. Anderson Dec 09 '20 at 19:23
  • "some values are unprecise"--> Hmm 0.010000000000000009 matches 0.01 to 1 part in 10^15. Max J., Are you looking for infinite precision? What level of imprecision can the task tolerate? – chux - Reinstate Monica Dec 10 '20 at 01:48

1 Answer


The list shows the value that is actually stored. When you display the pd.DataFrame, the output is rounded:

import pandas as pd

df = pd.DataFrame({'a': [1.0, -3.0, -3.0, 0.010000000000000009, -3.0, -1.0]})
print(df)

    a
0   1.00
1   -3.00
2   -3.00
3   0.01
4   -3.00
5   -1.00
df.values.tolist()
# [[1.0], [-3.0], [-3.0], [0.010000000000000009], [-3.0], [-1.0]]

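So tolist() is not at fault: it returns the stored values unchanged. It is the display of the pd.DataFrame that rounds the numbers.

Use pandas.set_option("display.precision", x) to set the number of decimal places shown when a DataFrame is displayed. This only affects the display, not the stored values.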
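As a rough sketch of both points, the snippet below raises the display precision so the stored value becomes visible, and then rounds explicitly before converting to a list. The rounding step is not part of the answer above; it is one possible way to get a "clean" list, and it assumes two decimal places are enough for the task.

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, -3.0, -3.0, 0.010000000000000009, -3.0, -1.0]})

# Temporarily raise the display precision; the extra digits of the
# stored value now show up in the printed frame as well.
with pd.option_context("display.precision", 18):
    print(df)

# Display settings never change the data: tolist() returns what is stored.
raw = df.values.tolist()
print(raw)

# Assumption: two decimals are sufficient. Rounding the data itself
# (not just the display) removes the trailing digits from the list.
rounded = df.round(2).values.tolist()
print(rounded)
```

Note that rounding changes the data, so it is only appropriate when the discarded digits really are noise for the task at hand.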

Z Li