I'm trying to load and extract data from a CSV with pandas and I'm noticing that it is changing the numbers loaded. How do I prevent this?

I've got a CSV, test.csv:

q,a,b,c,d,e,f
z,0.999211563,0.945548791,0.756781883,0.572315951,1.191243688,0.867855435

Here I load data:

import pandas as pd

df = pd.read_csv("test.csv")
print(df)

This outputs the following rounded figures:

   q         a         b         c         d         e         f
0  z  0.999212  0.945549  0.756782  0.572316  1.191244  0.867855

What I ultimately want to do is access values by position:

print(df.iloc[0, [1, 2, 3, 4, 5, 6]].tolist())

But this appends extra digits to some of the figures:

[0.999211563, 0.9455487909999999, 0.7567818829999999, 0.572315951, 1.191243688, 0.867855435]

Pandas is altering my data. How can I stop pandas from rounding, and adding numbers to figures?

Benjamin
  • There are two things at play here. First, when printing numbers, pandas does not show them at full machine precision; it rounds them to a set number of decimals. This affects only the display, not the stored number. Second, there is the case where "pandas" really does appear to change your data. This comes from how computers store floating-point numbers: in a binary floating-point format (IEEE 754) with a finite number of fractional bits, so decimal fractions cannot always be represented exactly. This is not unique to pandas, though. – JohanL Aug 23 '19 at 09:18
  • with respect to the comment by @JohanL : https://en.wikipedia.org/wiki/Floating-point_arithmetic#Accuracy_problems – hugovdberg Aug 23 '19 at 09:21
  • If you really want to preserve the values exactly as they are in the source CSV, treating floats as strings would be the shortest answer... – Grzegorz Skibinski Aug 23 '19 at 09:39
  • [use `float_precision='round_trip'` while reading the csv file into pandas.](https://stackoverflow.com/a/68027847/6484358) – Anu Jun 18 '21 at 00:16
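
The two effects described in the comments above can be separated with a small sketch. It inlines the contents of test.csv through `io.StringIO` (an assumption made only so the snippet is self-contained) and uses the hypothetical names `default_view`, `wide_view`, and `literal` for illustration. The first half shows that `display.precision` changes only the printed text; the second half uses a plain Python float, with no pandas involved, to show that the decimal value is not stored exactly in binary.

```python
import io
from decimal import Decimal

import pandas as pd

# Same content as test.csv from the question, inlined so the sketch runs on its own.
CSV_TEXT = (
    "q,a,b,c,d,e,f\n"
    "z,0.999211563,0.945548791,0.756781883,0.572315951,1.191243688,0.867855435\n"
)

df = pd.read_csv(io.StringIO(CSV_TEXT))

# 1) Display rounding: only the rendered text differs, the stored values do not.
default_view = df.to_string()
with pd.option_context("display.precision", 10):
    wide_view = df.to_string()
print(default_view)
print(wide_view)

# 2) Binary storage: a plain Python float already differs from the decimal text.
literal = 0.945548791
exact_binary_value = Decimal(literal)  # the exact value held in the 64-bit float
print(exact_binary_value)
print(repr(literal))  # shortest string that maps back to the same float
```

Python's `repr` prints the shortest decimal string that maps back to the same float, so a value parsed to the nearest float prints as written in the file, while a value that is off by one bit prints with a long tail such as `0.9455487909999999`.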

1 Answer

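import pandas as pd

# display.precision only changes how many decimals are printed, not the stored values
with pd.option_context('display.precision', 10):
    # float_precision=None selects pandas' default float parser
    df = pd.read_csv("test.csv", float_precision=None)
    print(df)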
Kostas Charitidis
  • This should not be accepted as an answer! The right answer was in the comment about using float_precision='round_trip'. – Qas Apr 06 '22 at 06:23
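
For reference, the `float_precision='round_trip'` suggestion from the comments could look roughly like the sketch below. It again inlines the CSV through `io.StringIO` so it runs without a file; that, and the names `row`, `df_text`, and `exact`, are assumptions for illustration. The second part sketches the other suggestion from the comments, reading the columns as strings, here combined with `decimal.Decimal` for cases where the exact decimal text matters.

```python
import io
from decimal import Decimal

import pandas as pd

# Same content as test.csv from the question, inlined so the sketch runs on its own.
CSV_TEXT = (
    "q,a,b,c,d,e,f\n"
    "z,0.999211563,0.945548791,0.756781883,0.572315951,1.191243688,0.867855435\n"
)

# Option 1: ask the C parser for its round-trip float converter, so each value
# becomes the float nearest to the text in the file.
df = pd.read_csv(io.StringIO(CSV_TEXT), float_precision="round_trip")
row = df.iloc[0, 1:].tolist()
print(row)

# Option 2: skip float parsing entirely and keep the raw text, converting to
# Decimal only where exact decimal arithmetic is needed.
df_text = pd.read_csv(io.StringIO(CSV_TEXT), dtype=str)
exact = [Decimal(text) for text in df_text.iloc[0, 1:]]
print(exact)
```

Option 1 keeps ordinary float columns, so numeric operations work as usual; option 2 preserves the file's digits exactly but leaves string columns that need explicit conversion before any arithmetic.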