1

I am reading numbers from a csv file into a pandas dataframe. When the numbers I am reading are approximately >1E12, pandas will approximate the number to 3 significant figures. For example, 9215069800000 will be approximated to 9.22E12 (i.e., stop pandas changing 9215069800000 to 9220000000000).

This may not seem like a significant issue, but these small differences, at times, lead to significant issues.

I have tried float_precision='round_trip' but this only accepts decimal places.

I have also tried pd.set_option('display.float_format', lambda x: '%.5f' % x) but this too does not work.

Any help would be greatly appreciated.

Mark Kelly
  • 11
  • 2
  • i think it is already answered here https://stackoverflow.com/questions/56820/round-doesnt-seem-to-be-rounding-properly – Youssef May 06 '22 at 09:26
  • Unfortunately not. The question you have linked is doing the inverse - rounding a number. I am attempting to stop pandas from changing 9215069800000 to 9220000000000 – Mark Kelly May 06 '22 at 09:32
  • How do you check the final value? if on excel, - maybe excel rounds it? – NoobVB May 06 '22 at 13:45
  • I read the number from excel and then use it to write a new file. But when i print the dataframe, or even just the value, it outputs it with the rounding/approximation. – Mark Kelly May 07 '22 at 17:58

2 Answers2

0

Just set float_precision=None parameter to none when reading the csv

import pandas as pd

with pd.option_context('display.precision', 10):
    df = pd.read_csv("test.csv", float_precision=None)
    print(df)
Youssef
  • 1
  • 1
  • Hi Youssef, thanks for taking the time to answer. Unfortunately this does not fix the issue. For example, if i read in test.csv which contains rows of numbers. If the first number is `220766983583.223`, using your fix above, pandas turns the number to `220767000000.0` – Mark Kelly May 07 '22 at 17:57
0

I know this is an old question, but I spent the afternoon yesterday struggling with the same problem.

Turns out, the issue wasn't Pandas, but Excel, which I verified by checking the columns before export, opening the csv in Notepad after export, and checking it again after opening it in Excel. The values weren't changed until after it was saved in Excel.

You can use the below test code to validate. If you change the value in the values list to an int, Excel will round both of them off differently when you open and save. If it's a string, it will be unchanged when opened in an xlsx file. This may vary depending on versions of Excel.

import pandas as pd

values = ['1610202324578508800']
columns = ['values']

test_df = pd.DataFrame(values, columns=columns).astype({'values':object})

test_df.to_csv('test df.csv')
test_df.to_excel('test df.xlsx')