I am trying to combine multiple CSV files. I followed the linked question, "How to merge 200 csv files in Python":

import pandas as pd
combined_csv = pd.concat( [ pd.read_csv(f) for f in filenames ] )
combined_csv.to_csv( "combined_csv.csv", index=False )

But my values changed from

49.108,55.738,30.106
41.681,54.896,32.99

to

49.108000000000004,55.738,30.105999999999998
41.681000000000004,54.896,32.99

How can I prevent this?

Thanks in advance

Bill Huang
LIsa
  • 1
    If you're just concatenating, you don't need to read them all into memory. See [**`fileinput.input`**](https://docs.python.org/3/library/fileinput.html#fileinput.input) – Peter Wood Oct 26 '20 at 18:44
  • 1
    Does this answer your question? [float64 with pandas to\_csv](https://stackoverflow.com/questions/12877189/float64-with-pandas-to-csv) – Pranav Hosangadi Oct 26 '20 at 19:07
  • What bothers me is that the value obtained from the database or the web page has two decimal places, and after saving to csv, some values will be followed by a lot of 0. Have you found a good solution? I don't want to use formatting – David Wei Feb 21 '23 at 03:46

3 Answers

2

Actually, the code is doing what you asked it to. The problem is that read_csv parses the values as binary floating-point numbers, which cannot represent most decimal fractions exactly, so small deviations can appear when the values are written back out.
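To make the effect concrete, here is a minimal sketch that uses only the standard library. The two literals are taken from the question; the variable names are illustrative and not part of the original code.

```python
from decimal import Decimal

exact = float("49.108")                # nearest double to the decimal 49.108
drifted = float("49.108000000000004")  # the value that appeared in the output

# The two strings parse to different doubles that are extremely close together.
print(exact == drifted)            # False
print(Decimal(exact))              # exact value stored for 49.108
print(Decimal(drifted))            # exact value stored for the drifted number

# Formatting or rounding to three decimals hides the difference again.
print("%.3f" % drifted)            # 49.108
print(round(drifted, 3) == exact)  # True
```

Neither double equals 49.108 exactly; the parser simply landed on a neighbouring value, and writing that value at full precision exposes the extra digits.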

In this case, passing float_format to to_csv should give you what you need:

combined_csv.to_csv( "combined_csv.csv", index=False, float_format='%.3f')
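As a hedged illustration of the trade-off, the sketch below compares float_format with read_csv's float_precision="round_trip" option, which asks the C parser to use a round-trip float converter. The sample data is copied from the question; the variable names are assumptions made for this example.

```python
from io import StringIO

import pandas as pd

# Sample rows from the question, with an illustrative header row added.
data = "a,b,c\n49.108,55.738,30.106\n41.681,54.896,32.99\n"

# Parse with the round-trip converter so each float matches its text exactly.
df = pd.read_csv(StringIO(data), float_precision="round_trip")

# Written without a format, the values come back as they were read.
round_tripped = df.to_csv(index=False)
print(round_tripped)

# float_format forces three decimals everywhere, so 32.99 becomes 32.990.
fixed = df.to_csv(index=False, float_format="%.3f")
print(fixed)
```

Note that float_format pads every value to the same number of decimals, which may or may not be what you want.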
1

You can combine glob, a standard-library module for finding files by name pattern, with pandas to organize this data better. glob does not open files or use regular expressions; it matches shell-style wildcards such as * and returns the list of matching filenames:

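import glob
import pandas as pd

# Collect every path that matches the wildcard pattern
files = glob.glob("file*.csv")

# Read each file into its own DataFrame
df_list = []
for filename in files:
    data = pd.read_csv(filename)
    df_list.append(data)

# Stack the DataFrames into one
df = pd.concat(df_list)

print(files)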
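If the files only need to be concatenated, as Peter Wood's comment suggests, one option is to skip pandas and copy the rows as text, so no value is ever converted to a float. The following is a minimal sketch, assuming every file shares the same header row; concat_csv_text is a hypothetical helper name, not something from the original posts.

```python
import csv


def concat_csv_text(paths, out_path):
    """Concatenate CSV files as text, keeping only the first file's header."""
    header_written = False
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out, lineterminator="\n")
        for path in paths:
            with open(path, newline="") as src:
                reader = csv.reader(src)
                header = next(reader, None)  # first row of this file
                if header is None:
                    continue                 # skip empty files
                if not header_written:
                    writer.writerow(header)
                    header_written = True
                writer.writerows(reader)     # fields are copied as strings


# Example call (paths are illustrative):
# concat_csv_text(sorted(glob.glob("file*.csv")), "combined_csv.csv")
```

Because the fields stay strings, 49.108 is written back as 49.108. Quoting may be normalised by the csv writer, but the numeric text itself is untouched.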
Dejene T.
1

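Alternatives to the float_format approach from the previous answer: round the DataFrame before writing, or parse the columns as Decimal so the values are never converted to binary floats: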

import pandas as pd
from decimal import Decimal
from io import StringIO
import sys

data = '''\
a,b,c,d,e,f
49.108,55.738,30.106,41.681,54.896,32.99
94.107,55.739,3.105,41.671,45.897,23.98
'''

# Baseline: default parsing, which may show the extra digits
f = StringIO(data)
df = pd.read_csv(f)
df.to_csv(sys.stdout, index=False)

# Option 1: round to a fixed number of decimals before writing
df = df.round(decimals=4)
df.to_csv(sys.stdout, index=False)

# Option 2: parse every column as Decimal instead of float
f.seek(0)
converters = {k: Decimal for k in 'abcdef'}
df = pd.read_csv(f, converters=converters)
df.to_csv(sys.stdout, index=False)
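In the same spirit as the Decimal converters, another option is to read every column as text with dtype=str, which suits cases where the data is only being merged and no arithmetic is needed. This is a small sketch with sample values from the question; the variable names are illustrative.

```python
from io import StringIO

import pandas as pd

# Sample rows from the question, with an illustrative header row added.
data = "a,b,c\n49.108,55.738,30.106\n41.681,54.896,32.99\n"

# dtype=str keeps every field as the exact text found in the file.
df_text = pd.read_csv(StringIO(data), dtype=str)

out = df_text.to_csv(index=False)
print(out)
```

The columns are strings here, so they would need converting before any numeric work.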