0

I'm trying to write a loop that opens 24 csv files, concatenates them into one dataframe, and writes one final csv file.

I tried the following, and everything works up to the point where I need to concatenate them...

#Filename
file = '160321-PCU'
fileout = file+'ALL.csv'

#For loop to read_csv 24 times - this works... this prints me the dfs
for i in range(1,25):
  filename = file+str(i)+'.csv'
  df = pd.read_csv(gdrive_url+filename, sep=';',
                   names=['Date','Time_Decimal','Parameter','Value'])

  #This is my attempt to concatenate the dfs...
  df_concat = pd.concat([df])

#But as soon as I execute the code below to write ONE csv file, it
#just gives me the 160321-PCU24 df... no concatenation...
df_concat.to_csv(gdrive_url_out+fileout, index=True)
martineau

3 Answers

0

The concat function needs to be fed a sequence of objects to be joined. Your code passes only one object: the dataframe from the most recently read file. The modified code below creates an empty dataframe and, on each iteration, concatenates the dataframe read from the next CSV file onto it:

import pandas as pd

data1 = ['2020', '4.05', 'Param1', 'Val1']
data2 = ['2021', '3.59', 'Param2', 'Val2']

with open('file_1.csv', 'w') as f:
    f.write(','.join(data1))
    
with open('file_2.csv', 'w') as f:
    f.write(','.join(data2))
    
fileout = 'file_3.csv'

df_concat = pd.DataFrame()

for i in range(1,3):
  filename = 'file_' + str(i) + '.csv'
  df = pd.read_csv(filename, sep=',',
                   names=['Date','Time_Decimal','Parameter','Value'])
  df_concat = pd.concat([df_concat, df])

df_concat.to_csv(fileout, index=True)
print(df_concat)
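
One detail worth checking when the result is written with index=True: by default pd.concat keeps each dataframe's own index labels, so the combined frame can carry repeated labels. The sketch below is only an illustration with made-up in-memory frames (df_a and df_b are hypothetical names, not part of the code above); it compares the default with ignore_index=True.

```python
import pandas as pd

# Two tiny frames standing in for the per-file dataframes (made-up values).
df_a = pd.DataFrame({"Parameter": ["Param1"], "Value": ["Val1"]})
df_b = pd.DataFrame({"Parameter": ["Param2"], "Value": ["Val2"]})

# Default: each frame keeps its own index labels, so labels repeat.
kept = pd.concat([df_a, df_b])

# ignore_index=True discards the old labels and renumbers from 0.
renumbered = pd.concat([df_a, df_b], ignore_index=True)

print(list(kept.index))
print(list(renumbered.index))
```

Whether repeated labels matter depends on how the output file is used afterwards; if the index column is not needed at all, to_csv(..., index=False) leaves it out.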
TBaggins
  • Hi @TBaggins, thanks, I tried and the output didn't seem to be correct. Basically instead of having one row at timestamp 12:00 AM, it has 24 lines of 12:00AM, etc... – Psychefelic Mar 25 '21 at 06:31
  • See edited answer, which includes the contents of the csv files. – TBaggins Mar 26 '21 at 11:48
0

I have changed your code slightly. I can't replicate your issue, but let's hope the below works:

#Filename
file = '160321-PCU'
fileout = file+'ALL.csv'

# Empty list to put in the dfs
li = []

#For loop to read_csv 24 times - this works... this prints me the dfs
for i in range(1,25):
  filename = file+str(i)+'.csv'
  df = pd.read_csv(gdrive_url+filename, sep=';',
                   names=['Date','Time_Decimal','Parameter','Value'])
  li.append(df) # add it to the list of dataframes

all_dfs = pd.concat(li, axis=0, ignore_index=True)  # concat all dataframes imported

And then you can export them in your folder:

all_dfs.to_csv(gdrive_url_out+fileout, index=True)
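
For anyone who wants to try the list-then-concat pattern without the Google Drive paths, here is a minimal self-contained sketch. It assumes three small semicolon-separated files written to a temporary directory; the file names (demo1.csv and so on) and the data are placeholders, not the asker's real files. It also writes with index=False, which is a choice made for this sketch rather than something the code above does.

```python
import os
import tempfile

import pandas as pd

columns = ["Date", "Time_Decimal", "Parameter", "Value"]

with tempfile.TemporaryDirectory() as tmp_dir:
    # Write three small semicolon-separated files (placeholder data).
    for i in range(1, 4):
        path = os.path.join(tmp_dir, f"demo{i}.csv")
        with open(path, "w") as f:
            f.write(f"2021-03-16;{i}.0;Param{i};{i * 10}\n")

    # Read every file into a list, then concatenate once after the loop.
    frames = []
    for i in range(1, 4):
        path = os.path.join(tmp_dir, f"demo{i}.csv")
        frames.append(pd.read_csv(path, sep=";", names=columns))
    combined = pd.concat(frames, ignore_index=True)

    # Write the combined frame and read it back to check the row count.
    out_path = os.path.join(tmp_dir, "demoALL.csv")
    combined.to_csv(out_path, index=False)
    written = pd.read_csv(out_path)

print(combined.shape)
```

Calling pd.concat once on the full list avoids re-copying the accumulated rows on every iteration, which starts to matter as the number of files grows.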

Have a look also here for a similar issue.

sophocles
0

Instead of

df_concat = pd.concat([df])

use something like:

if 'df_concat' in locals():
    df_concat = pd.concat([df_concat, df])
else:
    df_concat = df

Alternatively, in your case you may use:

if i == 1:
    df_concat = df
else:
    df_concat = pd.concat([df_concat, df])
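
As a quick sanity check, the first-iteration test can be run on in-memory frames and compared with a single pd.concat call over the whole list. This is only a sketch: the frames list and its values are made up for illustration and stand in for the dataframes read from the csv files.

```python
import pandas as pd

# Three made-up frames standing in for the dataframes read in the loop.
frames = [pd.DataFrame({"Value": [i, i + 10]}) for i in range(1, 4)]

# Accumulate with the first-iteration check.
for i, df in enumerate(frames, start=1):
    if i == 1:
        df_concat = df
    else:
        df_concat = pd.concat([df_concat, df])

# Same input concatenated in one call, for comparison.
single_call = pd.concat(frames)

print(df_concat.equals(single_call))
print(len(df_concat))
```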
Thomas R