4

I have 50 CSV files of 20,000 rows each, I allready joined them and then split them by date. My result is a list of dataframes that I want to write each dataframe of the list to one sheet in an output excel. I already try with:

with pd.ExcelWriter('output.xlsx') as writer:
        cont=0
        for x in List:
            x.to_excel(writer,sheet_name="csv_"+str(cont),index=False, engine='xlsxwriter',na_rep="NAN",startrow=1,startcol=1)
            cont+=1

But I get a memory error and it takes forever. So my question is, does anyone know how to write big dataframes to an excel sheet in a really fast way using python? Or I should use another language to write it?

I already have posted another question showing my error: Writing pandas dataframes to excel crash

jps
  • 20,041
  • 15
  • 75
  • 79
  • 1
    Related to: [Python: fastest way to write pandas DataFrame to Excel on multiple sheets](https://stackoverflow.com/questions/25863381/python-fastest-way-to-write-pandas-dataframe-to-excel-on-multiple-sheets) – DarrylG Jun 10 '20 at 19:19
  • You must store and run all 50 CSVs in a single operation? Can't you load, transform, and then export *per file* to reduce the memory footprint of your script? This question isn't really a matter of speed, but more an issue with memory usage I gather. – r.ook Jun 10 '20 at 20:02

1 Answers1

2

To fix your memory error, you have to increment your cont variable;

with pd.ExcelWriter('output.xlsx') as writer:
        cont = 0
        for x in List:
            x.to_excel(writer,sheet_name="csv_"+str(cont),index=False, engine='xlsxwriter',na_rep="NAN",startrow=1,startcol=1)
            cont += 1

A better syntax for this;

with pd.ExcelWriter('output.xlsx') as writer:
        for i, x in enumerate(List):
            x.to_excel(writer,sheet_name="csv_"+str(i),index=False, engine='xlsxwriter',na_rep="NAN",startrow=1,startcol=1)

And check if the file you want to output follows the Excel Specifications.

If all checks out and it is still too slow, you can use a faster excel module.

Sy Ker
  • 2,047
  • 1
  • 4
  • 20