1

Currently, I am exporting two data frames to a single excel file. Each data frame is placed on separate sheets. I am currently running a simulation and need 2,000 columns of each data frame (2,000 on each sheet). Every time I run my code, it creates a new excel file (what I originally intended). But to save time, I was wondering if it would be possible to write onto an existing excel file by adding new columns of data without writing over the existing data?

######### This is my original code ################
import pandas as pd
from pandas import ExcelWriter
from pandas import ExcelFile
from pandas import DataFrame
from openpyxl import Workbook

df1 = pd.DataFrame(returns1)
df2 = pd.DataFrame(returns2)


x1 = 'mypath/ '
x2 = datetime.datetime.now().strftime('%B %d, %Y %H-%M-%S')
x3 = '.xlsx'
destination = x1 + x2 + x3
writer = pd.ExcelWriter(destination, engine='xlsxwriter')

df1.to_excel(writer, sheet_name= 'Returns 1', index=False)
df2.to_excel(writer, sheet_name= 'Returns 2', index=False)
writer.save()
### Update

Code works! Thank you for everyone that helped, especially @zhqiat. Below is the final code. Does exactly what I wanted. I hope this will help others that run into the same issue I did.

df1 = pd.DataFrame(returns1)
df2 = pd.DataFrame(returns2)

book = load_workbook('mypath.xlsx')
writer = pd.ExcelWriter('mypath.xlsx', engine='openpyxl')
writer.book = book
writer.sheets = {ws.title: ws for ws in book.worksheets}

df1.to_excel(writer, sheet_name='Returns1', startrow=0, startcol=writer.sheets['Returns1'].max_column, index=False)
df2.to_excel(writer, sheet_name='Returns2', startrow=0, startcol=writer.sheets['Returns2'].max_column, index=False)

This is what I wanted.
 A
100
120
119
225

Second Time
 A     B
100    98
120   100
119   105
125   111

Third Time
 A     B     C
100    98    106 
120   100    99
119   105    101
125   111    89

and so on...

quinnvdk
  • 13
  • 1
  • 6
  • 1
    `Please also forgive me as I am fairly new to python.` You wrote a good question, well-formatted and precise manner. Be with StackOverflow code of conduct and no need of apologies. – Kiran Maniya Sep 11 '19 at 19:00
  • 1
    Why don't you just read the excel file and store it as a dataframe and append the series and write to the same excel sheet? – aunsid Sep 12 '19 at 20:23
  • @aunsid could you provide an example on how I could do that? – quinnvdk Sep 12 '19 at 20:33
  • Just use the openpyxl function for converting a dataframe to rows. – Charlie Clark Sep 13 '19 at 10:27
  • You're looping over the sheets in the workbook without it being necessary and without a dynamic loop. Inside your `for sheetname in writer.sheets:` loop, you are writing the output of df1 and df2 to the sheets. But for each sheet in the workbook, you execute the same commands (without any change) and this is how your output ends up getting duplicated. – zglin Sep 15 '19 at 19:33

2 Answers2

1

Unfortunately appending to excel isn't a fully fledged feature in pandas.

In your case for a hacked together solution, you can use an excel writer object to stitch the sheets together.

It sounds like you want to add columns for all new data (not row) so you would likely need to determine the width of your spreadsheet with

maxcol = writer.sheets['SheetName'].max_column

** Edit, my bad, it is max_column not max_col **

Refer to a prior question from 2017 for the full code on how to add rows.

zglin
  • 2,891
  • 2
  • 15
  • 26
  • Wow!Thank you, that actually worked. However, I ran into another issue. I have explained that issue in the original post under UPDATE. Do you know why it might be duplicating the output? – quinnvdk Sep 12 '19 at 21:52
0

read the excel file and store it as a dataframe and append the series and write to the same excel sheet

# read wherever you have stored the file

prev_df = pd.read_excel('path to file.xlsx')

# convert new series to df
new_df =  pd.DataFrame(var1)

# join
df_to_write = prev_df.join(new_df)

# write to excel
aunsid
  • 397
  • 2
  • 10