1

I am using the following code:

import os
import numpy as np
import pandas as pd
from openpyxl import load_workbook

def dump2ExcelTest(df, fname, sheetNameIn='Sheet1'):
    if os.path.exists(fname):
        writer = pd.ExcelWriter(fname, engine='openpyxl', mode='a')
        book = load_workbook(fname)
        writer.book = book
    else:
        writer = pd.ExcelWriter(fname, engine='openpyxl', mode='w')

    df.to_excel(writer, sheet_name = sheetNameIn)
    writer.save()
    writer.close()

x1 = np.random.randn(100, 2)
df1 = pd.DataFrame(x1)
dump2ExcelTest(df1, r'Y:\summary\test3.xlsx')

On trying to open test3.xlsx I get the following warning window:

enter image description here

However, if I just do df1.to_excel(r'Y:\summary\test3.xlsx') then test3.xlsx opens fine.

I am not sure what to do about this as there is nothing in the log file.

Josh
  • 2,767
  • 1
  • 27
  • 31
Zanam
  • 4,607
  • 13
  • 67
  • 143

1 Answers1

1

I believe the way the ExcelWriter opens the file and tracks existing workbook contents is the problem. I'm not sure exactly what is going on under the hood but you have to both

  • specify the proper startrow for append
  • copy sheet information to the writer

I've used a contextmanager in Python for a little cleaner syntax. This is your example but properly writing and appending as you desire.

import os
import numpy as np
import pandas as pd
from openpyxl import load_workbook

def dump2ExcelTest(df, fname, sheetNameIn='Sheet1'):
    if os.path.exists(fname) is False:
        df.to_excel(fname, engine='openpyxl')

    start_row = 0
    with pd.ExcelWriter(fname, engine='openpyxl', mode='a') as writer:
        writer.book = load_workbook(fname)

        if sheetNameIn not in writer.book.sheetnames:
            raise ValueError(f"sheet {sheetNameIn} not in workbook")

        # grab the proper start row and copy existing sheets to new writer
        start_row = writer.book[sheetNameIn].max_row
        writer.sheets = {ws.title:ws for ws in writer.book.worksheets}

        df.to_excel(writer, sheetNameIn, startrow=start_row, header=False)

x1 = np.random.randn(100, 2)
df1 = pd.DataFrame(x1)

dump2ExcelTest(df1, "test3.xlsx")

More details and similar question here

Josh
  • 2,767
  • 1
  • 27
  • 31