3

Getting trouble with either of errors ; writer.book=book AttributeError: can't set attribute 'book' or BadZipFile

for code not giving badzipfile error I put code line which writes excel file first, dataOutput=pd.DataFrame(dictDataOutput,index=[0]) but, even though I cannot get rid of writer.book = book AttributeError: can't set attribute 'book' As one of SO answer suggests I need to bring back openpyxl to previous versions, or to work with CSV file not excel. I think that's not the solution. There should be solution which I could not get in

dataOutput=pd.DataFrame(dictDataOutput,index=[0])
dataOutput.to_excel('output.xlsx') 'output.xlsm'
book = load_workbook('output.xlsx') 'output.xlsm'
writer = pd.ExcelWriter('output.xlsx')OR'output.xlsm'#,engine='openpyxl',mode='a',if_sheet_exists='overlay')
writer.book = book
writer.sheets = {ws.title: ws for ws in book.worksheets}

for sheetname in writer.sheets:
    dataOutput.to_excel(writer,sheet_name=sheetname, startrow=writer.sheets[sheetname].max_row, index = False,header= False)

writer.save()

I looked for an answer in enter link description here and in detailed solution of attributeError in enter link description here

---I tried another way

with pd.ExcelWriter('output.xlsx', mode='a',if_sheet_exists='overlay') as writer:
    dataOutput.to_excel(writer, sheet_name='Sheet1')
    writer.save()

But this time gave another error

FutureWarning: save is not part of the public API, usage can give in unexpected results and will be removed in a future version

writer.save()

after @Andrew's comments I chaned my code to this way;

with pd.ExcelWriter('outputData.xlsm', engine='openpyxl', mode='a', if_sheet_exists='overlay') as writer:
    book = load_workbook('outputData.xlsm', keep_vba=True)

    writer.book = book
    writer.sheets = {ws.title: ws for ws in book.worksheets}
    current_sheet = book['Sheet1']
    Column_A = current_sheet['A']
    maxrow = max(c.row for c in Column_A if c.value is not None)

    for sheetname in writer.sheets:
        AllDataOutput.to_excel(writer, sheet_name=sheetname, startrow=maxrow, index=False, header=False)
Vitalizzare
  • 4,496
  • 7
  • 13
  • 32
xlmaster
  • 659
  • 7
  • 23

1 Answers1

7

There are a few things that are unclear about your question, but I'll do my best to try and answer what I think you're trying to do.

First off, things that won't work:

  1. From the to_excel docs, "If you wish to write to more than one sheet in the workbook, it is necessary to specify an ExcelWriter object", so when you run dataOutput.to_excel('output.xslx'), you're creating an excel workbook with only a single sheet. Later, it seems like you're trying to write multiple sheets, but you'll only be able to write a single sheet since that's all that's in the workbook to begin with.
  2. From the ExcelWriter docs, .save() is deprecated and .close() should be used instead. But it's recommended to use a context manager (with ... as ... :) so you can entirely avoid having to do that.
  3. By default ExcelWriter uses w (write) mode, so the BadZipFile error that you were running into earlier was likely related to the order that things were run. If you had written a book with to_excel, then loaded it with openpyxl, then used ExcelWriter, you'd write, load, then immediately overwrite the file with a blank one. If you then tried loading the book again without recreating it with to_excel, you'd be trying to load an empty file, which is why that error was thrown.

The answers that you linked to in your question seem to use a much older version of pandas with much different behavior surrounding interaction with Excel files. The latest versions seem to simplify things quite a bit!

As an example solution, I'm going to assume you have two DataFrames called df1 and df2 that you'd like to write to Sheet_1 and Sheet_2 in output.xlsx, and you'd like to update the data in those sheets (using the overlay option) with additional DataFrames called df3 and df4. To confirm, I have openpyxl = 3.0.10 and pandas = 1.5.0 installed.

As you have things written currently, there aren't any updates in dataOutput that would be reflected in output.xslx, but I'm going to create a dictionary called df_dict that can hold the DataFrames and be updated based on the sheets they should be written to. I'm sure you can adjust to suit your needs using whatever data structure you prefer!

df1 = pd.DataFrame(data = {'A': ['a', 'b', 'c'], 'B': ['g','h','i']})
df2 = pd.DataFrame(data = {'C': ['o', 'p', 'q'], 'D': ['u','v','w']})

df_dict = {'Sheet_1': df1, 'Sheet_2': df2}

with pd.ExcelWriter('output.xlsx') as writer:
    for sheetname, dfname in df_dict.items():
        dfname.to_excel(writer, sheet_name = sheetname)

# Updated data is put in DataFrames
df3 = pd.DataFrame(data = {'A': ['a', 'b', 'c', 'd', 'e', 'f'], 'B': ['g', 'h', 'i', 'j', 'k', 'l']})
df4 = pd.DataFrame(data = {'C': ['o', 'p', 'q', 'r', 's', 't'], 'D': ['u', 'v', 'w', 'x', 'y', 'z']})

# df_dict is updated with the new DataFrames
df_dict = {`Sheet_1`: df3, 'Sheet_2': df4}

# This block now uses the name of the sheet from the workbook itself to get the updated data.
# You could also just run the same block as above and it would work.
with pd.ExcelWriter('output.xlsx', mode = 'a', if_sheet_exists = 'overlay') as writer:
    for sheet in writer.sheets:
        df_dict[sheet].to_excel(writer, sheet_name = sheet)
Andrew
  • 789
  • 1
  • 6
  • I have only one sheet named"Sheet1" and trying to append data to it. Later, I will bring pandas 1.5.0 version and give a try to your code, and let you know. About close() I did not know, maybe that's gibing on 'excel file corrupted' sometimes when vba button on file is moved. I will try and let you know. Thanks, I f you need to edit your code based on my comment, you can do it, because later I will read it many times to get into – xlmaster Oct 19 '22 at 05:46
  • after close() my excel file seems deactivate macro code. I cannot run it after running script – xlmaster Oct 19 '22 at 06:42
  • Are you sure with'From the ExcelWriter docs, .save() is deprecated and .close() should be used instead.' it corrupts xlsm file – xlmaster Oct 19 '22 at 19:43
  • That's what it says in the docs, so I'd assume that's the case. Perhaps it still works though. If you set things up like I've done in my answer, you don't need to save or close, as the file will do so automatically when it exits the context manager. Are you still trying to do it a different way? – Andrew Oct 19 '22 at 22:42
  • 1
    From the `ExcelWriter` docs again, you can pass arguments to the engine via the `engine_kwargs` argument. `ExcelWriter` uses the `openpyxl.reader.excel.load_workbook` method when `openpyxl` is the engine used. From [`those docs`](https://openpyxl.readthedocs.io/en/latest/api/openpyxl.reader.excel.html), `keep_vba` is an argument, so adding `engine_kwargs = {keep_vba: True}` (and make sure that you have `engine = 'openpyxl'` there as well) to your `pd.ExcelWriter()` arguments (for any instance that you use `ExcelWriter`) and it should work. – Andrew Oct 19 '22 at 22:50