13

Writing a pandas.DataFrame into an Excel Workbook in the .xlsx format is as simple as:

import pandas as pd
df = pd.DataFrame({'firstColumn' : [5, 2, 0, 10, 4], 'secondColumn' : [9, 8, 21, 3, 8]})
print(df)
df.to_excel('test.xlsx')

which gives:

   firstColumn  secondColumn
0            5             9
1            2             8
2            0            21
3           10             3
4            4             8

and the corresponding Excel file.

Is there also a possibility to write the DataFrame into a .xlsm Excel file? This is actually more or less the same as .xlsx, but with the possibility enabled to store VBA macros within the file. I need this because I want to insert and run a VBA macro after creating the file.

However, when trying this on a regular xlsx file, I get the following error message in a pop-up:

The following features cannot be saved in macro-free workbooks: VB project.
To save a file with these features, click No, and then choose a macro-enabled file type in the File Type list.
To continue saving as macro-free workbook, click Yes.

I can then manually choose to save the file as .xlsm which will have my macro included. However, I would prefer to do this automatically without the extra step.

The documentation for the to_excel method suggests that this should be possible (see engine parameter). However, I don't understand how to enable this.

When I simply change the output filename to *.xlsm, a .xlsx file is created which is named .xlsm. When I try to open it, I get

Excel cannot open the file 'myFilename.xlsm' because the file format or file extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file.

If I manually change the extension to .xlsx, I can open it again.

Regarding this part of the pandas documentation:

openpyxl: This includes stable support for OpenPyxl 1.6.1 up to but not including 2.0.0, and experimental support for OpenPyxl 2.0.0 and later.`

My version of Openpyxl is 1.8.6. Updating to 2.1.4 did not solve the problem. Neither did updating XlsxWriter from 0.63 to 0.6.6.

Using df.to_excel('test.xlsx', engine='openpyxl') as suggested also did not solve the problem.

jonrsharpe
  • 115,751
  • 26
  • 228
  • 437
Dirk
  • 9,381
  • 17
  • 70
  • 98
  • What happens if you just change the filename to `'test.xlsm'`? According to the [`ExcelWriter` source](https://github.com/pydata/pandas/blob/master/pandas/io/excel.py#L409), `engine` *"defaults to `io.excel..writer`"*. See also http://pandas.pydata.org/pandas-docs/stable/io.html?highlight=excelwriter#excel-writer-engines – jonrsharpe Jan 27 '15 at 11:52
  • 1
    Oh yes, I forgot to mention that... this creates a `xlsx` file which is named `xlsm`. When I try to open it, I get `Excel cannot open the file 'myFilename.xlsm' because the file format or file extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file.` If I manually change the extension to `xlsx`, I can open it again. – Dirk Jan 27 '15 at 11:57
  • Regarding `openpyxl: This includes stable support for OpenPyxl 1.6.1 up to but not including 2.0.0, and experimental support for OpenPyxl 2.0.0 and later.`: My version of Openpyxl is 1.8.6. – Dirk Jan 27 '15 at 11:59
  • 1
    Could you edit the question to include this and any other relevant information, please. Have you tried specifying `engine='openpyxl'`? – jonrsharpe Jan 27 '15 at 12:00
  • *"also did not solve the problem"* - nothing changed? Different error? What about `df.to_excel('test.xlsm', engine='openpyxl')`? Have you tried specifying `XlsxWriter`? Have you looked at the [options for Excel](http://pandas.pydata.org/pandas-docs/stable/options.html#list-of-options) (`openpxyl` is already the default for `'.xlsm'`)? – jonrsharpe Jan 27 '15 at 12:38
  • 1
    No, there was no different error - when naming the file `xlsm`, I get a `xlsx` file with wrong extension. Does it work for you by simply naming the file `xlsm` and pandas chooses the right engine? If not, then I guess that pandas' documentation is a bit too simplified on this, especially regarding jmcnamara's (more complicated) suggestion below which did the trick. – Dirk Jan 27 '15 at 13:28
  • 1
    I don't have `pandas` installed here, I'm just trying to encourage you to provide all appropriate troubleshooting information. Glad you've found a working answer, though. – jonrsharpe Jan 27 '15 at 13:30

1 Answers1

16

Pandas requires that a workbook name ends in .xls or .xlsx. It uses the extension to choose which Excel engine to use.

You could pass a temp name and then overwrite it with something like this:

import pandas as pd

df = pd.DataFrame({'First' : [5, 2, 0, 10, 4], 
                   'Second' : [9, 8, 21, 3, 8]})

writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')

df.to_excel(writer, sheet_name='Sheet1')

workbook  = writer.book
workbook.filename = 'test.xlsm'
# !! Won't load in Excel !!

writer.save()

This will create an Excel file with the a .xlsm extension.

However, due to a feature called "extension hardening" Excel won't open this file since it knows that it doesn't contain a macro and isn't actually an xlsm file. (That is the Excel error that you report above.)

You can workaround this with recent versions of XlsxWriter by extracting the VbaProject.bin macro file from a real xlsm file and inserting it into the new file:

import pandas as pd

df = pd.DataFrame({'First' : [5, 2, 0, 10, 4], 
                   'Second' : [9, 8, 21, 3, 8]})

writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')

df.to_excel(writer, sheet_name='Sheet1')

workbook  = writer.book
workbook.filename = 'test.xlsm'
workbook.add_vba_project('./vbaProject.bin')

writer.save()

See the Working with VBA Macros section of the XlsxWriter docs for more information.

jmcnamara
  • 38,196
  • 6
  • 90
  • 108
  • 2
    This is great! By the way, I could not get the `vba_extract.py` script running which you mentioned in the documentation - the source code looks like a HTML file and results in a SyntaxError? Or how is it supposed to be used? However, [as suggested here](http://superuser.com/questions/807926/how-to-bypass-the-vba-project-password-from-excel), one can also simply rename the `xlsm` file to `zip`, open it with any archiving tool and extract the `vbaProject.bin` file from the `xl` folder. Make sure that the `xlsm` file contains at least one macro, otherwise there will be no such file. – Dirk Jan 27 '15 at 13:34
  • 2
    When you install XlsxWriter your build system/packager should install the `vba_extract.py` utility into your path. If not it is also included in the examples directory of the repository. – jmcnamara Jan 27 '15 at 13:41
  • 6
    What if I am trying to write the dataframe into an existing worksheet (xlsm) with lots of macros already. Will it 'corrupt' my existing file? Will I lose my macros? – Toby Booth Apr 02 '18 at 10:29
  • For the people who want to save the new macro-enabled file at a different location, just give the complete path against filename. example: workbook.filename = '/path/of/your/choice/filename.xlsm' – YoungSheldon Jan 01 '20 at 13:32