48

I'm trying to open both an xlsx file and an xlsm file both give me the same error

badzipfile: file is not a zip file

here is what I'm typing:

import openpyxl
wb=openpyxl.load_workbook('c:\\users\\me\\documents\\filename.xlsm', keep_vba=True)
wb2=openpyxl.load_workbook('c:\\users\\me\\documents\\filename2.xlsx')

both load_workbook commands result in the same error. They both exist at that location. why am I getting this error?

Mustafa
  • 977
  • 3
  • 12
  • 25
zw1ck
  • 493
  • 1
  • 5
  • 10
  • `They both exist at that location.` But are they zip files? – Pekka Nov 23 '15 at 14:29
  • Does the user running the code have access to files at those locations. Also check whether the files are zipfile using the `zipfile` library. – Charlie Clark Nov 23 '15 at 15:15
  • I'm stuck with this problem in Ubuntu server. Works fine on windows, but when I deploy to my server, I got this. I've tried the answers of crussell, ericksonla and Manish Chaudhary without success. My .xlsx file is not password protected. – C. S. F. Junior May 07 '20 at 13:06
  • @Pekka Excel files are zip files – Steven Oct 08 '21 at 01:30

15 Answers15

21

The same problem occurred to me, and then I noticed the following:

When I created the .xlsx file from file manager, by creating new document with .xlsx format, I had the same error. But when I created the file with some spreadsheet software (libreoffice-calc) in my case, it just worked fine. I hope this may help.

20

If you call openpyxl.load_workbook with file-like object, be sure to open it in binary mode.

Denis Eliseev
  • 491
  • 4
  • 8
19

It's because you created empty .xlsx with no metadata which is an empty file with no cells formatting. use exel or equivalent spreadsheet software to save empty file in that directory, which will create an xlsx file with cell formating, after saving try to load file using openpyxl

7

I did something really stupid and got the same error. Basically, today was my first time trying this and I got it to work with the 'Automate' example and then tried my Excel. Didn't work! Took me a while to realize the error was due to having workbook password protected. The error doesn't match that at all, but when I removed the protection from the workbook, it worked! What can I say but 'duh' and 'yeah!'?

will johnson
  • 87
  • 1
  • 1
6

The XLSX or XLS or XLSM files you are trying to open are excel recovery files start with "~". you can check by:

for file in path.glob('*.xlsx'):print(file)

you can skip those files by checking,get filename from full path:

filename=str(filename).split("\\")[-1:][0]

checking if the filename starts with "~" as all recovery files will start with "~"

if filename[0]!="~"
5

I'm late to the party, but I received the same error, but for different reasons.

On my mac, when I cut and pasted the .xlsx file into the directory I desired, it didn't actually place the file itself, it placed a symbolic link. That, for some reason, would raise the same error. I remedied this by opening the file, and using "save as".

BryanE
  • 51
  • 1
  • 1
4

The xlsx file may have password and you are not granted to access. Delete the password and you can use openpyxl.load_workbook() to access it.

lucida lee
  • 49
  • 1
  • 1
  • Hi lucida lee, welcome at StackOverflow! Your answer might already have been given by [will johnson|https://stackoverflow.com/a/50475794/5488275] . What you can do, is upvote his answer and, in this case, edit his answer (since it can be improved). That is better than adding a duplicate answer :) – Nander Speerstra Mar 20 '20 at 08:58
3

My hunch is that either you openpyxl version is not the latest (2.3.1) or that there is a problem with your source file. To upgrade to the newest version to openpyxl, use:

pip install openpyxl --upgrade

Is the source file encrypted at all?

crussell
  • 179
  • 4
3

I had the same problem. It is because the file I was trying to open in Python was already running. You need to check if the file you are trying to open in Python is not running in tasks.

BenT
  • 3,172
  • 3
  • 18
  • 38
2

This also happened to me when I pointed openpyxl to an alias of the file instead of the original file.

ericksonla
  • 1,247
  • 3
  • 18
  • 34
1

I had the same issue and thank you all for your hints. My problem was that the excel .xlsx file I was opening was corrupted. Hence openpyxl couldn't open. I had to recreate the file.

Celine
  • 46
  • 2
0

I have been facing this error for a while and so i just uninstalled openpyxl and reinstall version 2.6.3 and it worked well. This might help you too, no need to change anything just run these commands using pip

pip uninstall openpyxl

pip install openpyxl==2.6.3

Hope it helps you.

Manish Chaudhary
  • 498
  • 6
  • 14
  • I've tried this one without success. On windows 10 my project works fine. But when I deploy to Ubuntu 18.04 server, I got the same error message. Is there another version that could solve this? I've tried 2.6.3 and the last one 3.0.3 – C. S. F. Junior May 07 '20 at 13:14
  • there is no other versions in my knowledge till now, but now try using xlwings, i find it quite workable – Manish Chaudhary May 08 '20 at 05:53
  • It's only for MAC or Windows. I got this when I tried to install: ```OSError: xlwings requires an installation of Excel and therefore only works on Windows and macOS. To enable the installation on Linux nevertheless, do: export INSTALL_ON_LINUX=1; pip install xlwings``` – C. S. F. Junior May 08 '20 at 13:20
  • might be because you are using linux system and does not have Excel – Manish Chaudhary May 16 '20 at 11:27
0

In my case I received this error because I was reading a very large file that I had just downloaded. I copied it to another location before it had finished downloading and so got the error when trying to read an incomplete/corrupt file.

0

In my case the file I accessed had been previously opened and not properly closed by the Workbook.save() method. Because I was debugging my code and failed on this command due to an incorrect path the file became corrupt. I had to delete the file since it could not be restored and create a new one.

CodeSocke
  • 91
  • 1
  • 7
0

I had the same problem when i directly changed CSV to XLSX in the file directory. Even though you create the new XLSX it does not include the required meta data.

Manual Suggestion is to open the CSV and export to XLSX within excel itself. If the file is to large to open in Excel itself you can use the below commands.

import pandas as pd
read_file = pd.read_csv (r'FY2021_All_Contracts_Full_20220510_3.csv', low_memory = False)
read_file.to_excel (r'FY2021_All_Contracts_Full_20220510_3', index = None, header=True)

This will programatically convert it to Excel if it is to large to be opened/ you don't want to do it manually.

TrapLordOb
  • 25
  • 4