I am trying to read an .xlsx file into a DataFrame:

itut_ir = pd.read_excel('C:\\Users\\Administrator\\Downloads\\reportdata.xlsx')

print(itut_ir.to_string())

I receive this:

Traceback (most recent call last):
  File "C:\Users\Administrator\eclipse-workspace\Reports\GOW\Report.py", line 44, in <module>
    df = pd.read_excel('C:\Users\Administrator\Downloads\reportdata.xlsx')
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\excel\_base.py", line 304, in read_excel
    io = ExcelFile(io, engine=engine)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\excel\_base.py", line 824, in __init__
    self._reader = self._engines[engine](self._io)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\excel\_xlrd.py", line 21, in __init__
    super().__init__(filepath_or_buffer)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\excel\_base.py", line 353, in __init__
    self.book = self.load_workbook(filepath_or_buffer)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\pandas\io\excel\_xlrd.py", line 36, in load_workbook
    return open_workbook(filepath_or_buffer)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\site-packages\xlrd\__init__.py", line 117, in open_workbook
    zf = zipfile.ZipFile(filename)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 1222, in __init__
    self._RealGetContents()
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37\lib\zipfile.py", line 1289, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file

Does anybody have an idea? The file does not seem to be broken; I can open it with Excel.

Thanks!

*** UPDATE *** The file producing the error is downloaded from FTP. Opening the original file works, if that gives you a hint :) Thanks

Ele
  • No need to use `open`; just pass the path to the file. Try: `pd.read_excel('C:\\Users\\Administrator\\Downloads\\reportdata.xlsx', sheet_name='Details', skiprows=4)` – Chris Adams Aug 26 '20 at 11:48
  • Sorry, I was misled by another error. After breaking the import down, I still get the zip error: `itut_ir = pd.read_excel('C:\\Users\\Administrator\\Downloads\\reportdata.xlsx')` – Ele Aug 26 '20 at 12:09
  • Is reportdata.xlsx the original file? According to [this question](https://stackoverflow.com/questions/33873423/xlsx-and-xlsm-files-return-badzipfile-file-is-not-a-zip-file), opening the file and saving it again may help. – Andre S. Aug 26 '20 at 12:21
  • Hi, the file is downloaded from an FTP server. I can open it in Excel. I'm not sure what you mean by opening and saving. Thanks – Ele Aug 26 '20 at 12:24
  • PS: the file is not password protected – Ele Aug 26 '20 at 12:27
  • Ahh, it looks like the file gets corrupted while downloading from FTP. Copying the file manually works! – Ele Aug 26 '20 at 12:29
  • `xlsx` is basically a zip file, so it's likely that the file is broken when downloaded. – Quang Hoang Aug 26 '20 at 12:55
  • Yeah, that was the case! Thanks – Ele Aug 27 '20 at 08:34
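
Since the comments conclude that the file was damaged during the FTP download, here is a minimal sketch of a download that requests binary transfer and then checks that the result is a zip archive before handing it to pandas. The thread never says how the file was fetched or why it broke; a text-mode (ASCII) transfer is one common cause, and that is an assumption here. The function names and all parameters (`host`, `user`, `password`, `remote_path`, `local_path`) are hypothetical placeholders.

```python
import zipfile
from ftplib import FTP


def download_binary(host, user, password, remote_path, local_path):
    """Fetch remote_path over FTP in binary mode and save it to local_path.

    All arguments are placeholders; nothing here comes from the question.
    """
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        # "wb" plus retrbinary keeps the bytes untouched; a text-mode
        # transfer may rewrite line endings and break the zip container.
        with open(local_path, "wb") as fh:
            ftp.retrbinary(f"RETR {remote_path}", fh.write)


def looks_like_xlsx(path):
    """Return True if the file at path is at least a readable zip archive."""
    return zipfile.is_zipfile(path)
```

Calling `looks_like_xlsx(local_path)` right after the download, and before `pd.read_excel`, should turn the long traceback into an early, explicit failure. It only confirms that the container is a zip archive, not that the workbook inside is valid.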

1 Answer

I recently had the same issue with an XLSX file that I created in LibreOffice.

The solution was to check the XLSX to make sure it wasn't corrupted. In my case, loading a previous version of the XLSX file corrected the problem.
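
One way to check an XLSX for corruption without opening it in a spreadsheet program is to inspect the zip container directly. The sketch below is an illustration rather than the method used in this answer; `xlsx_problem` is a hypothetical helper name. It relies on the fact that an .xlsx workbook is a zip archive containing a `[Content_Types].xml` entry.

```python
import zipfile


def xlsx_problem(path):
    """Return None if the container looks intact, else a short description."""
    if not zipfile.is_zipfile(path):
        return "not a zip archive"
    try:
        with zipfile.ZipFile(path) as zf:
            # testzip() reads every member and returns the first bad name.
            bad_member = zf.testzip()
            if bad_member is not None:
                return f"corrupt member: {bad_member}"
            # Every Office Open XML package carries this entry at the root.
            if "[Content_Types].xml" not in zf.namelist():
                return "missing [Content_Types].xml"
    except zipfile.BadZipFile as exc:
        return f"unreadable archive: {exc}"
    return None
```

A return value of `None` only means the archive structure is sound; the workbook XML inside could still be invalid, so falling back to an earlier copy of the file remains a reasonable fix.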

Xenoranger