
I'm looking for a way to decompress/unzip .xls files in Python. When I open an Excel file with 7-Zip, I can see the directories I'd like to extract.

I already tried renaming the Excel file to ".zip" and then extracting it:

import zipfile

myExcelFile = zipfile.ZipFile("myExcel.zip")
myExcelFile.extractall()

but it throws

zipfile.BadZipFile: File is not a zip file

[Screenshot: .xls in 7-Zip]

1 Answer

.xls files use the BIFF format. .xlsx files use Office Open XML, which is a zipped XML format. BIFF is not a zipped format; files using that format are not recognized by zip libraries. – shmee
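To check which kind of file you actually have before calling zipfile, you can look at its contents rather than its extension. This is a minimal sketch, assuming the legacy .xls file is stored in the usual OLE2 compound-file container (whose signature bytes are D0 CF 11 E0 A1 B1 1A E1), which is probably also why 7-Zip can still list directories in it. The helper name `detect_excel_container` is made up for illustration:

```python
import zipfile

# Signature of an OLE2 compound file, the container normally used by legacy .xls (BIFF) workbooks.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def detect_excel_container(path):
    """Return 'zip', 'ole2', or 'unknown' based on the file's contents (hypothetical helper)."""
    # .xlsx (Office Open XML) files are ordinary zip archives.
    if zipfile.is_zipfile(path):
        return "zip"
    # Otherwise compare the first 8 bytes with the OLE2 signature.
    with open(path, "rb") as fh:
        if fh.read(8) == OLE2_MAGIC:
            return "ole2"
    return "unknown"
```

If this returns "ole2", renaming the file to ".zip" will not help; zipfile will raise BadZipFile no matter what the extension is.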

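Converting the file to .xlsx is one solution. The code below automates Excel itself, so it needs Windows with Excel installed and the pywin32 package: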

import win32com.client as win32

fname = "full+path+to+xls_file"
excel = win32.gencache.EnsureDispatch('Excel.Application')
wb = excel.Workbooks.Open(fname)

# FileFormat=51 saves as .xlsx; FileFormat=56 would save as .xls
wb.SaveAs(fname + "x", FileFormat=51)
wb.Close()
excel.Application.Quit()
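Once the workbook has been saved as .xlsx, the zipfile approach from the question should work on it directly, without renaming it to ".zip". A rough sketch, where `extract_xlsx` and the paths in the usage line are only illustrative names:

```python
import zipfile
from pathlib import Path


def extract_xlsx(xlsx_path, dest_dir):
    """Extract every member of an .xlsx archive into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(xlsx_path) as archive:
        archive.extractall(dest)
        return archive.namelist()
```

For example, `extract_xlsx("myExcel.xlsx", "myExcel_extracted")` unpacks the XML parts into a folder. Note that these parts are the Office Open XML layout, so they will not match the directories 7-Zip shows for the original .xls.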