8

I'm trying to use pandas to parse an .xlsm document. My code worked perfectly with the example file I was given, but once I received the rest of the documents, it failed with the error below. Here's the offending stack trace:

Traceback (most recent call last):
  File "@@@@@@@@/UnsupervisedCAM.py", line 9, in <module>
    info_dict = read_excel_to_dict('files/' + filename)
  File "@@@@@@@@\readCAM.py", line 7, in read_excel_to_dict
    df = pandas.read_excel(filename, parse_cols='E,G,I,K,Q,O')
  File "@@@@@@@@\Anaconda3\envs\tensorflow\lib\site-packages\pandas\io\excel.py", line 191, in read_excel
    io = ExcelFile(io, engine=engine)
  File "@@@@@@@@\Anaconda3\envs\tensorflow\lib\site-packages\pandas\io\excel.py", line 249, in __init__
    self.book = xlrd.open_workbook(io)
  File "@@@@@@@@\Anaconda3\envs\tensorflow\lib\site-packages\xlrd\__init__.py", line 441, in open_workbook
    ragged_rows=ragged_rows,
  File "@@@@@@@@\Anaconda3\envs\tensorflow\lib\site-packages\xlrd\book.py", line 87, in open_workbook_xls
    ragged_rows=ragged_rows,
  File "@@@@@@@@\Anaconda3\envs\tensorflow\lib\site-packages\xlrd\book.py", line 595, in biff2_8_load
    raise XLRDError("Can't find workbook in OLE2 compound document")
xlrd.biffh.XLRDError: Can't find workbook in OLE2 compound document

I'm not even sure where to start... Haven't found anything of use online.

bendl

3 Answers

5

I got the same error message and solved it by removing the password protection from the .xlsx file. (I'm not saying that's the only cause of the error, but it's worth checking!)
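
One way to check whether a given file is affected before handing it to pandas is to look at its first bytes. A regular .xlsx/.xlsm file is a ZIP archive (it starts with `PK`), while a password-protected workbook is wrapped in an OLE2 compound document, which is probably why xlrd tries to read it as a legacy .xls file and then can't find a workbook inside. The following is only a minimal sketch; `sniff_container` is a hypothetical helper name, not part of pandas or xlrd.

```python
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 compound document signature
ZIP_MAGIC = b"PK\x03\x04"                       # ZIP local file header (normal .xlsx/.xlsm)


def sniff_container(path):
    """Return 'ole2', 'zip', or 'unknown' based on the file's first bytes."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"
```

If a file with an .xlsx or .xlsm extension comes back as `'ole2'`, password protection (or some other wrapping) is a likely suspect. Genuine legacy .xls files also report `'ole2'`, so the result is a hint, not a diagnosis.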

ivegotaquestion
1

After a lot of searching, the only way I've found to do this is to open and save all the Excel documents, which seems to 'strip' them of their OLE2 format. I automated the process with the following VBScript:

' Open and re-save every matching workbook in a folder.
Dim objFSO, objFolder, objFile
Dim objExcel, objWB
Dim MyFolder
Set objExcel = CreateObject("Excel.Application")
Set objFSO = CreateObject("Scripting.FileSystemObject")
MyFolder = "<PATH/TO/FILES>"
Set objFolder = objFSO.GetFolder(MyFolder)
For Each objFile In objFolder.Files
    ' Right(..., 4) compares the last four characters of the name, e.g. "xlsm"
    If Right(objFile.Name, 4) = "<EXTENSION>" Then
        Set objWB = objExcel.Workbooks.Open(objFile.Path)
        objWB.Save
        objWB.Close
    End If
Next
objExcel.Quit
Set objExcel = Nothing
Set objFSO = Nothing
WScript.Echo "Done"

Make sure to change the folder path and the extension to match your files.
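
If you would rather stay in Python, the same open-and-save loop can be sketched with pywin32. This rests on several assumptions: it runs on Windows, Excel is installed, and the `pywin32` package is available. `files_with_extension` and `resave_with_excel` are hypothetical helper names made up for this example.

```python
import os


def files_with_extension(folder, extension):
    """Return absolute paths of files in `folder` whose names end with `extension` (case-insensitive)."""
    matches = []
    for name in sorted(os.listdir(folder)):
        full = os.path.abspath(os.path.join(folder, name))
        if os.path.isfile(full) and name.lower().endswith(extension.lower()):
            matches.append(full)
    return matches


def resave_with_excel(paths):
    """Open and save each workbook through Excel's COM interface (Windows only)."""
    import win32com.client  # assumption: pywin32 is installed

    excel = win32com.client.Dispatch("Excel.Application")
    try:
        for path in paths:
            workbook = excel.Workbooks.Open(path)  # Excel expects an absolute path
            workbook.Save()
            workbook.Close()
    finally:
        # Always shut Excel down, even if one of the files fails to open
        excel.Quit()
```

To use it, call something like `resave_with_excel(files_with_extension("<PATH/TO/FILES>", ".xlsm"))` with your own folder and extension. The `win32com` import sits inside the function so that the file-listing helper still works on machines without Excel.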

bendl
0

If you run into this issue in a Jupyter notebook, as I did when searching for the error, simply restarting the kernel may resolve it.