Pandas read_excel() fails with xlrd.biffh.XLRDError: Can't find workbook in OLE2 compound document

Question

I'm trying to use pandas to parse an .xlsm document. My code worked perfectly with the example file I was given, but once I got the rest of the documents, it failed with the above error. Here's the offending stack trace:

Traceback (most recent call last):
  File "@@@@@@@@/UnsupervisedCAM.py", line 9, in <module>
    info_dict = read_excel_to_dict('files/' + filename)
  File "@@@@@@@@\readCAM.py", line 7, in read_excel_to_dict
    df = pandas.read_excel(filename, parse_cols='E,G,I,K,Q,O')
  File "@@@@@@@@\Anaconda3\envs\tensorflow\lib\site-packages\pandas\io\excel.py", line 191, in read_excel
    io = ExcelFile(io, engine=engine)
  File "@@@@@@@@\Anaconda3\envs\tensorflow\lib\site-packages\pandas\io\excel.py", line 249, in __init__
    self.book = xlrd.open_workbook(io)
  File "@@@@@@@@\Anaconda3\envs\tensorflow\lib\site-packages\xlrd\__init__.py", line 441, in open_workbook
    ragged_rows=ragged_rows,
  File "@@@@@@@@\Anaconda3\envs\tensorflow\lib\site-packages\xlrd\book.py", line 87, in open_workbook_xls
    ragged_rows=ragged_rows,
  File "@@@@@@@@\Anaconda3\envs\tensorflow\lib\site-packages\xlrd\book.py", line 595, in biff2_8_load
    raise XLRDError("Can't find workbook in OLE2 compound document")
xlrd.biffh.XLRDError: Can't find workbook in OLE2 compound document

I'm not even sure where to start... Haven't found anything of use online.

score 5 · Answer 1 · answered Nov 08 '18 at 15:46

I got the same error message and solved it by removing the password protection from the .xlsx file. (I'm not saying that's the only cause of the error, but it's worth checking!)
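
One way to check whether this applies to a given file is to look at its first bytes. A plain .xlsx/.xlsm file is a ZIP archive and starts with `PK`, whereas a password-protected one is, as far as I know, wrapped in an OLE2 compound document, which would explain why xlrd goes looking for a workbook stream inside it. The following is only a minimal sketch; `sniff_container` is a name made up for illustration and is not part of pandas or xlrd.

```python
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 compound document signature
ZIP_MAGIC = b"PK\x03\x04"                        # ZIP local file header (plain .xlsx/.xlsm)


def sniff_container(path):
    """Return 'ole2', 'zip', or 'unknown' based on the file's leading bytes.

    Hypothetical helper for illustration; not part of pandas or xlrd.
    """
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"
```

If a file with an .xlsx or .xlsm extension comes back as 'ole2', password protection (or legacy .xls content saved under the wrong extension) is a likely suspect; 'zip' means the container itself looks like a normal OOXML file.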

ivegotaquestion

This should be a comment, not an answer (from review) – FrankS101 Nov 08 '18 at 16:08

score 1 · Accepted Answer · answered Jun 14 '17 at 15:37

After a lot of searching, the only way I've found to fix this is to open and re-save all the Excel documents, which seems to 'strip' them of their OLE2 format. I automated the process with the following VBScript:

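Dim objFSO, objFolder, objFile
Dim objExcel, objWB
Dim myFolder, myExtension

' Edit these two values before running.
myFolder = "<PATH/TO/FILES>"
myExtension = "<EXTENSION>"   ' without the leading dot, e.g. xlsm

Set objExcel = CreateObject("Excel.Application")
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFolder = objFSO.GetFolder(myFolder)

For Each objFile In objFolder.Files
    ' Compare extensions case-insensitively, whatever their length.
    If LCase(objFSO.GetExtensionName(objFile.Name)) = LCase(myExtension) Then
        ' Open the workbook in Excel and save it back in place.
        Set objWB = objExcel.Workbooks.Open(objFile.Path)
        objWB.Save
        objWB.Close
    End If
Next

objExcel.Quit
Set objExcel = Nothing
Set objFSO = Nothing
WScript.Echo "Done"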

Make sure to change the path to the folder and extension.
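
To find out which files actually need this treatment before re-saving everything, something like the sketch below might help. It tries a reader on every matching file in a folder and collects the ones that raise. The names `find_unreadable` and `reader` are my own; in practice `reader` would presumably be `pandas.read_excel`, which is an assumption about your setup.

```python
import os


def find_unreadable(folder, extension, reader):
    """Try `reader` on each file in `folder` whose name ends with `extension`.

    Returns a list of (filename, error message) pairs for files that raised.
    `reader` is any callable taking a path, e.g. pandas.read_excel (assumed).
    """
    failures = []
    for name in sorted(os.listdir(folder)):
        if not name.lower().endswith(extension.lower()):
            continue
        path = os.path.join(folder, name)
        try:
            reader(path)
        except Exception as exc:  # caught broadly on purpose: this is only triage
            failures.append((name, str(exc)))
    return failures
```

Calling it as `find_unreadable('files', '.xlsm', pandas.read_excel)` should give the list of files that still need to be opened and saved in Excel.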

I had the same problem and couldn't understand why. The script is also super useful! — user1527152, Mar 12 '18 at 17:26

score 0 · Answer 3 · answered Jun 28 '18 at 06:36

If you hit this issue in a Jupyter notebook, as I did when I went searching for the error, simply restarting the kernel resolved it for me.

Vishnuraj Rajendran
