4

I have a short program that collects a list of all .xls files in folder/sub-folders and then I loop through the file list, opening each xls document (Try: book = xlrd.open_workbook(f) ) to look for specific information. If an exception is thrown, I write out the filename to an exception list. What I am finding is that I have a LOT of files that xlrd throws the error :

Traceback (most recent call last):
  File "<pyshell#5>", line 1, in <module>
    book = xlrd.open_workbook(f)
  File "C:\Python32\lib\site-packages\xlrd\__init__.py", line 435, in open_workbook
    ragged_rows=ragged_rows,
  File "C:\Python32\lib\site-packages\xlrd\book.py", line 116, in open_workbook_xls
    bk.parse_globals()
  File "C:\Python32\lib\site-packages\xlrd\book.py", line 1206, in parse_globals
    self.handle_filepass(data)
  File "C:\Python32\lib\site-packages\xlrd\book.py", line 924, in handle_filepass
    raise XLRDError("Workbook is encrypted")
xlrd.biffh.XLRDError: Workbook is encrypted

But I can go and open the files with Excel with no problems. Does anyone have an idea as to why xlrd would throw an encrypted error when it doesn't appear that the files are encrypted?

Thanks,

Fred

FredG
  • 303
  • 1
  • 4
  • 9
  • 4
    Are there (or have there ever been) any sheets that are "protected"? When a sheet is marked as protected, then internally Excel will encrypt the workbook (with the fixed password "VelvetSweatshop") in order to obfuscate the protected sheets. When you open an encrypted workbook, Excel silently tries the password "VelvetSweatshop" first so it doesn't look encrypted to the user. – zindorsky Apr 01 '14 at 16:53

2 Answers2

3

I ran into this same problem, and as @zindorsky has mentioned in their comment, this can happen when the file has a protected sheet - or via some other reason where Excel has silently encrypted the file with the magic password VelvetSweatshop.

XLRD can't handle files with encryption on its own (in fact the README lists it as 'unlikely to be done'), but there is another recent Python library that can unencrypt a variety of MS Office files (including .xls files) - msoffcrypto-tool .

I was able to use it to workaround the issue successfully - here's an abbreviated (and untested!) snippet version of the code

import xlrd
import msoffcrypto

def handle_protected_workbook(wb_filepath):
    try:
        _book = xlrd.open_workbook(wb_filepath)
    except xlrd.biffh.XLRDError, e:
        if e.message == "Workbook is encrypted":
            # Try and unencrypt workbook with magic password
            wb_msoffcrypto_file = msoffcrypto.OfficeFile(open(wb_filepath, 'rb'))
            try:
                # Yes, this is actually a thing
                # https://nakedsecurity.sophos.com/2013/04/11/password-excel-velvet-sweatshop/
                wb_msoffcrypto_file.load_key(password='VelvetSweatshop')
            except AssertionError, e:
                if e.message == "Failed to verify password":
                    # Encrypted with some other password
                    raise # or do something else
                else:
                    # Some other error occurred
                    raise
            except:
                # Some other error occurred
                raise
            else:
                # Magic Excel password worked

                assert wb_filepath.endswith('.xls')
                wb_unencrypted_filename = wb_filepath[:-(len('.xls'))] + '__unencrypted.xls'

                with tempfile.NamedTemporaryFile() as tmp_wb_unencrypted_file:
                    # Decrypt into the tempfile
                    wb_msoffcrypto_file.decrypt(tmp_wb_unencrypted_file)
                    # --- Do something with the file ---
                # return true to indicate file was touched
                return True  # or do something else
        else:
            # some other xlrd error occurred.
            return False  # or do something else
    except:
        # some non-xlrd error occurred.
        return False  # or do something else
nuclearpidgeon
  • 341
  • 3
  • 8
1

Building on nuclearpidgeon's answer, I created a context manager that transparently decrypts files as needed and takes care of clean-up.

Usage is:

with handle_protected_workbook(filepath) as wb:
   # use wb
   ...

https://gist.github.com/terrdavis/b219e92d42dc5f9ca526aa0047d1a1d1

Terry Davis
  • 511
  • 5
  • 8