0

I have this excel file that I've downloaded from here and I want to read it into a pandas dataframe. If I open the file in Excel, it opens fine, no passwords asked, but if I try to modify it, it does ask for a password. If I try to open the file on pandas with the following code instead:

data_inctl = pd.read_excel(
    INCTL_DATA_PATH,
    sheet_name="Serie Histórica"
)

It gives me the following error:

XLRDError: Workbook is encrypted

I tried some of the solutions found here but I do not have a password for the file so none of them work.

The data itself is not encrypted since I can open it fine on excel. I can also copy and paste the data into another file from Excel without a problem, but that would not allow me to automate my ETL. Is there a way that I could get this data into a pandas dataframe, even though the file is protected from modification?

João Areias
  • 1,192
  • 11
  • 41
  • Is this link correct? I couldn't find the link for the spreadsheet, but only a link to an index report in PDF format at the bottom of the page. – Nilton Moura Mar 30 '22 at 13:39
  • Oh, I'm sorry, I grabbed from there a few days ago, looks like the link is broken now, I'll update in a bit – João Areias Mar 30 '22 at 13:43

1 Answers1

1

Managed to do it after reading this comment from @zyndorsky and finding this package:

>>> import pandas as pd
>>> import io
>>> import msoffcrypto
>>> protected_excel_filepath = '/tmp/INCTL_0222.xls'
>>> unprotected_excel_content = io.BytesIO()
>>> with open(protected_excel_filepath, 'rb') as f:
...     msoffcryptoobj = msoffcrypto.OfficeFile(f)
...     msoffcryptoobj.load_key(password="VelvetSweatshop")
...     unprotected_excel_content = io.BytesIO()
...     msoffcryptoobj.decrypt(unprotected_excel_content)
...
>>> data = pd.read_excel(unprotected_excel_content)
>>> print(data)
    Unnamed: 0  Unnamed: 1  ... Unnamed: 10  Unnamed: 11
0          NaN         NaN  ...         NaN          NaN
1          NaN         NaN  ...         NaN          NaN
2          NaN         NaN  ...       50 km     0.234338
3          NaN         NaN  ...      400 km     0.229290
4          NaN         NaN  ...      800 km     0.226535
5          NaN         NaN  ...    2.400 km     0.224885
6          NaN         NaN  ...    6.000 km     0.224641
7          NaN         NaN  ...         NaN          NaN
...
Nilton Moura
  • 313
  • 2
  • 8