I'm using Azure Databricks and trying to read an Excel file. The file is encrypted and has the extension .xlsx.pgp. After decrypting it, I get the contents as a byte array. Here's the call I use to read it as a pandas DataFrame:
df = pd.read_excel(BytesIO(orig))
However, this is giving me the following error:
XLRDError: Excel xlsx file; not supported
Now, based on this documentation:
I have added openpyxl to the cluster and then tried to run the following:
df = pd.read_excel(BytesIO(orig),engine=`openpyxl`)
I'm getting the error:
global name 'openpyxl' is not defined
When I pass the engine name as a quoted string instead:
df = pd.read_excel(BytesIO(orig),engine='openpyxl')
I get a different error:
ValueError: Unknown engine: openpyxl
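In case it helps with diagnosing this, here is a minimal sketch of how the library versions on the cluster could be checked. It rests on an assumption I have not confirmed: that the errors depend on which versions of pandas, openpyxl and xlrd the notebook actually imports. The helper name report_versions is made up for illustration.

```python
import importlib
import sys


def report_versions(names=("pandas", "openpyxl", "xlrd")):
    """Return {module name: version string, or None if it cannot be imported}."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The package is not visible to this Python interpreter.
            versions[name] = None
        else:
            # Not every module exposes __version__, so fall back to "unknown".
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


# Print the interpreter version followed by each library's version.
print("python", sys.version.split()[0])
for name, version in report_versions().items():
    print(name, version)
```

A value of None for openpyxl would suggest the library is not installed for the interpreter the notebook is running, even though it was added to the cluster.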
How can I resolve this issue?
Thanks for all the help!