I connected to SharePoint to my Cloud Environment Directory and downloaded it to my current directory.
file_url = """/sites/XXXXXX/Shared%20Documents/General/new_name/train_data.xlsx"""
response = File.open_binary(ctx, file_url)
response.raise_for_status()
import pathlib
ROOT_DIR = pathlib.Path().resolve()
download_file_path = ROOT_DIR / "download_test.xlsb"
response.raise_for_status() # 200 is the response code, so it downloaded successfully
with open(download_file_path, "wb") as local_file: # This is the part where I think needs to be changed
local_file.write(response.content)
I cannot open the downloaded file on Excel as it says it "format or extension is not valid".
I used the following to read the downloaded Excel file ending with .xlsb extension for analysis in Pandas:
import pandas as pd
pd.read_excel("download_test.xlsb", engine='pyxlsb')
I get the following error:
usr/lib/python3.6/zipfile.py in open(self, name, mode, pwd, force_zip64)
1373 else:
1374 # Get info object for name
-> 1375 zinfo = self.getinfo(name)
1376
1377 if mode == 'w':
/usr/lib/python3.6/zipfile.py in getinfo(self, name)
1302 if info is None:
1303 raise KeyError(
-> 1304 'There is no item named %r in the archive' % name)
1305
1306 return info
KeyError: "There is no item named 'xl/_rels/workbook.bin.rels' in the archive"
I already searched for similar questions Read XLSB File in Pandas Python Convert .xlsb to .xlsx - Multi-sheet Microsoft Excel File
Without success. Any help is very much appreciated.