I am trying to make a python program that downloads and XLS file from a website, in this case website is: https://www.blackrock.com/uk/individual/products/291392/, and loads it as a dataframe in pandas, with the correct data structure.
The issue is that when I try to load it via pandas, it gives me an error: XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'\xef\xbb\xbf\xef\xbb\xbf<?'
I am not quite sure what is causing this error, but presumable something with the file. I can open the file in Excel, even though I get a warning that the file and the file extension do not match, and that the file might be dangerous etc. If I click yes to opening it anyway, it opens up with data displayed correctly. If I use Excel to save the file as .xlsx i can open it in pandas, but I would rather a solution that didn't require manually opening Excel and saving the file.
I have tried renaming the file extension to xlsx, but this does not work, as it won't allow me to open the file with that extension. I have tried many different extension, but non of them bite - unfortunately.
I am at a loss.
I hope, you can help.
EDIT: The code I use is:
download_path = 'https://www.blackrock.com/uk/individual/products/291392/fund/1527484370694.ajax?fileType=xls&fileName=iShares-MSCI-World-SRI-UCITS-ETF-USD-Dist_fund&dataType=fund'
testing = pd.read_excel(download_path, engine='xlrd', sheet_name = 'Holdings', skiprows = 3)