I have an .xls file that I can open with Excel and also with Notepad (I can see the numbers along with some other text), but I cannot read it with pandas.

import pandas as pd
df = pd.read_excel(r'R:\Project\Projects\429 - Buchner Höhe\Analysis Data\scada\20171101.xls', parse_dates=[[0, 1, 2, 3]])

The error raised is:

XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'\x03\x11\x0b\x02 \x01\x00\x00'

I tried renaming the file to .xlsx using os.rename, but it still does not work.

Martin Evans
Sowjanya
  • Can you please post the code you wrote? You're more likely to get help if your question provides a [minimal, complete and verifiable example](https://stackoverflow.com/help/mcve). – VMatić Dec 07 '17 at 08:46
  • `df = pd.read_excel(r'R:\Project\Projects\429 - Buchner Höhe\Analysis Data\scada\20171101.xls', parse_dates=[[0, 1, 2, 3]])` – Sowjanya Dec 07 '17 at 08:48
  • MS Excel can open an XLS/XLSX file even if the file has a few errors. Maybe pandas can't... – locobastos Dec 07 '17 at 08:48
  • Possibly a duplicate of - https://stackoverflow.com/questions/16504975/error-unsupported-format-or-corrupt-file-expected-bof-record - https://stackoverflow.com/questions/9623029/python-xlrd-unsupported-format-or-corrupt-file - https://stackoverflow.com/questions/45700658/pandas-read-excel-unsupported-format-or-corrupt-file-expected-bof-record – locobastos Dec 07 '17 at 08:50
  • Yes, I already checked those posts. They did not solve the problem, so I posted here. – Sowjanya Dec 07 '17 at 08:52
  • I am reading an .xls file with pandas, which uses the xlrd module, and that is where this error arises. – Sowjanya Dec 07 '17 at 08:54
  • Try installing the latest version of `xlrd`, e.g. `pip install xlrd --upgrade` – Martin Evans Dec 07 '17 at 09:55
  • Yes, I did that, and I still get the same error. – Sowjanya Dec 07 '17 at 10:23
  • It worked after I opened all the .xls files, saved them as .csv, and then used pd.read_csv. The issue with reading the .xls files directly remains unsolved. – Sowjanya Dec 07 '17 at 13:33

1 Answer

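It is quite likely that the file was already a csv file that was given an .xls extension through the file system, rather than an actual Excel-format (xls or xlsx) file. Renaming a file changes only its name, not its contents, which is also why renaming it to .xlsx did not help. This is the error generated when you attempt to open a csv with xlrd.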

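The indicator that this is the case is that you can open it with Notepad and read the numbers. A genuine Excel file is a binary format and would show up mostly as unreadable characters.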
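
One way to check this without relying on the extension is to look at the first few bytes of the file. The sketch below is only an illustration under some assumptions: `sniff_format` and `load_table` are hypothetical helper names, the check covers the usual OLE2 signature of .xls files and the ZIP signature of .xlsx files (very old pre-Excel-5 .xls files look different), and anything else is assumed to be delimited text, which may not be true for your file.

```python
# Leading bytes ("magic numbers") of the two common Excel container formats.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # typical .xls (OLE2 compound file)
ZIP_MAGIC = b"PK\x03\x04"                        # .xlsx (a ZIP archive)


def sniff_format(path):
    """Guess the real format from the first bytes, ignoring the extension."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(OLE2_MAGIC):
        return "xls"
    if head.startswith(ZIP_MAGIC):
        return "xlsx"
    return "other"


def load_table(path):
    """Hypothetical helper: choose a pandas reader based on the sniffed format."""
    import pandas as pd  # imported lazily so sniff_format works without pandas

    if sniff_format(path) in ("xls", "xlsx"):
        return pd.read_excel(path)
    # Assumption: anything else is delimited text. sep=None asks pandas to
    # guess the delimiter, which requires the python parsing engine.
    return pd.read_csv(path, sep=None, engine="python")
```

If `sniff_format` returns "other" for your file, `pd.read_excel` will not be able to read it whatever the extension says, and `pd.read_csv` (possibly with an explicit `sep=`) is the more promising route.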