8

I am looking to parse an excel spreadsheet. I decided to use pandas but got caught by an error straight off the bat.

enter image description here

I started with the code below but played around with using a full path and also tried setting the sheetname.

import pandas as pd

table = pd.read_excel('ss_12.xlsx')

if __name__ == '__main__':
    pass

The excel spreadsheet is in the same directory as my script file. I taught it would work the same as open() in this sense, just a name required if its in the same directory. I have looked at a few examples online and going by them this should work.

enter image description here

I am trying to strip the first column seen in the image above. The full error (not sure how to format it, sorry)

C:\xx\Playpen\ConfigList_V1_0.xlsx
Traceback (most recent call last):
  File "C:\xx\Playpen\getConVars.py", line 12, in <module>
    pd.read_excel(excelFile)
  File "C:\xx\Programs\Python\Python35\lib\site-packages\pandas\io\excel.py", line 200, in read_excel
    **kwds)
  File "C:\xx\Programs\Python\Python35\lib\site-packages\pandas\io\excel.py", line 432, in _parse_excel
    sheet = self.book.sheet_by_index(asheetname)
  File "C:\xx\Programs\Python\Python35\lib\site-packages\xlrd\book.py", line 432, in sheet_by_index
    return self._sheet_list[sheetx] or self.get_sheet(sheetx)
IndexError: list index out of range
Noah
  • 21,451
  • 8
  • 63
  • 71
10SecTom
  • 2,484
  • 4
  • 22
  • 26
  • Can you post the full code you're using and the full error traceback? The image above is incomplete and I have no real idea what the problem is. – Noah Apr 13 '17 at 16:10
  • @noah this, I am afraid is the full code. I just started playing with it and this error happened with basic test attempts. Basically I tried loading my excel spread sheet. I put it in the same folder as the script and tried what you see above. I also tried using a full path e.g. pd.read_excel("C:\\....\\my_spread_sheet") and then i tried a named sheet e.g. pd.read_excel("C:\\....\\my_spread_sheet", sheetname='sheet1.xlsx') – 10SecTom Apr 14 '17 at 11:09
  • 1
    Can you share the excel document? This seems like it might be a bug in pandas or the excel library, xlrd, that it's using. You might try submitting a bug report on the pandas github including an excel doc that causes the error. – Noah Apr 14 '17 at 14:56
  • @noah, I would rather not share the full document as its work related. I will look at sending a bug report and possibly share it through that medium.For now, I think I will use a different module other than pandas. Thanks for the help – 10SecTom Apr 18 '17 at 07:52
  • openpyxl also had a problem with the same document. Copied the contents to new document and i cna now open it up. I will add more information today. The error return from openpyxl seems more informative. – 10SecTom Apr 18 '17 at 08:08
  • I had the same issue. The file had been accidentally saved as a Strict XML Open Spreadsheet. – Michael A Kamboures May 06 '20 at 16:45

3 Answers3

31

Make sure you have the right kind of Excel spreadsheet. I had this same error and realized that I had saved it as a Strict XML Open Spreadsheet which still had the .xlsx extension.

ZMan
  • 334
  • 3
  • 4
1

If you just want to read the file, it's better to use os.path as follows:

import os
import pandas as pd

dir = 'path_to_excel_file_directory'
excelFile = os.path.join(dir, 'fileName.xlsx')
pd.read_excel(excelFile)

And if the excel file is in the same directory as your script, you can use inspect to automatically detect the directory it's in:

scriptName = inspect.getframeinfo(inspect.currentframe()).filename
dir = os.path.dirname(os.path.abspath(filename))
excelFile = os.path.join(dir, 'fileName.xlsx')
pd.read_excel(excelFile)

One final note: the part

if __name__ == '__main__':
    pass

is not related to the question.

splinter
  • 3,727
  • 8
  • 37
  • 82
  • which lines gives exactly what error? Can you add some of your data in the question? – splinter Apr 13 '17 at 16:18
  • pd.read_excel(excelFile) is still causing the error. I mentioned in the question i had tried a direct path but i just did pd.read_excel("C:\\....\\my_spread_sheet") and got the same error i have added to the question. – 10SecTom Apr 14 '17 at 11:04
1

Quick solution: create a brand new file (.xls) with the content of the one that gives you the error. It worked for me.

Luca1988
  • 11
  • 2