
I am getting an error when opening an .xlsx file on Windows 8 using the tablib library.

Python version: 2.7.14

The error is as follows:

python suit_simple_sheet_product.py
Traceback (most recent call last):
  File "suit_simple_sheet_product.py", line 19, in <module>
    data = tablib.Dataset().load(open(BASE_PATH).read())
  File "C:\Python27\lib\site-packages\tablib\core.py", line 446, in load
    format = detect_format(in_stream)
  File "C:\Python27\lib\site-packages\tablib\core.py", line 1157, in detect_format
    if fmt.detect(stream):
  File "C:\Python27\lib\site-packages\tablib\formats\_xls.py", line 25, in detect
    xlrd.open_workbook(file_contents=stream)
  File "C:\Python27\lib\site-packages\xlrd\__init__.py", line 120, in open_workbook
    zf = zipfile.ZipFile(timemachine.BYTES_IO(file_contents))
  File "C:\Python27\lib\zipfile.py", line 770, in __init__
    self._RealGetContents()
  File "C:\Python27\lib\zipfile.py", line 811, in _RealGetContents
    raise BadZipfile, "File is not a zip file"
zipfile.BadZipfile: File is not a zip file

The path is defined as follows: BASE_PATH = 'C:\Users\anju\Downloads\automate\catalog-5090 fabric detail and price list.xlsx'

Raja Bose
  • You could try `open(BASE_PATH, 'rb')` to open the xlsx file in binary mode. – Martin Evans Mar 01 '18 at 08:24
  • Can you please provide a complete example of tablib? I tried your tip, but the data I get back contains only the sheet's column names and none of the row data. – Raja Bose Mar 01 '18 at 09:34
  • Your xlsx file might be using features that the underlying `xlrd` library does not support. You could use something like [pastebin](https://pastebin.com/) to upload your file to and paste the URL to it here. – Martin Evans Mar 01 '18 at 09:54

1 Answer


Excel .xlsx files are actually zip files. For the unzip to work correctly, the file must be opened in binary mode, so you need to open the file using:

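import tablib

BASE_PATH = r'c:\my folder\my_test.xlsx'

# Open in binary mode ('rb') so the zip-based .xlsx bytes are read unmodified
with open(BASE_PATH, 'rb') as xlsx_file:
    data = tablib.Dataset().load(xlsx_file.read())

print data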
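
If you want to convince yourself that an .xlsx file really is a zip archive, something like the following sketch may help. It uses only the standard library and does not touch tablib. The `looks_like_zip` helper and the `sample.xlsx` file are made up for illustration: the sample is just a tiny zip archive standing in for a real workbook, which would contain many more members.

```python
import io
import os
import tempfile
import zipfile


def looks_like_zip(path):
    """Return True if the bytes of the file at `path` form a zip archive."""
    # 'rb' reads the raw bytes; text mode on Windows may alter or truncate them
    with open(path, 'rb') as handle:
        contents = handle.read()
    return zipfile.is_zipfile(io.BytesIO(contents))


# Build a stand-in archive (hypothetical; not a valid Excel workbook)
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, 'sample.xlsx')
with zipfile.ZipFile(sample_path, 'w') as archive:
    archive.writestr('xl/workbook.xml', '<workbook/>')

# Zip archives start with the two-byte signature "PK"
with open(sample_path, 'rb') as handle:
    signature = handle.read(2)

is_zip = looks_like_zip(sample_path)
```

Running the same check against your own file is a quick way to tell whether the bytes you read are intact before handing them to tablib.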

Add r before your string (making it a raw string literal) to stop Python from trying to interpret the backslash characters in your path as escape sequences.
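
To illustrate the difference, here is a small sketch using a made-up path (`C:\temp\new.xlsx` is hypothetical, chosen because it contains `\t` and `\n`). In the path from the question, the `\a` in `\anju` and `\automate` would similarly be read as the bell character in a normal string.

```python
# Hypothetical path chosen to contain the escape sequences \t and \n
plain = 'C:\temp\new.xlsx'   # \t becomes a tab, \n becomes a newline
raw = r'C:\temp\new.xlsx'    # raw string: backslashes are kept literally

plain_has_tab = '\t' in plain          # True: the path has been mangled
raw_has_backslash_t = '\\t' in raw     # True: backslash followed by 't'

# The mangled string is shorter because each escape collapses to one character
length_difference = len(raw) - len(plain)
```

Writing the path with forward slashes or doubled backslashes would avoid the problem as well, but the raw string keeps the path readable as copied from Explorer.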

Martin Evans