7

I'm working with Django and need to read the sheets and cells of an uploaded xlsx file. It should be possible with xlrd but because the file has to stay in memory and may not be saved to a location I'm not sure how to continue.

The start point in this case is a web page with an upload input and a submit button. When submitted the file is caught with request.FILES['xlsx_file'].file and send to a processing class that would have to extract all the important data for further processing.

The type of request.FILES['xlsx_file'].file is BytesIO and xlrd is not able to read that type because of no getitem methode.

After converting the BytesIO to StringIO the error messages seems to stay the same '_io.StringIO' object has no attribute '__getitem__'

    file_enc = chardet.detect(xlsx_file.read(8))['encoding']
    xlsx_file.seek(0)

    sio = io.StringIO(xlsx_file.read().decode(encoding=file_enc, errors='replace'))
    workbook = xlrd.open_workbook(file_contents=sio)
dreftymac
  • 31,404
  • 26
  • 119
  • 182
Adrian Z.
  • 904
  • 1
  • 12
  • 29
  • Hi, could you give us any code that you're using to solve this issue? It would help in arriving at a solution. From your current description, I can only venture a guess that xlrd expects a file-like object so it should be possible to get the uploaded xlsx file, feed it into a `StringIO` object and pass it to xlrd perhaps. – Protagonist Apr 07 '16 at 09:58
  • There is not much code for this particular case but I'll try to give a bit more context. – Adrian Z. Apr 07 '16 at 10:00
  • So you think that a StringIO would be sufficient? – Adrian Z. Apr 07 '16 at 10:04
  • @Protagonist The result seems to stay the same – Adrian Z. Apr 07 '16 at 10:10
  • Are you using the `open_workbook` xlrd function to read the `request.FILES['xlsx_file'].file`? If yes, the function provides a keyword parameter just for this case: `open_workbook(file_contents=request.FILES['xlsx_file'].file)`. The `file_contents` should take file-like objects, mmap objects, or string objects according to the [docs](https://secure.simplistix.co.uk/svn/xlrd/trunk/xlrd/doc/xlrd.html?p=4966#__init__.open_workbook-function) – Protagonist Apr 07 '16 at 10:10
  • @Protagonist Yes I'm already using what you have suggested. – Adrian Z. Apr 07 '16 at 10:11
  • The result of your suggestion is `TypeError: '_io.BytesIO' object has no attribute '__getitem__'` – Adrian Z. Apr 07 '16 at 10:14
  • 1
    Ok, thanks for your pointers. I downloaded xlrd and tested it locally. It seems the best way to go here is to pass it a string ie. `open_workbook(file_contents=xlsx_file.read().decode(encoding=file_enc, errors='replace'))`. I misunderstood the docs, but I'm positive that `file_contents=` will work with a string. – Protagonist Apr 07 '16 at 10:19
  • Just reading seems to be good but how can I now test if I have the data? – Adrian Z. Apr 07 '16 at 10:23
  • 1
    For that you'd have to use xlrd's api to read a worbooks page/cells, and anything else and I'd put that in a separate question :). Also, please accept my answer if it solved the issue of opening the xlsx file without writing it to disk! – Protagonist Apr 07 '16 at 10:27

3 Answers3

7

Try xlrd.open_workbook(file_contents=request.FILES['xlsx_file'].read())

Sergey Gornostaev
  • 7,596
  • 3
  • 27
  • 39
7

I'm moving my comment into an answer of it's own. It related to the example code (which includes decoding) given in the updated question:

Ok, thanks for your pointers. I downloaded xlrd and tested it locally. It seems the best way to go here is to pass it a string ie. open_workbook(file_contents=xlsx_file.read().decode(encoding=file_enc, errors='replace')). I misunderstood the docs, but I'm positive that file_contents= will work with a string.

Protagonist
  • 492
  • 1
  • 6
  • 17
  • 1
    It seems that decoding is not needed in this case and that read just works because it's an ascii type. – Adrian Z. Apr 07 '16 at 10:30
  • 1
    I am still getting this error: `Exception Type: TypeError Exception Value: unsupported operand type(s) for <<: 'str' and 'int'` – Abhimanyu Aug 06 '19 at 12:40
0

I had a similar problem but in my case I needed to unit test a Djano app with download by the user of an xls file.

The basic code using StringIO worked for me.

class myTest(TestCase):
    def test_download(self):
        response = self.client('...')
        f = StringIO.StringIO(response.content)
        book = xlrd.open_workbook(file_contents = f.getvalue() )
        ...
        #unit-tests here
David
  • 117
  • 2
  • 14