23

I can pass a StringIO object to pd.to_csv() just fine:

io = StringIO.StringIO()
pd.DataFrame().to_csv(io)

But when using the excel writer, I am having a lot more trouble.

io = StringIO.StringIO()
writer = pd.ExcelWriter(io)
pd.DataFrame().to_excel(writer,"sheet name")
writer.save()   

Returns an

AttributeError: StringIO instance has no attribute 'rfind'

I'm trying to create an ExcelWriter object without calling pd.ExcelWriter() but am having some trouble. This is what I've tried so far:

from xlsxwriter.workbook import Workbook
writer = Workbook(io)
pd.DataFrame().to_excel(writer,"sheet name")
writer.save()

But now I am getting an AttributeError: 'Workbook' object has no attribute 'write_cells'

How can I save a pandas dataframe in excel format to a StringIO object?

A User
  • 812
  • 2
  • 7
  • 21
  • 1
    I'm not sure you can, at least not easily. The argument to `to_excel` is a *path* to an Excel file, not an actual file object. Why do you want to create an in-memory representation of an Excel file anyway? – BrenBarn Jan 21 '15 at 02:52
  • 2
    Using Flask to make a downloadable report. – A User Jan 21 '15 at 16:44
  • 1
    In Python 3 you should be using `io.BytesIO`, as the output of writing an Excel file is a series of bytes, not a (unicode) string. – LeoRochael Nov 01 '18 at 20:43

4 Answers4

44

Pandas expects a filename path to the ExcelWriter constructors although each of the writer engines support StringIO. Perhaps that should be raised as a bug/feature request in Pandas.

In the meantime here is a workaround example using the Pandas xlsxwriter engine:

import pandas as pd
import StringIO

io = StringIO.StringIO()

# Use a temp filename to keep pandas happy.
writer = pd.ExcelWriter('temp.xlsx', engine='xlsxwriter')

# Set the filename/file handle in the xlsxwriter.workbook object.
writer.book.filename = io

# Write the data frame to the StringIO object.
pd.DataFrame().to_excel(writer, sheet_name='Sheet1')
writer.save()
xlsx_data = io.getvalue()

Update: As of Pandas 0.17 it is now possible to do this more directly:

# Note, Python 2 example. For Python 3 use: output = io.BytesIO().
output = StringIO.StringIO()

# Use the StringIO object as the filehandle.
writer = pd.ExcelWriter(output, engine='xlsxwriter')

And if you need to use the output outside of Pandas (for example in Django or Flask) remember to rewind the writer: output.seek(0).

See also Saving the Dataframe output to a string in the XlsxWriter docs.

jmcnamara
  • 38,196
  • 6
  • 90
  • 108
  • Thank you -- the one line workaround worked perfectly! – A User Jan 22 '15 at 01:13
  • 5
    This was just added in pandas, see here: https://github.com/pydata/pandas/pull/10376. Will be in the 0.17.0 release (prob end of july) – Jeff Jun 25 '15 at 19:34
  • 2
    For me it helped a lot - but one crucial thing was missing when giving the output to Flask: output.seek(0) – Fips Jul 14 '21 at 09:56
11

None of these were working from me. I had a view I wanted to return an excel workbook from in Django. I found my solution from the pandas documentation.

import io
bio = io.BytesIO()
writer = pd.ExcelWriter(bio, engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
writer.save()
bio.seek(0)

# BONUS CONTENT
# .. because I wanted to return from an api
response = HttpResponse(bio, content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
response['Content-Disposition'] = 'attachment; filename=myfile.xlsx'
return response # returned from a view here

Note, I used that value for content type because it was the mime type according to the mozzilla docs. From ".xlsx" in the following link. Replace as needed. https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types/Common_types

Nick Brady
  • 6,084
  • 1
  • 46
  • 71
  • Thanks! This just worked perfectly. I had to use `writer.close()` instead of `writer.save()` for newer versions of the module. – andzep Sep 02 '23 at 12:19
6

Glancing at the pandas.io.excel source looks like it shouldn't be too much of a problem if you don't mind using xlwt as your writer. The other engines may not be all that difficult either but xlwt jumps out as easy since its save method takes a stream or a filepath.

You need to initially pass in a filename just to make pandas happy as it checks the filename extension against the engine to make sure it's a supported format. But in the case of the xlwt engine, it just stuffs the filename into the object's path attribute and then uses it in the save method. If you change the path attribute to your stream, it'll happily save to that stream when you call the save method.

Here's an example:

import pandas as pd
import StringIO
import base64

df = pd.DataFrame.from_csv('http://moz.com/top500/domains/csv')
xlwt_writer = pd.io.excel.get_writer('xlwt')
my_writer = xlwt_writer('whatever.xls')  #make pandas happy 
xl_out = StringIO.StringIO()
my_writer.path = xl_out  
df.to_excel(my_writer)
my_writer.save()
print base64.b64encode(xl_out.getvalue())

That's the quick, easy and slightly dirty way to do it. BTW... a cleaner way to do it is to subclass ExcelWriter (or one of it's existing subclasses, e.g. _XlwtWriter) -- but honestly there's so little involved in updating the path attribute, I voted to show you the easy way rather than go the slightly longer route.

clockwatcher
  • 3,193
  • 13
  • 13
3

For those not using xlsxwriter as their engine= for to_excel here is a solution to use openpyxl in memory:

in_memory_file = StringIO.StringIO()
xlw = pd.ExcelWriter('temp.xlsx', engine='openpyxl')
# ... do many .to_excel() thingies
xlw.book.save(in_memory_file)
# if you want to read it or stream to a client, don't forget this
in_memory_file.seek(0)

explanation: the ExcelWriter wrapper class exposes the engines individual workbook through the .book property. For openpyxl you can then use the Workbook.save method as usual!

Deano
  • 1,136
  • 10
  • 19