27

I'm using FastAPI and currently I return a csv which I read from SQL server with pandas. (pd.read_sql()) However the csv is quite big for the browser and I want to return it with a File response: https://fastapi.tiangolo.com/advanced/custom-response/ (end of the page). I cannot seem to do this without first writing it to a csv file which seems slow and will clutter the filesystem with csv's on every request.

So my questions way, is there way to return a FileResponse from a sql database or pandas dataframe.

And if not, is there a way to delete the generated csv files, after it has all been read by the client?

Thanks for your help!

Kind regards,

Stephan

SDuma
  • 303
  • 1
  • 3
  • 8
  • Does this answer your question? [How to stream DataFrame using FastAPI without saving the data to csv file?](https://stackoverflow.com/questions/73688641/how-to-stream-dataframe-using-fastapi-without-saving-the-data-to-csv-file) – Chris Dec 21 '22 at 05:04

3 Answers3

59

Based HEAVILY off this https://github.com/tiangolo/fastapi/issues/1277

  1. Turn your dataframe into a stream
  2. use a streaming response
  3. Modify headers so it's a download (optional)
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
import io
import pandas as pd

app = FastAPI()

@app.get("/get_csv")
async def get_csv():
    df = pd.DataFrame(dict(col1 = 1, col2 = 2), index=[0])
    stream = io.StringIO()
    df.to_csv(stream, index = False)
    response = StreamingResponse(iter([stream.getvalue()]),
                                 media_type="text/csv"
                                )
    response.headers["Content-Disposition"] = "attachment; filename=export.csv"
    return response
wavingtide
  • 1,032
  • 4
  • 19
Tom Greenwood
  • 1,502
  • 14
  • 17
8

Adding to the code that was previously mentioned, I found it useful to place another response header, in order for the client to be able to see the "Content-Disposition". This is due to the fact, that only CORS-safelisted response headers can be seen by default by the client. "Content-Disposition" is not part of this list, so it must be added explicitly https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Access-Control-Expose-Headers.

I don't know if there is another way to specify this, for the client or server in a more general way so that it applies to all the necessary endpoints, but this is the way I applied it.

@router.post("/files", response_class = StreamingResponse)
async def anonymization(file: bytes = File(...), config: str = Form(...)):
    # file as str
    inputFileAsStr = StringIO(str(file,'utf-8'))
    # dataframe
    df = pd.read_csv(inputFileAsStr)
    # send to function to handle anonymization
    results_df = anonymize(df, config)
    # output file
    outFileAsStr = StringIO()
    results_df.to_csv(outFileAsStr, index = False)
    response = StreamingResponse(
        iter([outFileAsStr.getvalue()]),
        media_type='text/csv',
        headers={
            'Content-Disposition': 'attachment;filename=dataset.csv',
            'Access-Control-Expose-Headers': 'Content-Disposition'
        }
    )
    # return
    return response
dmm98
  • 101
  • 1
  • 3
5

I was beating my head against the wall on this one as well. My use case is slightly different as I am storing images, pdfs, etc. as blobs in my maria database. I found that the trick was to pass the blob contents to BytesIO and the rest was simple.

from fastapi.responses import StreamingResponse
from io import BytesIO

@router.get('/attachment/{id}')
async def get_attachment(id: int):
    mdb = messages(s.MARIADB)

    attachment = mdb.getAttachment(id)
    memfile = BytesIO(attachment['content'])
    response = StreamingResponse(memfile, media_type=attachment['contentType'])
    response.headers["Content-Disposition"] = f"inline; filename={attachment['name']}"

    return response
David W.
  • 365
  • 5
  • 5