The options below show how to convert a file uploaded to FastAPI into a Pandas DataFrame. If you would also like to convert the DataFrame into JSON and return it to the client, have a look at this answer. If you would like to use an async def endpoint instead of def, please have a look at this answer on how to read the file contents in an async way, as well as this answer to understand the difference between using def and async def. It would also be best to enclose the I/O operations (in the examples below) in a try-except-finally block (as shown here and here), so that you can catch/raise any possible exceptions and close the file properly, in order to release the object from memory and avoid potential errors.
Related answers on how to upload and read a CSV file can be found here (gives examples using Jinja2 Templates), as well as here (converts the uploaded CSV file into JSON and returns it to the client) and here (provides solutions without using external libraries).
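The try-except-finally advice above could look roughly like the sketch below. The helper name read_csv_safely is hypothetical (it is not part of FastAPI or pandas), and the choice of which exceptions to catch is an assumption; adjust both to your needs.

```python
from io import BytesIO

import pandas as pd


def read_csv_safely(file_obj) -> pd.DataFrame:
    """Parse CSV data from a binary file-like object and always close it.

    `read_csv_safely` is a hypothetical helper name used for illustration.
    """
    try:
        return pd.read_csv(file_obj)
    except (pd.errors.EmptyDataError, pd.errors.ParserError, UnicodeDecodeError) as exc:
        # Re-raise as a single error type that the caller can translate
        # into an HTTP error response.
        raise ValueError(f"Could not parse the uploaded CSV: {exc}") from exc
    finally:
        # Runs on success and on failure, so the file object is always released.
        file_obj.close()
```

Inside an endpoint you would call read_csv_safely(file.file) and, if you want the client to receive a 400 response for a malformed file, catch the ValueError and raise fastapi.HTTPException(status_code=400, ...) instead.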
Option 1
Since pandas.read_csv() can accept a file-like object, you can pass the file-like object of UploadFile directly. UploadFile exposes an actual Python SpooledTemporaryFile that you can get using the .file attribute. An example is given below. Note: pd.read_csv() isn't an async method; hence, if you are about to use an async def endpoint, it would be better to read the contents of the file using an async method, as described here, and then pass the contents to pd.read_csv() using one of the remaining options below. Alternatively, you can use Starlette's run_in_threadpool() (as described here), which will run pd.read_csv(file.file) in a separate thread to ensure that the main thread (where coroutines are run) does not get blocked.
from fastapi import FastAPI, File, UploadFile
import pandas as pd

app = FastAPI()

@app.post("/upload")
def upload_file(file: UploadFile = File(...)):
    # file.file is the underlying SpooledTemporaryFile, which pd.read_csv() can read directly
    df = pd.read_csv(file.file)
    file.file.close()
    return {"filename": file.filename}
Option 2
Convert the bytes into a string and then load it into an in-memory text buffer (i.e., StringIO), which can be converted into a DataFrame:
from fastapi import FastAPI, File, UploadFile
import pandas as pd
from io import StringIO

app = FastAPI()

@app.post("/upload")
def upload_file(file: UploadFile = File(...)):
    contents = file.file.read()   # raw bytes of the uploaded file
    s = str(contents, 'utf-8')    # decode the bytes into text
    data = StringIO(s)            # in-memory text buffer
    df = pd.read_csv(data)
    data.close()
    file.file.close()
    return {"filename": file.filename}
Option 3
Use an in-memory bytes buffer instead (i.e., BytesIO), thus saving you the step of converting the bytes into a string as shown in Option 2:
from fastapi import FastAPI, File, UploadFile
import pandas as pd
from io import BytesIO

app = FastAPI()

@app.post("/upload")
def upload_file(file: UploadFile = File(...)):
    contents = file.file.read()   # raw bytes of the uploaded file
    data = BytesIO(contents)      # in-memory bytes buffer; no decoding step needed
    df = pd.read_csv(data)
    data.close()
    file.file.close()
    return {"filename": file.filename}