Digging into the source code, I found that FastAPI throws the HTTP exception with status code 400 and detail "There was an error parsing the body" exactly once, when it is trying to figure out whether the request body should be read as a form or as JSON. FastAPI's Request is essentially the Starlette Request, so I reimplemented the FastAPI server as a Starlette application, hoping to bypass this exception handler and get more information about the issue.
main.py
from starlette.applications import Starlette
from starlette.responses import JSONResponse
from starlette.routing import Route


async def homepage(request):
    return JSONResponse({'hello': 'world'})


async def upload(request):
    form = await request.form()
    print(type(form['upload_file']))
    filename = form['upload_file'].filename or 'not found'
    contents = await form['upload_file'].read()
    b = len(contents) or -1
    return JSONResponse({
        'filename': filename,
        'bytes': b
    })


app = Starlette(debug=True, routes=[
    Route('/', homepage),
    Route('/api', upload, methods=['POST'])
])
Pipfile
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"
[packages]
starlette = "*"
uvicorn = "*"
uvloop = "*"
httpx = "*"
watchgod = "*"
python-multipart = "*"
[dev-packages]
[requires]
python_version = "3.9"
On posting a file of 989 MiB or larger, the Starlette application raises OSError [Errno 28], no space left on device. A file of 988 MiB or smaller causes no error.
INFO: 10.0.2.2:46996 - "POST /api HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/uvicorn/protocols/http/httptools_impl.py", line 398, in run_asgi
result = await app(self.scope, self.receive, self.send)
File "/usr/local/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 45, in __call__
return await self.app(scope, receive, send)
File "/usr/local/lib/python3.9/site-packages/starlette/applications.py", line 112, in __call__
await self.middleware_stack(scope, receive, send)
File "/usr/local/lib/python3.9/site-packages/starlette/middleware/errors.py", line 181, in __call__
raise exc from None
File "/usr/local/lib/python3.9/site-packages/starlette/middleware/errors.py", line 159, in __call__
await self.app(scope, receive, _send)
File "/usr/local/lib/python3.9/site-packages/starlette/exceptions.py", line 82, in __call__
raise exc from None
File "/usr/local/lib/python3.9/site-packages/starlette/exceptions.py", line 71, in __call__
await self.app(scope, receive, sender)
File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 580, in __call__
await route.handle(scope, receive, send)
File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 241, in handle
await self.app(scope, receive, send)
File "/usr/local/lib/python3.9/site-packages/starlette/routing.py", line 52, in app
response = await func(request)
File "/home/vagrant/star-file-server/./main.py", line 11, in upload
form = await request.form()
File "/usr/local/lib/python3.9/site-packages/starlette/requests.py", line 240, in form
self._form = await multipart_parser.parse()
File "/usr/local/lib/python3.9/site-packages/starlette/formparsers.py", line 231, in parse
await file.write(message_bytes)
File "/usr/local/lib/python3.9/site-packages/starlette/datastructures.py", line 445, in write
await run_in_threadpool(self.file.write, data)
File "/usr/local/lib/python3.9/site-packages/starlette/concurrency.py", line 40, in run_in_threadpool
return await loop.run_in_executor(None, func, *args)
File "/usr/lib64/python3.9/concurrent/futures/thread.py", line 52, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/lib64/python3.9/tempfile.py", line 755, in write
rv = file.write(s)
OSError: [Errno 28] No space left on device
Starlette's UploadFile data structure uses a SpooledTemporaryFile, which writes to the OS's temporary directory. My temporary directory is /tmp because I'm on Fedora 34 and I have not set any environment variables telling Python to use a different temporary directory.
[vagrant@fedora star-file-server]$ python
Python 3.9.5 (default, May 14 2021, 00:00:00)
[GCC 11.1.1 20210428 (Red Hat 11.1.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tempfile
>>> tempfile.gettempdir()
'/tmp'
[vagrant@fedora star-file-server]$ df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 974M 0 974M 0% /dev
tmpfs 989M 168K 989M 1% /dev/shm
tmpfs 396M 5.6M 390M 2% /run
/dev/sda1 40G 1.6G 36G 5% /
tmpfs 989M 0 989M 0% /tmp
tmpfs 198M 84K 198M 1% /run/user/1000
Starlette sets max_size for the SpooledTemporaryFile to 1 MiB. Per the Python tempfile documentation, that means the upload is buffered in memory only until it exceeds 1 MiB, at which point it is rolled over to a real file on disk. Give or take that 1 MiB, 989 MiB appears to be the correct hard boundary on the UploadFile size, because the SpooledTemporaryFile is bound by the storage available in the system's temporary directory: on this VM, a 989 MiB tmpfs mounted at /tmp.
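The rollover behavior can be sketched with tempfile directly; the 1 MiB threshold here mirrors Starlette's default (checking the private _rolled attribute is just for illustration):

```python
import tempfile

# Data is buffered in memory until the total written exceeds max_size,
# then the buffer is rolled over to a real file in the OS temp directory.
with tempfile.SpooledTemporaryFile(max_size=1024 * 1024) as f:
    f.write(b'x' * 1024)              # 1 KiB: still entirely in memory
    print(f._rolled)                  # False
    f.write(b'x' * (1024 * 1024))     # pushes the total past 1 MiB
    print(f._rolled)                  # True: now backed by a file on disk
```

So every large upload handled through UploadFile lands on whatever filesystem backs the temporary directory, regardless of where the file is ultimately saved.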
If I still want to use UploadFile, I can create an environment variable pointing to a device known to always have enough space available, even for the largest uploads.
export TMPDIR=/huge_storage_device
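Python reads TMPDIR when it first computes the default temporary directory and caches the result, so the variable must be set before the process starts (or the cache cleared). A quick sketch, using a freshly created directory to stand in for the hypothetical large storage device:

```python
import os
import tempfile

# Create a real directory to stand in for /huge_storage_device.
big = tempfile.mkdtemp(prefix='huge_storage_')
os.environ['TMPDIR'] = big

# tempfile caches the default directory after first use; clear the cache
# so the new TMPDIR is picked up (a fresh process would not need this).
tempfile.tempdir = None

print(tempfile.gettempdir() == big)  # True
```

With TMPDIR exported before uvicorn starts, every SpooledTemporaryFile behind UploadFile spills to the larger device instead of /tmp.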
The approach I prefer uses the request's stream, which avoids writing the file twice: once to a local temporary directory and again to its permanent location.
import os
import pathlib

import aiofiles
import fastapi as fast

# Root directory where uploads are stored permanently.
RESULTS_DIR = pathlib.Path(os.getenv('RESULTS_DIR', 'results'))

app = fast.FastAPI()


@app.post('/stream')
async def stream(
    request: fast.Request,
    filename: str,
    filedir: str = ''
):
    dest = RESULTS_DIR.joinpath(filedir, filename)
    dest.parent.mkdir(parents=True, exist_ok=True)
    async with aiofiles.open(dest, 'wb') as buffer:
        async for chunk in request.stream():
            await buffer.write(chunk)
    return {
        'loc': f'localhost:7070/{dest.parent.name}/{dest.name}'
    }
Using this approach, when I uploaded files (5M, 450M, and 988M, with two repeated measures each) to the server running on a Fedora VM with 2048 MiB of memory, the server never exhausted memory, never crashed, and latency dropped by about 40% on average (i.e. posting to /stream took about 60% of the time that posting to /api did).