
I'm trying to get an understanding of the best way of handling file uploads safely in a wsgi app. It seems a lot of solutions involve using FieldStorage from the cgi module to parse form data. From what I understand about FieldStorage it performs a bit of 'magic' behind the scenes by streaming data into a tempfile.

What I'm not 100% clear on is how to reject a request containing a file larger than a specified size (say 10MB). If someone uploads a file which is several GB in size, you obviously want to block the request before it chews through your server's disk space, right?

What is the best way to restrict file uploads in a wsgi application?

Will
  • This question is tagged `wsgi`. Are you using the WSGI interface directly? If so, then it should be a piece of cake to control the amount of data you read from `environ['wsgi.input']` (and check the `Content-Length` header). – André Caron Jan 23 '12 at 17:55

2 Answers


It depends on your front-end server. If it has a configuration option to block large requests before they even reach your app, use it.

If you want to block this with your code I see two approaches:

  • Look at the Content-Length HTTP header. If it's bigger than you can handle, deny the request right away.
  • Don't trust the headers and start reading the request body until you reach your limit. Note that this is not a very clever way, but it could work. =)
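The first approach could be sketched as a small WSGI middleware. This is only a rough illustration: `limit_content_length` and `MAX_BODY_BYTES` are made-up names, the 10 MB limit comes from the question, and a missing Content-Length is treated as an empty body, which may not match how your server handles chunked uploads.

```python
MAX_BODY_BYTES = 10 * 1024 * 1024  # 10 MB, the limit mentioned in the question


def limit_content_length(app, max_bytes=MAX_BODY_BYTES):
    """Wrap a WSGI app and reject requests whose declared body is too large."""

    def middleware(environ, start_response):
        # WSGI exposes the Content-Length header as environ["CONTENT_LENGTH"];
        # it may be absent or an empty string. Assumption: treat that as 0.
        try:
            declared = int(environ.get("CONTENT_LENGTH") or 0)
        except ValueError:
            declared = -1  # not a number at all

        if declared < 0:
            status = "400 Bad Request"
        elif declared > max_bytes:
            status = "413 Request Entity Too Large"
        else:
            # Declared size is acceptable, hand the request to the real app.
            return app(environ, start_response)

        body = b"Request body rejected.\n"
        start_response(status, [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    return middleware
```

You would wrap your application once at startup, e.g. `application = limit_content_length(application)`. Note this only looks at the declared size, so it does nothing about a client that lies in the header.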

Trusting the HTTP header can cause problems. Suppose someone sends a request with Content-Length: 1024 but a 1GB request body. If your front-end server trusts the header, it will start reading the request and only find out later that the body is actually much bigger than declared. This could still fill your server's disk, even though the request "passed" the "too big" check.
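If you want to guard against a body that is longer than what the header claims, one option is to count the bytes as they are read. The sketch below wraps `environ['wsgi.input']` in a file-like object that raises once more than the allowed number of bytes has come through. `LimitedInput` and `BodyTooLarge` are hypothetical names, and it assumes whatever parses the form only uses `read`, `readline`, and iteration on the stream.

```python
class BodyTooLarge(Exception):
    """Raised when more than the allowed number of bytes has been read."""


class LimitedInput:
    """File-like wrapper that caps the total bytes read from a stream."""

    def __init__(self, stream, max_bytes):
        self._stream = stream
        self._remaining = max_bytes

    def _capped(self, size):
        # Once the limit has been exceeded, keep refusing further reads.
        if self._remaining < 0:
            raise BodyTooLarge("request body exceeds the configured limit")
        # Never ask the underlying stream for more than one byte past the
        # limit; that extra byte is enough to detect an oversized body.
        cap = self._remaining + 1
        if size is None or size < 0 or size > cap:
            return cap
        return size

    def _account(self, data):
        self._remaining -= len(data)
        if self._remaining < 0:
            raise BodyTooLarge("request body exceeds the configured limit")
        return data

    def read(self, size=-1):
        return self._account(self._stream.read(self._capped(size)))

    def readline(self, size=-1):
        return self._account(self._stream.readline(self._capped(size)))

    def __iter__(self):
        while True:
            line = self.readline()
            if not line:
                return
            yield line
```

In the app you would do something like `environ['wsgi.input'] = LimitedInput(environ['wsgi.input'], 10 * 1024 * 1024)` before handing `environ` to the form parser, catch `BodyTooLarge`, and answer with a 413. At most the limit plus one byte is ever pulled from the underlying stream, so an oversized upload is cut off early instead of filling a tempfile.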

Although this could happen, I think trusting the header is a good starting point.

Dalton Barreto
  • Shouldn't the HTTP server handle the wrong `Content-Length` situation? I thought HTTP/1.1 compliant implementations have to refuse extra content. Besides, they have to do this if they want to implement HTTP keep alive (otherwise there would be no way to know when the next request starts). – André Caron Jan 23 '12 at 17:56
  • You're right! But anyway, it could be tricky to find out when a request is actually bigger than the value in the Content-Length header. The server would have to read the request until it reaches Content-Length + 1 bytes, but maybe there is a smarter way to do this. – Dalton Barreto Jan 23 '12 at 18:00
  • The server is already reading the request since it needs to extract headers and forward the request body from the socket to the WSGI application. Also, the server can't claim to be HTTP/1.1 compliant if it doesn't do the check. – André Caron Jan 23 '12 at 18:11

You could use the features of the HTTP server you probably have in front of your WSGI application. For example lighttpd has many options for traffic shaping.