86

What exactly does passing threaded = True to app.run() do?

My application processes input from the user, and takes a bit of time to do so. During this time, the application is unable to handle other requests. I have tested my application with threaded=True and it allows me to handle multiple requests concurrently.

davidism
Harrison

2 Answers

114

As of Flask 1.0, the WSGI server included with Flask is run in threaded mode by default.

Prior to 1.0, or if you disable threading, the server runs in single-threaded mode and can handle only one request at a time. Any parallel requests have to wait until they can be handled, which can lead to issues if you try to contact your own server from within a request.

With threaded=True, each request is handled in a new thread. How many threads your server can run concurrently depends entirely on your OS and the limits it sets on the number of threads per process. The implementation uses the SocketServer.ThreadingMixIn class (socketserver.ThreadingMixIn in Python 3), which sets no limit on the number of threads it can spin up.
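The thread-per-request behaviour described above can be sketched with only the standard library. This is an illustration of the mixin mechanism, not Werkzeug's actual code: `wsgiref` stands in for the Flask development server, and the names `ThreadedWSGIServer`, `QuietHandler`, `make_app`, and `run_demo` are defined here purely for the demo. Each request waits on a two-party barrier, so the barrier only releases if both requests are in flight at the same time.

```python
import threading
import urllib.request
from socketserver import ThreadingMixIn
from wsgiref.simple_server import WSGIRequestHandler, WSGIServer, make_server


class ThreadedWSGIServer(ThreadingMixIn, WSGIServer):
    """Illustrative server: the mixin starts one new thread per request."""
    daemon_threads = True  # do not block interpreter exit on open requests


class QuietHandler(WSGIRequestHandler):
    def log_message(self, format, *args):
        pass  # silence per-request logging for the demo


def make_app(barrier):
    def app(environ, start_response):
        # The barrier releases only if two requests are being handled at once.
        try:
            barrier.wait(timeout=2)
            body = b"concurrent"
        except threading.BrokenBarrierError:
            body = b"serialized"
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [body]
    return app


def run_demo(server_class):
    """Send two simultaneous requests and return the sorted response bodies."""
    barrier = threading.Barrier(2)
    server = make_server("127.0.0.1", 0, make_app(barrier),
                         server_class=server_class, handler_class=QuietHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/"
    results = []

    def fetch():
        with urllib.request.urlopen(url, timeout=10) as resp:
            results.append(resp.read().decode())

    clients = [threading.Thread(target=fetch) for _ in range(2)]
    for client in clients:
        client.start()
    for client in clients:
        client.join()
    server.shutdown()
    server.server_close()
    return sorted(results)


if __name__ == "__main__":
    print("threaded:       ", run_demo(ThreadedWSGIServer))
    print("single-threaded:", run_demo(WSGIServer))
```

With the threaded class both requests should report "concurrent"; with the plain `WSGIServer` the first request times out at the barrier before the second is even accepted, so both should report "serialized".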

Note that the Flask server is designed for development only. It is not a production-ready server. Don't rely on it to run your site on the wider web. Use a proper WSGI server (like gunicorn or uWSGI) instead.

Ry-
Martijn Pieters
  • 3
    The only people that will be using my application are a select few people in my office. Is it okay to keep it in this state? – Harrison Aug 10 '16 at 14:58
  • 8
    @Harrison: then that's fine, unless those people are liable to try and hack or DDOS your machine. – Martijn Pieters Aug 10 '16 at 15:06
  • 1
    I can fully trust them. The likelihood that multiple people will use the application at the same time is relatively slim, so I think for now I will just keep it running on the Flask server. At what point do you think it would be a good decision to deploy using gunicorn? – Harrison Aug 10 '16 at 15:08
  • 3
    @Harrison: the moment you want to open it to the wider web, or you feel you need better control over how much resources you want the server to use. A dedicated WSGI server can control the amount of concurrency, as well as use multiple processes to distribute the load. – Martijn Pieters Aug 10 '16 at 15:10
  • Does uwsgi handle threading? Do I also have to tell app.run to use threaded=True, or will uwsgi alone know to run threaded? – Beqa Bukhradze Jun 11 '18 at 05:57
  • @BeqaBukhradze: see https://uwsgi-docs.readthedocs.io/en/latest/WSGIquickstart.html#adding-concurrency-and-monitoring – Martijn Pieters Jun 11 '18 at 12:02
  • Does anyone know if these production web servers (WSGI, gunicorn, ...) are needed if I am already running my Flask app (on the development web server) in Kubernetes? I control the resources and replicas of my Flask app pods at the k8s level through a Kubernetes Deployment. There is the default round-robin load balancing of Kubernetes Services. In my experience it seems _stable enough_. Would a WSGI/nginx Pod help, and how? – cryanbhu Sep 21 '19 at 10:14
  • 1
    @cryanbhu the point is that a malicious agent can exploit weaknesses in the Flask (werkzeug) HTTP implementation. The roundrobin forwarder is unlikely to protect you from this. Production quality WSGI servers do, they aim to be robust and secure. – Martijn Pieters Sep 21 '19 at 10:24
  • @MartijnPieters thanks.. so this means I should have Flask app as a container in my K8s Pod as well as WSGI server as another container in the same Pod I suppose.. What about nginx and Gunicorn? – cryanbhu Sep 29 '19 at 06:16
  • 1
    @cryanbhu: I'm sorry, I don't have enough context and experience with K8s to give advice here. – Martijn Pieters Sep 29 '19 at 11:39
  • So, unlike IIS server that spins up a process for each request, Flask always has only 1 process but multiple threads. Did I understand that right? – variable Nov 01 '19 at 05:36
  • 2
    @variable: no, that's not correct. Flask is a [WSGI framework](https://en.wikipedia.org/wiki/Web_Server_Gateway_Interface), and it is up to the WSGI server to determine how concurrency is handled. Flask does come with a [development server](https://flask.palletsprojects.com/en/1.1.x/cli/#run-the-development-server) for convenience, which uses a single process and threads, but that's not the only option and you really want to use a [proper production-level deployment](https://flask.palletsprojects.com/en/1.1.x/deploying/#deployment) when deploying anywhere else. – Martijn Pieters Nov 01 '19 at 10:45
10

How many requests will my application be able to handle concurrently with this statement?

This depends drastically on your application. Each new request gets its own thread, so the limit is however many threads your machine can handle. I don't see an option to limit the number of threads (like uwsgi offers in a production deployment).

What are the downsides to using this? If I'm not expecting more than a few requests concurrently, can I just continue to use this?

Switching from a single thread to multiple threads can lead to concurrency bugs. If you use this, be careful about how you handle global objects (see the g object in the documentation!) and state.
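As a rough illustration of the kind of care meant here, the sketch below guards a module-level counter with a lock so that simultaneous request threads cannot interleave their updates. It uses plain threads rather than Flask itself, and the names `hits`, `hits_lock`, `record_hit`, and `simulate_requests` are made up for the example. (Flask's `g` object is scoped to the application context of a single request, so it is not shared between requests the way a module-level global is.)

```python
import threading

hits = 0                       # module-level state shared by every request thread
hits_lock = threading.Lock()   # guards every read-modify-write of `hits`


def record_hit():
    """What a view function might call once per request."""
    global hits
    with hits_lock:            # only one thread updates the counter at a time
        hits += 1


def simulate_requests(n_threads=8, per_thread=10_000):
    """Stand-in for concurrent requests: each thread records many hits."""
    global hits
    hits = 0

    def worker():
        for _ in range(per_thread):
            record_hit()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return hits


if __name__ == "__main__":
    print(simulate_requests())
```

Without the lock, `hits += 1` is a read-modify-write sequence that two threads can interleave, which may silently lose updates. The same reasoning applies to any shared dictionary, cache, or connection object that a view function mutates.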

Alex
Paul Becotte
  • Okay, thanks. You definitely cleared this up for me. When I asked how many requests it can handle, I was wondering whether `threaded=True` just allows a hard-coded `x` number of requests to be handled concurrently. So it's determined by my machine? – Harrison Aug 10 '16 at 14:57
  • 1
    I never use the dev server, so my answer is not definitive, however, it does not appear to have any limit set... so infinite (depending on system resources). I run my apps using uWSGI, which DOES have a configurable thread limit. – Paul Becotte Aug 10 '16 at 17:26