We're running a Flask app exposing data stored in a database. It returns a lot of 503 errors. My understanding is that those are generated by Apache when the maximum number of concurrent threads is reached.
The root cause is most probably the app performing poorly, but at this stage we can't afford much more development time, so I'm looking for a cheap deployment config hack to mitigate the issue.
Data providers are sending data at a high rate. I believe their program gets a lot of 503s and simply catches them to retry until success. Data consumers use the app at a much lower rate, and I'd like them not to be bothered so much by those issues.
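For illustration, I imagine the providers' client doing something like this (pure speculation on my part; the endpoint and payload are made up):

import time
import requests

def send_record(record):
    # Hypothetical provider-side upload: catch 503s and retry until success.
    while True:
        resp = requests.post("https://example.org/data", json=record)  # made-up endpoint
        if resp.status_code != 503:
            resp.raise_for_status()  # fail loudly on anything other than 503 or success
            return
        time.sleep(1)  # brief back-off before retrying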
I'm thinking of limiting the number of concurrent accesses from the IP of each provider. They may get a lower throughput but they'd live with it as they already do, and it would make life easier for casual consumers.
I identified the mod_limitipconn module, which seems to be tailored for this:
mod_limitipconn [...] allows administrators to limit the number of simultaneous requests permitted from a single IP address.
I'd like to be sure I understand how it works and how the limits are set.
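Something like the following is what I have in mind (an untested sketch; MaxConnPerIP is the module's directive, the module requires ExtendedStatus On, and /data is a made-up path for the provider endpoint):

ExtendedStatus On
<IfModule mod_limitipconn.c>
    <Location /data>
        # At most 3 simultaneous requests per client IP on this path
        MaxConnPerIP 3
    </Location>
</IfModule>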
I always figured there was a maximum of 5 simultaneous connections due to the WSGI setting threads=5. But I read Processes and Threading in the mod_wsgi docs and I'm confused.
Considering the configuration below, are those assumptions correct?
1. Only one instance of the application is running at a time.
2. A maximum of 5 concurrent threads can be spawned.
3. When 5 requests are being handled and a sixth request arrives, the client gets a 503 (see the sketch after this list).
4. Limiting the number of simultaneous requests from IP x.x.x.x at the Apache level to 3 would ensure that only 3 of those 5 threads can be used by that IP, leaving 2 for the other IPs.
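To make assumption 3 concrete, here is my mental model in mod_wsgi terms (hedged: queue-timeout and listen-backlog are daemon-mode options I found in the mod_wsgi docs, and the values here are made up):

# With all 5 threads busy, extra requests queue on the daemon socket;
# Apache answers 503 once the queue times out or the backlog fills up.
WSGIDaemonProcess my_app threads=5 queue-timeout=45 listen-backlog=100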
Raising the number of threads in the WSGI config could help share the connection pool amongst clients by providing more granularity in the rate limits (e.g. limit each of 4 providers to 3 and keep 5 for everyone else, for a total of 17), but it would not improve overall performance, even if the server has idle cores, because the Python GIL prevents several threads from running at the same time.
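If CPU ever became the bottleneck, my understanding is that daemon-mode processes would sidestep the GIL, e.g. (a sketch, numbers arbitrary):

# 4 processes x 5 threads = 20 concurrent requests; the GIL only
# serializes threads within a process, not across processes.
WSGIDaemonProcess my_app processes=4 threads=5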
Raising the number of threads to a high number like 100 may make requests slower but would limit 503 responses. It might even be enough on its own if the clients keep their own concurrency limits reasonably low; if they don't, I can enforce that with something like mod_limitipconn.
Raising the number of threads too much would make requests so slow that the clients would get timeouts instead of 503s, which is not really better.
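So the cheap mitigation I'm considering boils down to something like this (an untested sketch; values would need tuning, and /data is still my made-up provider path):

# Plenty of threads so casual consumers rarely see a 503...
WSGIDaemonProcess my_app threads=100
# ...while a per-IP cap keeps any single provider from hogging them all.
<Location /data>
    MaxConnPerIP 3
</Location>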
Current config below; I'm not sure which parts matter.
apachectl -V:
Server version: Apache/2.4.25 (Debian)
Server built: 2018-06-02T08:01:13
Server's Module Magic Number: 20120211:68
Server loaded: APR 1.5.2, APR-UTIL 1.5.4
Compiled using: APR 1.5.2, APR-UTIL 1.5.4
Architecture: 64-bit
Server MPM: event
threaded: yes (fixed thread count)
forked: yes (variable process count)
/etc/apache2/apache2.conf:
# KeepAlive: Whether or not to allow persistent connections (more than
# one request per connection). Set to "Off" to deactivate.
#
KeepAlive On
#
# MaxKeepAliveRequests: The maximum number of requests to allow
# during a persistent connection. Set to 0 to allow an unlimited amount.
# We recommend you leave this number high, for maximum performance.
#
MaxKeepAliveRequests 100
/etc/apache2/mods-available/mpm_worker.conf (but that shouldn't matter with the event MPM anymore, right?):
<IfModule mpm_worker_module>
StartServers 2
MinSpareThreads 25
MaxSpareThreads 75
ThreadLimit 64
ThreadsPerChild 25
MaxRequestWorkers 150
MaxConnectionsPerChild 0
</IfModule>
/etc/apache2/sites-available/my_app.conf:
WSGIDaemonProcess my_app threads=5