We have a big EC2 instance with 32 cores, currently running Nginx, Tornado and Redis, serving on average 5K requests per second. Everything seems to work fine, but CPU load is already reaching 70% and we have to support even more requests. One thought was to replace Tornado with uWSGI, because we don't really use Tornado's async features.
Our application consists of a single function: it receives a JSON payload (~4 KB), does some blocking but very fast work (Redis), and returns JSON. The full request path is:
- Proxy HTTP request to one of the Tornado instances (Nginx)
- Parse HTTP request (Tornado)
- Read POST body string (stringified JSON) and convert it to python dictionary (Tornado)
- Take data out of Redis (blocking) located on same machine (py-redis with hiredis)
- Process the data (python3.4)
- Update Redis on same machine (py-redis with hiredis)
- Prepare stringified JSON for response (python3.4)
- Send response to proxy (Tornado)
- Send response to client (Nginx)
We thought the speed improvement would come from the uwsgi protocol: we could install Nginx on a separate server and proxy all requests to uWSGI over the uwsgi protocol. But after trying every possible configuration and changing OS parameters, we still can't get it to handle even the current load. Most of the time the nginx log contains 499 and 502 errors. In some configurations it simply stopped accepting new requests, as if it had hit some OS limit.
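For reference, the nginx side of that attempt looked roughly like this (a sketch, not our exact config; the upstream address is a placeholder):

```nginx
upstream uwsgi_backend {
    # uWSGI on the (planned) separate app server, speaking the
    # binary uwsgi protocol rather than HTTP.
    server 10.0.0.2:3031;
}

server {
    listen 80;
    location / {
        include uwsgi_params;
        uwsgi_pass uwsgi_backend;
        # 499 = the client gave up waiting; 502 = the backend
        # refused or dropped the connection (full listen queue,
        # no free worker, etc.).
    }
}
```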
So, as I said, we have 32 cores, 60 GB of free memory and a very fast network. We don't do heavy work, only very fast blocking operations. What is the best strategy in this case: processes, threads, async? What OS parameters should be set?
Current configuration is:
[uwsgi]
master = 2
processes = 100
socket = /tmp/uwsgi.sock
wsgi-file = app.py
daemonize = /dev/null
pidfile = /tmp/uwsgi.pid
listen = 64000
stats = /tmp/stats.socket
cpu-affinity = 1
max-fd = 20000
memory-report = 1
gevent = 1000
thunder-lock = 1
threads = 100
post-buffering = 1
Nginx config:
user www-data;
worker_processes 10;
pid /run/nginx.pid;
events {
    worker_connections 1024;
    multi_accept on;
    use epoll;
}
OS config:
sysctl net.core.somaxconn
net.core.somaxconn = 64000
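Related knobs have to agree with `somaxconn`, or the smallest limit silently wins; note also that uWSGI refuses to use a `listen` value larger than `net.core.somaxconn`. A sketch of the settings we looked at (values illustrative, not a recommendation):

```
# /etc/sysctl.conf -- apply with `sysctl -p`
net.core.somaxconn = 4096           # kernel cap on any socket's accept backlog
net.core.netdev_max_backlog = 4096  # packets queued per NIC before drops
net.ipv4.tcp_max_syn_backlog = 4096 # half-open connection queue
fs.file-max = 200000                # must cover max-fd across all workers
```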
I know some of these limits are too high; at some point I started trying every possible value.
UPDATE:
I ended up with the following configuration:
[uwsgi]
chdir = %d
master = 1
processes = %k
socket = /tmp/%c.sock
wsgi-file = app.py
lazy-apps = 1
touch-chain-reload = %dreload
virtualenv = %d.env
daemonize = /dev/null
pidfile = /tmp/%c.pid
listen = 40000
stats = /tmp/stats-%c.socket
cpu-affinity = 1
max-fd = 200000
memory-report = 1
post-buffering = 1
threads = 2
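For anyone puzzled by the `%` placeholders: these are uWSGI "magic variables" — `%d` expands to the directory containing the config file, `%c` to the config file name without its extension, and `%k` to the number of detected CPU cores (so `processes = %k` gives one worker per core). `touch-chain-reload = %dreload` means workers are restarted one at a time, gracefully, whenever that file is touched:

```shell
# Run from the app directory (%d): graceful rolling restart.
# Workers are replaced one by one, so requests keep being served.
touch reload
```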