
I am hosting a Ruby application, but it returns errors (some kind of 500 error) while processing more than 400 requests per second. A test (loader.io) at a lower request rate (below 400 requests/second) finishes with good results. I think I should be able to get good results at 500 requests/second and more.

The app runs on a t2.2xlarge EC2 instance (32 GB of memory, 8 virtual cores), which I would expect to give better performance. The machine runs Ubuntu 14.04, Rails 4.0.12, and Nginx with Passenger.

I have tried some changes to the Nginx configuration, but without much progress. My current configuration:

passenger_max_pool_size 60;
#passenger_pool_idle_time 20;
server {
  listen 80;
  return 301 https://mydomain.eu$request_uri;
}

server {
  listen 443;
  server_name ~^(\w+)\.mydomain.eu$;
  return 301 https://mydomain.eu$request_uri;
}
server {
  listen 443 ssl spdy default;
  server_name mydomain.eu;
  passenger_enabled on;
  #passenger_max_pool_size 12;
  passenger_max_request_queue_size 2000;
  gzip on;

  root /home/ubuntu/application/cversion/public;

  ssl                  on;
  ssl_certificate      /home/ubuntu/fvhsdvhfd35/ssl-bundle1.crt;
  ssl_certificate_key  /home/ubuntu/fvhsdvhfd35/prvt.key;
  ssl_session_timeout  5m;
  ssl_protocols        TLSv1 TLSv1.1 TLSv1.2 SSLv3;
  ssl_ciphers          "EECDH+ECDSA+AESGCM EECDH+aRSA+AESGCM EECDH+ECDSA+SHA384 EECDH+ECDSA+SHA256 EECDH+aRSA+SHA384 EECDH+aRSA+SHA256 EECDH+aRSA+RC4 EECDH EDH+aRSA RC4 !aNULL !eNULL !LOW !3DES !MD5 !EXP !PSK !SRP !DSS";
  ssl_prefer_server_ciphers  on;

  location = /favicon.png {
    expires    max;
    add_header Cache-Control public;
  }

  location = /ZeroClipboard.swf {
    expires    max;
    add_header Cache-Control public;
  }

  location ~ ^/(assets)/  {
    gzip_static on;
    expires     max;
    add_header  Cache-Control public;
  }

  # disable gzip on all omniauth paths to prevent BREACH
  location ~ ^/auth/ {
    gzip off;
    passenger_enabled on;
  }
}

Do you have any idea how to get more than 400 requests per second?


Here is the Nginx/Passenger error log while processing 500 requests/second (with passenger_max_pool_size 30; and passenger_max_request_queue_size 1200;):

2017/07/06 01:58:32 [error] 11749#11749: *56391 connect() to unix:/tmp/passenger.e1PiPXp/agents.s/core failed (11: Resource temporarily unavailable) while connecting to upstream, client: 54.89.44.6, server: mydomain.us, request: "GET / HTTP/1.1", upstream: "passenger:unix:/tmp/passenger.e1PiPXp/agents.s/core:", host: "mydomain.us"
2017/07/06 01:58:32 [error] 11740#11740: *64104 connect() to unix:/tmp/passenger.e1PiPXp/agents.s/core failed (11: Resource temporarily unavailable) while connecting to upstream, client: 52.87.219.148, server: mydomain.us, request: "GET / HTTP/1.1", upstream: "passenger:unix:/tmp/passenger.e1PiPXp/agents.s/core:", host: "mydomain.us"
2017/07/06 01:58:32 [error] 11749#11749: *64251 connect() to unix:/tmp/passenger.e1PiPXp/agents.s/core failed (11: Resource temporarily unavailable) while connecting to upstream, client: 52.87.219.148, server: mydomain.us, request: "GET / HTTP/1.1", upstream: "passenger:unix:/tmp/passenger.e1PiPXp/agents.s/core:", host: "mydomain.us"
2017/07/06 01:58:32 [error] 11749#11749: *63289 connect() to unix:/tmp/passenger.e1PiPXp/agents.s/core failed (11: Resource temporarily unavailable) while connecting to upstream, client: 54.89.44.6, server: mydomain.us, request: "GET / HTTP/1.1", upstream: "passenger:unix:/tmp/passenger.e1PiPXp/agents.s/core:", host: "mydomain.us"
2017/07/06 01:58:32 [error] 11748#11748: *67786 connect() to unix:/tmp/passenger.e1PiPXp/agents.s/core failed (11: Resource temporarily unavailable) while connecting to upstream, client: 52.86.198.91, server: mydomain.us, request: "GET / HTTP/1.1", upstream: "passenger:unix:/tmp/passenger.e1PiPXp/agents.s/core:", host: "mydomain.us"
2017/07/06 01:58:32 [error] 11748#11748: *35057 connect() to unix:/tmp/passenger.e1PiPXp/agents.s/core failed (11: Resource temporarily unavailable) while connecting to upstream, client: 52.86.198.91, server: mydomain.us, request: "GET / HTTP/1.1", upstream: "passenger:unix:/tmp/passenger.e1PiPXp/agents.s/core:", host: "mydomain.us"
2017/07/06 01:58:32 [error] 11748#11748: *35166 connect() to unix:/tmp/passenger.e1PiPXp/agents.s/core failed (11: Resource temporarily unavailable) while connecting to upstream, client: 52.86.198.91, server: mydomain.us, request: "GET / HTTP/1.1", upstream: "passenger:unix:/tmp/passenger.e1PiPXp/agents.s/core:", host: "mydomain.us"
2017/07/06 01:58:32 [error] 11744#11744: *43208 connect() to unix:/tmp/passenger.e1PiPXp/agents.s/core failed (11: Resource temporarily unavailable) while connecting to upstream, client: 52.86.198.91, server: mydomain.us, request: "GET / HTTP/1.1", upstream: "passenger:unix:/tmp/passenger.e1PiPXp/agents.s/core:", host: "mydomain.us"
2017/07/06 01:58:32 [error] 11744#11744: *69130 connect() to unix:/tmp/passenger.e1PiPXp/agents.s/core failed (11: Resource temporarily unavailable) while connecting to upstream, client: 54.162.105.71, server: mydomain.us, request: "GET / HTTP/1.1", upstream: "passenger:unix:/tmp/passenger.e1PiPXp/agents.s/core:", host: "mydomain.us"
[ 2017-07-06 01:58:34.3865 11703/7fc703fff700 Ser/AcceptLoadBalancer.h:150 ]: Resuming accepting new clients

UPDATE

I found a solution. The following Nginx and kernel configuration changes gave me 1000 requests/second.

First, I put "65536" into:

    /proc/sys/net/core/somaxconn
    /proc/sys/net/ipv4/tcp_max_syn_backlog

/etc/nginx/conf.d/m.conf:

    passenger_max_pool_size 90;
    passenger_socket_backlog 16384;

    #in server block
    #was listen 443 ssl spdy default;
    listen 443 ssl spdy default backlog=16384;
    passenger_max_request_queue_size 2300;
    ssl_session_cache shared:SSL:10m;

/etc/nginx/nginx.conf:

worker_rlimit_nofile 131072;

#in events block:
use epoll;
worker_connections 8192;

Another question

Average response time is about 6 seconds at 1000 requests per second during a 1-minute test. Any ideas how to improve the average response time at this request rate?


UPDATE 2

I changed my Nginx config to enable Nginx microcaching according to this blog, but got no better performance: 500 req/second gave me an average response time of 5.1 s, and about 900 req/sec gave 5.5 s. Without caching, however, I get 2.5 s at 500 requests and 5.6 s at 900 requests.

/etc/nginx/nginx.conf:

    ...
    http {
      ...
      proxy_cache_path /tmp/cache keys_zone=one:10m levels=1:2 inactive=600s max_size=100m;
      ...
    }

/etc/nginx/conf.d/m.conf:


passenger_max_pool_size 90;
#passenger_pool_idle_time 20;
passenger_socket_backlog 16384;
server {
  listen 80;
  return 301 https://mydomain.eu$request_uri;
}

server {
  listen 443;
  server_name ~^(\w+)\.mydomain.eu$;
  return 301 https://mydomain.eu$request_uri;
}
server {
  listen 443 ssl spdy default backlog=16384;
  server_name mydomain.eu;

  ssl                  on;
  ssl_certificate      /home/ubuntu/fvhsdvhfd35/ssl-bundle1.crt;
  ssl_certificate_key  /home/ubuntu/fvhsdvhfd35/prvt.key;
  ssl_session_timeout  5m;
  ssl_protocols        TLSv1 TLSv1.1 TLSv1.2 SSLv3;
  ssl_ciphers          "EECDH+ECDSA+AESGCM EECDH+aRSA+AESGCM EECDH+ECDSA+SHA384 EECDH+ECDSA+SHA256 EECDH+aRSA+SHA384 EECDH+aRSA+SHA256 EECDH+aRSA+RC4 EECDH EDH+aRSA RC4 !aNULL !eNULL !LOW !3DES !MD5 !EXP !PSK !SRP !DSS";
  ssl_prefer_server_ciphers  on;

  ssl_session_cache shared:SSL:10m;

  location / {

    proxy_http_version 1.1; # Always upgrade to HTTP/1.1
    proxy_set_header Connection ""; # Enable keepalives
    proxy_set_header Accept-Encoding ""; # Optimize encoding
    proxy_pass http://127.0.0.1:81/;

    proxy_cache one;
    proxy_cache_lock on;
    proxy_cache_valid 200 1s;
    proxy_cache_use_stale updating;
  }
}
server {

  listen 81;
  server_name mydomain.eu;
  passenger_enabled on;

  passenger_max_request_queue_size 2300;
  gzip on;

  root /home/ubuntu/application/cversion/public;


  location = /favicon.png {
    expires    max;
    add_header Cache-Control public;
  }

  location = /ZeroClipboard.swf {
    expires    max;
    add_header Cache-Control public;
  }

  location ~ ^/(assets)/  {
    gzip_static on;
    expires     max;
    add_header  Cache-Control public;
  }

  # disable gzip on all omniauth paths to prevent BREACH
  location ~ ^/auth/ {
    gzip off;
    passenger_enabled on;
  }

}
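
To verify whether the microcache is actually serving hits, one diagnostic (my own addition, not part of the original config) is to expose $upstream_cache_status inside the location / block of the caching server above:

    # inside "location / { ... }" of the caching server block:
    add_header X-Cache-Status $upstream_cache_status;

Responses can then be checked with curl -sI https://mydomain.eu/ | grep X-Cache-Status. A MISS on every request would explain why the cache gives no benefit; for example, Nginx will not cache responses that carry Set-Cookie or Cache-Control: private.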
John Rotenstein
Vitali Grabovski
  • This question is quite broad; you may consider adding more details, such as Rails application logs and EC2 stats (memory, disk I/O, etc.). In the description I see an essential misconfiguration (from a performance point of view): t2.* instances operate on CPU credits; see https://stackoverflow.com/questions/28984106/whats-is-cpu-credit-balance-in-ec2 for more details – Anatoly Jul 05 '17 at 19:44
  • What's in the Nginx logs? – Anatoly Jul 05 '17 at 22:10
  • I have added some logs. This is how it looks while 500 requests/second are being processed. So you are saying that a t2 instance is not the best choice for CPU-bound applications? And that I can't improve its performance with a better Nginx/Rails configuration? Thanks for your advice! – Vitali Grabovski Jul 05 '17 at 23:07
  • Okay, given the Nginx error details, and based on the [Passenger documentation](https://www.phusionpassenger.com/library/config/nginx/reference/#passenger_socket_backlog), I'd try increasing the number of worker connections: *events { worker_connections 4096; }* – Anatoly Jul 05 '17 at 23:26
  • To reduce CPU consumption (as a way to cope with unpredictable CPU performance), you can use [ssl_session_cache](http://nginx.org/en/docs/http/ngx_http_ssl_module.html#ssl_session_cache): *ssl_session_cache shared:SSL:10m;* – Anatoly Jul 05 '17 at 23:36
  • I already have exactly that number of worker connections: "events { worker_connections 4096; multi_accept on; }" Maybe double it? – Vitali Grabovski Jul 05 '17 at 23:42
  • Yes, even quadrupling it does no harm; your instance type has enough memory to keep thousands of connections. Given 8 cores, let's also double the number of Nginx workers. And *accept_mutex* should be set to off. – Anatoly Jul 05 '17 at 23:50
  • worker_connections was 4096, but passenger_socket_backlog was at its default size (2048). The Passenger documentation says that passenger_socket_backlog must be at least equal to it, so I made them both the same (4096). Same error. UPDATE: I then set worker_connections=1024 and passenger_socket_backlog=4096. Still got "11: Resource temporarily unavailable". – Vitali Grabovski Jul 06 '17 at 00:03
  • Ok, not enough details: _Same error_ isn't helpful; it can't be exactly the same. Whenever performance and load testing are involved, detailed logs (Nginx and application) are the key to resolving a problem. There is no point speculating about what might cause the issue until we see more data: the Nginx version, its full configuration, evidence that Nginx was restarted between tests, the Rails app logs, the exact error, etc., not just _some kind of 500_. – Anatoly Jul 06 '17 at 07:31
  • @Anatoly thanks for participation. Your suggestions were very useful. I got a solution described in my question above. Maybe you have ideas about another issue as well. – Vitali Grabovski Jul 07 '17 at 21:09
  • Now things are much clearer: the backend response time is 6000 ms under load, so the backlog [queue size has to match it](https://www.nginx.com/blog/tuning-nginx/) in order not to drop requests. As for how to improve response time, unfortunately that has nothing to do with Nginx. I'd recommend trying a tool such as New Relic to start collecting data for further analysis. Keep in mind that a quicker response time from the backend (or leveraging [short TTL caching](https://www.nginx.com/blog/benefits-of-microcaching-nginx/) on the Nginx side) makes the Nginx backlog tuning less important. – Anatoly Jul 08 '17 at 07:42
  • any updates on that? – Anatoly Jul 11 '17 at 07:11
  • @Anatoly I am figuring out how to enable Nginx microcaching for a Rails app. – Vitali Grabovski Jul 11 '17 at 18:59
  • Unless the application requires cookie-based authentication, the micro-caching technique is just a matter of Nginx configuration. – Anatoly Jul 11 '17 at 19:04
  • @Anatoly I tried to apply the advice from the article. There seems to be no boost at all; furthermore, it always gives about a 5 s average time at any request rate. – Vitali Grabovski Jul 13 '17 at 23:25
  • It's unlikely that Nginx spends 5 seconds serving a response from the cache; share your Nginx configuration so we can work out how to make it work. – Anatoly Jul 13 '17 at 23:27
  • @Anatoly my Nginx configuration is in UPDATE2 of my question above. – Vitali Grabovski Jul 14 '17 at 12:50
  • **add_header X-Cache-Status $upstream_cache_status;** helps check the cache status. _It does not cache responses with Cache-Control set to Private, No-Cache, or No-Store, or with Set-Cookie in the response header._ – Anatoly Jul 14 '17 at 18:25
  • @Anatoly Same average response time (still about 5 s). I've found an article about caching in Ruby applications; it seems some source-code changes are required. – Vitali Grabovski Jul 14 '17 at 20:16

2 Answers


For these optimizations, make sure you refer to the Nginx blog linked above, and pay attention to the response time of each request (minimise it as much as you can using Rails techniques). Also consider database optimizations, i.e. using the right indexes and the maximum number of concurrent database connections. This is a multi-level problem, and configuration must be tuned at each level for best performance. Good luck :)
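The "Rails techniques" point usually comes down to caching expensive work. As a hedged illustration (plain Ruby only; TinyCache is a made-up stand-in for this idea, not a Rails API), short-TTL query caching works like this:

```ruby
require "monitor"

# Minimal sketch of the low-level caching idea behind Rails'
# Rails.cache.fetch: memoize the result of an expensive query for a
# short TTL so repeated requests within the window skip the database.
# TinyCache is purely illustrative.
class TinyCache
  Entry = Struct.new(:value, :expires_at)

  def initialize
    @store = {}
    @lock = Monitor.new
  end

  # Return the cached value for +key+ if still fresh; otherwise run the
  # block, store its result for +ttl+ seconds, and return it.
  def fetch(key, ttl: 60)
    @lock.synchronize do
      entry = @store[key]
      return entry.value if entry && entry.expires_at > Time.now

      value = yield
      @store[key] = Entry.new(value, Time.now + ttl)
      value
    end
  end
end

cache = TinyCache.new
db_calls = 0
expensive = -> { db_calls += 1; [1, 2, 3] } # stand-in for a slow query

3.times { cache.fetch("top_posts") { expensive.call } }
# db_calls == 1: the "query" ran once; the other lookups hit the cache
```

In an actual Rails app you would reach for Rails.cache.fetch with an appropriate expires_in rather than hand-rolling a cache; the sketch only shows why a 1-second TTL can collapse hundreds of identical requests into one database hit.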

  • Thanks for your advice. I am new to Rails. Can you tell me more about Rails techniques? – Vitali Grabovski Jul 05 '17 at 19:08
  • You can look into [query caching](https://blog.isnorcreative.com/2008/08/10/optimizing-ruby-on-rails-database-query-performance.html) and consider caching views too. Are you getting 503s at more than 500 rps? And are you using New Relic to check whether there is any bottleneck in your application/database causing the issue when you test at 500 rps? – Akshay Chhikara Jul 07 '17 at 06:41

The Nginx configuration can be reviewed against the official performance tuning recommendations: https://www.nginx.com/blog/tuning-nginx/

Tushar Pal
  • Nginx can handle thousands of requests per second out of the box. What exactly do you recommend tuning? Is the Nginx configuration the bottleneck in this case? – Anatoly Jul 05 '17 at 20:10
  • If you are looking for an optimized conf, here it is: https://www.linode.com/docs/web-servers/nginx/configure-nginx-for-optimized-performance. Otherwise, follow the docs and try tuning all the timeout settings. – Tushar Pal Jul 06 '17 at 06:41