We have a Node.js web server that makes some outgoing HTTP requests to an external API. It's running in Docker using Dokku.
After some time under load (30 req/s), these outgoing requests stop getting responses.

Here's a graph I made while testing with a constant request rate: (graph)
In the graph, incoming and outgoing are the numbers of concurrent requests, not the number of requests initiated. (It's hard to see, but both stay fairly constant at ~10 requests.)
Response time is for external requests only. You can clearly see that they suddenly start failing (hitting our 1000ms timeout).
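
For readers who want to reproduce this kind of measurement, here is a minimal sketch of one way to count concurrent requests and enforce a timeout in Node.js. It is not the code used to produce the graph. `InFlightCounter` and `withTimeout` are hypothetical helper names, and `withTimeout` only rejects the returned promise; it does not abort the underlying request.

```javascript
// Hypothetical helper: counts requests that have started but not yet finished.
class InFlightCounter {
  constructor() {
    this.current = 0; // requests in flight right now
    this.peak = 0;    // highest value of `current` seen so far
  }

  begin() {
    this.current += 1;
    if (this.current > this.peak) this.peak = this.current;
  }

  end() {
    this.current -= 1;
  }

  // Wrap any promise-returning call so the counter is updated
  // on both success and failure.
  async track(fn) {
    this.begin();
    try {
      return await fn();
    } finally {
      this.end();
    }
  }
}

// Reject if `promise` does not settle within `ms` milliseconds.
// The underlying operation keeps running; only the caller stops waiting.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

An outgoing call could then be wrapped as `outgoing.track(() => withTimeout(callApi(), 1000))`, where `outgoing` is an `InFlightCounter` and `callApi` stands in for whatever function performs the external request.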


The more req/s we send, the faster we run into this problem, so there must be some sort of limit that each request brings us closer to.


I used netstat -ant | tail -n +3 | wc -l on the host to get the number of open connections, but it was only ~450 (most of them TIME_WAIT). That shouldn't hit the socket limit. We aren't hitting any RAM or CPU limits, either.
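
A total count hides how the connections are distributed across TCP states. As a rough sketch, the output of `netstat -ant` could be broken down per state like this. `countTcpStates` is a hypothetical helper, the column layout assumed is the usual Linux one, and the sample rows are made up for illustration rather than taken from the affected host.

```javascript
// Count TCP connections per state from the text output of `netstat -ant`.
function countTcpStates(netstatOutput) {
  const counts = {};
  for (const line of netstatOutput.split('\n')) {
    const fields = line.trim().split(/\s+/);
    // Data rows start with the protocol (tcp or tcp6); skip header lines.
    if (!/^tcp6?$/.test(fields[0])) continue;
    // Assumed Linux layout:
    // Proto Recv-Q Send-Q Local-Address Foreign-Address State
    const state = fields[5];
    if (!state) continue;
    counts[state] = (counts[state] || 0) + 1;
  }
  return counts;
}

// Made-up sample rows for illustration only.
const sample = [
  'Active Internet connections (servers and established)',
  'Proto Recv-Q Send-Q Local Address           Foreign Address         State',
  'tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN',
  'tcp        0      0 10.0.0.5:41234          203.0.113.7:443         TIME_WAIT',
  'tcp        0      0 10.0.0.5:41236          203.0.113.7:443         TIME_WAIT',
  'tcp        0      0 10.0.0.5:41240          203.0.113.7:443         ESTABLISHED',
].join('\n');

console.log(countTcpStates(sample));
```

In practice the input would come from running `netstat -ant` and passing its standard output to the function.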


I also tried running the same app on the same machine outside Docker; the problem only happens in Docker.

jomo
  • Have you tried running it inside Docker but not inside Dokku? – blacklabelops Jul 16 '15 at 22:45
  • Define 'currently processed'. 1000ms is far too short for a request timeout. Try something sensible like ten seconds. – user207421 Jul 17 '15 at 03:28
  • @maybeg I haven't because the dokku guys told me they aren't touching any network things. I will try that later. – jomo Jul 17 '15 at 07:59
  • @EJP 1000ms is actually a lot for an external API request. The result isn't any different when I make it longer though. (And none of our users waits 10+ seconds for an HTTP request to finish) – jomo Jul 17 '15 at 07:59
  • @jomo Actually it isn't a lot. You're within range of the TCP retry timeouts. It's too short. – user207421 Jul 19 '15 at 07:48
  • @jomo - Any info on running it without Dokku? I'm working on a project using Docker now and would love to know if this is or isn't a docker issue. – blockcipher Jul 20 '15 at 13:18
  • Might be related to this question : http://stackoverflow.com/questions/30840817/docker-container-http-requests-limit – Regan Jul 23 '15 at 13:10
  • @Regan We indeed have local connections to redis (running in another container). I was also suspecting that it might be causing the issue, but the problem in docker persisted after removing redis. – jomo Jul 23 '15 at 14:56

1 Answer

It could be due to the Docker userland proxy. If you are running a recent version of Docker, try running the daemon with the --userland-proxy=false option. This will make Docker handle port forwarding with just iptables and there is less overhead.

Mark Duncan
  • I just tried this, it doesn't seem to make a difference. – jomo Jul 20 '15 at 01:11
  • @jomo, in that case, it could be a kernel configuration. Maybe this SO question/answer could give you something to look in to http://stackoverflow.com/questions/410616/increasing-the-maximum-number-of-tcp-ip-connections-in-linux – Mark Duncan Jul 21 '15 at 01:47
  • that's what we initially thought was the problem, but it wouldn't explain why it doesn't happen outside docker (where we actually had more connections than in docker) – jomo Jul 21 '15 at 08:14
  • Your answer didn't really solve my problem, but I rather give the bounty to *someone* instead of letting it disappear forever. Have funs :) – jomo Jul 23 '15 at 20:45
  • @jomo can you post output of 'sysctl net' in the container and out of the container? I'm pretty sure the network namespace can have different sysctl parameters for network stuff. – flumpb Jul 23 '15 at 20:53
  • @Mack [inside container](http://pastebin.com/raw.php?i=SS8EDSxT), [outside container](http://pastebin.com/raw.php?i=UBZQ7KJ2) – jomo Jul 23 '15 at 21:30
  • @jomo anything stick out in the logs or dmesg when the outbound connections start getting dropped? – flumpb Jul 24 '15 at 14:04
  • @jomo I'm wondering if this answer might help you: http://stackoverflow.com/questions/17033631/node-js-maxing-out-at-1000-concurrent-connections I'm starting to think it might be nodejs instead of Docker, but maybe something about Docker triggers it. – Mark Duncan Jul 24 '15 at 23:15
  • @MarkDuncan ulimit doesn't seem to be the issue, we tested that as well :| – jomo Jul 25 '15 at 01:38
  • @jomo I'm at a loss. If you figure this out please post solution. Best of luck. – flumpb Jul 26 '15 at 02:25
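
For the `sysctl net` comparison requested in the comments, the two outputs can be diffed mechanically rather than by eye. The following is a minimal sketch, assuming the usual `key = value` line format. `parseSysctl` and `diffSysctl` are hypothetical helpers, and the sample values are invented for illustration; they are not taken from the pastebin links above.

```javascript
// Parse `sysctl net` output ("key = value" per line) into a Map.
function parseSysctl(text) {
  const settings = new Map();
  for (const line of text.split('\n')) {
    const idx = line.indexOf('=');
    if (idx === -1) continue; // skip blank lines and lines without "="
    settings.set(line.slice(0, idx).trim(), line.slice(idx + 1).trim());
  }
  return settings;
}

// Return the keys whose values differ, or that exist on only one side.
function diffSysctl(insideText, outsideText) {
  const inside = parseSysctl(insideText);
  const outside = parseSysctl(outsideText);
  const keys = new Set([...inside.keys(), ...outside.keys()]);
  const diff = [];
  for (const key of [...keys].sort()) {
    const a = inside.get(key);
    const b = outside.get(key);
    if (a !== b) diff.push({ key, inside: a, outside: b });
  }
  return diff;
}

// Invented values for illustration only.
const insideSample = [
  'net.core.somaxconn = 128',
  'net.ipv4.tcp_fin_timeout = 60',
].join('\n');
const outsideSample = [
  'net.core.somaxconn = 1024',
  'net.ipv4.tcp_fin_timeout = 60',
  'net.ipv4.ip_local_port_range = 32768\t61000',
].join('\n');

console.log(diffSysctl(insideSample, outsideSample));
```

Keys that appear on only one side are reported with `undefined` for the missing value, which helps spot parameters that are not exposed inside the container's network namespace.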