
While performance testing my Node.js socket.io app, it seems unable to handle the desired number of concurrent WebSocket connections.

I am testing the application in a Docker environment with the following specs:

CPUs: 2, RAM: 4 GB

The application is stripped down to a bare minimum that only accepts WebSocket connections, using socket.io + express.js.
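For reference, a minimal sketch of what such a stripped-down server could look like (an assumption, since the actual app code isn't shown here; it only uses express, socket.io, and port 5000 from the artillery target below):

```javascript
// Sketch of the server under test (assumed shape, not the actual code).
// Requires the express and socket.io npm packages.
const express = require('express');
const http = require('http');
const socketio = require('socket.io');

const app = express();
const server = http.createServer(app);
const io = socketio(server);

io.on('connection', (socket) => {
  // Echo back the "echo" events that the test scenario emits.
  socket.on('echo', (data) => socket.emit('echo', data));
});

// Port 5000 matches the artillery target http://127.0.0.1:5000
server.listen(5000);
```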

I perform the tests with the help of artillery.io; the test scenario is:

config:
  target: "http://127.0.0.1:5000"
  phases:
    - duration: 100
      arrivalRate: 20
scenarios:
    - engine: "socketio"
      flow:
      - emit:
          channel: "echo"
          data: "hello"        
      - think: 50

Report:

Summary report @ 16:54:31(+0200) 2018-07-30
  Scenarios launched:  2000
  Scenarios completed: 101
  Requests completed:  560
  RPS sent: 6.4
  Request latency:
    min: 0.1
    max: 3
    median: 0.2
    p95: 0.5
    p99: 1.4
  Scenario counts:
    0: 2000 (100%)
  Codes:
    0: 560
  Errors:
    Error: xhr poll error: 1070
    timeout: 829
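The numbers in the report are internally consistent: the phase schedules arrivalRate × duration virtual users, and every launched scenario either completed or ended in one of the two errors. A quick sanity check:

```javascript
// Sanity check on the artillery report above.
const arrivalRate = 20;
const duration = 100;                  // seconds
const launched = arrivalRate * duration;
console.log(launched);                 // 2000, matching "Scenarios launched"

const completed = 101;
const xhrPollErrors = 1070;
const timeouts = 829;
// Every virtual user is accounted for: completed + failed = launched.
console.log(completed + xhrPollErrors + timeouts === launched); // true
```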

So I get a lot of xhr poll errors. While monitoring CPU and memory stats, the highest value for the CPU is only 43.25%, and memory never exceeds 4%.

Even when I alter my test to an arrival rate of 20 over a timespan of 100 seconds, I still get XHR poll errors.

So are these test numbers beyond the capability of Node.js + socket.io with these specs, or is something else not working as expected? Perhaps the Docker environment or the Artillery software?

Any help or suggestions would be appreciated!

Side note: I already looked into Node.js clustering for scaling, but I'd like to get the most out of one process first.

Update 1

After some more testing with a WebSocket stress test script found here: https://gist.github.com/redism/11283852, it seems I hit some sort of limit when I use an arrival rate higher than 50 or want to establish more than roughly 1900 connections.

Up to 1900 connections almost every connection gets established, but beyond that number the XHR poll errors grow exponentially.

Still no high CPU or memory values for the Docker containers.

The XHR poll error in detail:

Error: xhr poll error
at XHR.Transport.onError (D:\xxx\xxx\api\node_modules\engine.io-client\lib\transport.js:64:13)
at Request.<anonymous> (D:\xxx\xxx\api\node_modules\engine.io-client\lib\transports\polling-xhr.js:128:10)
at Request.Emitter.emit (D:\xxx\xxx\api\node_modules\component-emitter\index.js:133:20)
at Request.onError (D:\xxx\xxx\api\node_modules\engine.io-client\lib\transports\polling-xhr.js:309:8)
at Timeout._onTimeout (D:\xxx\xxx\api\node_modules\engine.io-client\lib\transports\polling-xhr.js:256:18)
at ontimeout (timers.js:475:11)
at tryOnTimeout (timers.js:310:5)
at Timer.listOnTimeout (timers.js:270:5)
type: 'TransportError', description: 503
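Since CPU and memory stay low while polling requests fail with a 503, an OS-level connection limit inside the container is a plausible suspect. A few commands (assuming a Linux base image in the container) to inspect the usual caps:

```shell
# Inspect common per-process and kernel limits that can cap concurrent
# sockets long before CPU or memory become the bottleneck.
ulimit -n                                    # open file descriptors per process
cat /proc/sys/net/core/somaxconn             # max pending connections in the accept queue
cat /proc/sys/net/ipv4/ip_local_port_range   # ephemeral port range for outbound connections
cat /proc/net/sockstat                       # sockets currently in use (run during the test)
```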

Update 2

Changing the transport to "websocket" in the artillery test gives noticeably better performance.

Testcase:

config:
  target: "http://127.0.0.1:5000"
  socketio:
    transports: ["websocket"]
  phases:
    - duration: 20
      arrivalRate: 200
scenarios:
    - engine: "socketio"
      flow:
      - emit:
          channel: "echo"
          data: "hello"        
      - think: 50

Results: the arrival rate is no longer the issue, but I hit some kind of limit at 2020 connections. After that it gives a "Websocket error".

So is this a limit on Windows 10, and can you change it? Is this limit also the reason why the tests with long-polling perform so badly?
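One thing worth checking on the Windows 10 client side (an assumption, not a confirmed cause) is the TCP dynamic port range, since every artillery connection consumes an ephemeral client port:

```shell
# Windows commands (run in an elevated prompt). Show the current dynamic
# (ephemeral) TCP port range, then widen it if it turns out to be small.
netsh int ipv4 show dynamicport tcp
netsh int ipv4 set dynamicport tcp start=10000 num=55535
```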

Jeroen
  • We use socket.io in HarperDB for our clustering mechanism and our throughput is A LOT higher than that. I saw that error a lot when I had misconfigured my response. I found socket.io very hard to debug as we were running the client and server in the same application, but that error can occur from a bad handshake. – sgoldberg Jul 30 '18 at 19:43
  • At the moment the only socket.io related code I have is "io.on('connection', function...." so I only accept socket connections, nothing else. Any suggestions? – Jeroen Jul 31 '18 at 06:51
  • I think it's maybe some throttle with Docker. You can check the limit settings for your Docker container. I ran artillery with the settings duration: 100, arrivalRate: 50 on the same Mac Pro machine and everything works normally. – Dat Tran Aug 02 '18 at 09:56
  • Resources are limited to 2 cores & 4 GB RAM; where can you set some kind of throttle? Are you using Docker when doing this test? With the same resources? – Jeroen Aug 02 '18 at 10:02
  • Can you try with the websocket transport only from the client side? It seems like an issue with the network or Docker, because for plain WebSockets the number should be much higher. – Rohit Harkhani Aug 07 '18 at 05:17
  • You could try using netstat on your Docker instance to see how many active sockets you have during your test. My guess would be you're hitting a default OS limit on the number of concurrent open sockets on the OS running within your Docker instance... – moilejter Aug 08 '18 at 05:57
  • @RohitHarkhani So when I change the transport to "websocket" on the client side, the test does work much better. The arrival rate is no longer the issue, but I hit some kind of limit at 2020 connections (see update 2 in the question). Also, I really prefer using long-polling in production. – Jeroen Aug 08 '18 at 08:40
  • @moilejter Which limit am I hitting? The related ones that I could find (ulimit & max socket connections per port) are unlimited and high enough. – Jeroen Aug 08 '18 at 08:43
  • If 2020 is the limit, what is the behaviour? I have also used artillery, so I know artillery is heavy; to separate client and server limits I suggest hitting it from multiple artillery instances. To explain in detail: I suggested the websocket approach because I thought it was falling back to polling because of an artillery issue. – Rohit Harkhani Aug 08 '18 at 09:09
  • When using one instance the connections go up to 2020 very quickly; after that I get "websocket error" in artillery. When splitting the load over 4 instances (on the same client) the performance gets worse (instant websocket errors on all the instances). – Jeroen Aug 08 '18 at 09:29

0 Answers