I built a Node script that loops over 10k+ records and makes one HTTP request per record. It works well as long as I set `maxSockets` to 20. However, the script starts slowing down around row 2500 and eventually just stops. No errors are reported, and memory and CPU usage don't spike in Activity Monitor. I rewrote the same script in Python and it works fine but is much slower. Any ideas?

Here is the basic idea of the script:

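var fs = require('fs'),
    https = require('https'),
    parse = require('csv-parse'),
    input = fs.readFileSync('./csv.csv');

// Cap the number of sockets the global HTTPS agent opens per host.
https.globalAgent.maxSockets = 20;

parse(input, {}, (e, o) => {
  if (e) throw e;
  // One request is started per parsed row.
  o.forEach((l, i) => {
    var req = https.request({
      host: 'some.host.com',
      path: `/some/path/${l[4]}`,
      port: 443,
      method: 'GET'
    }, (res) => {
      if (res.statusCode === 200) {
        // Record the row; appendFile needs a callback in current Node versions.
        fs.appendFile('./results.csv', `${l}\n`, (err) => {
          if (err) console.error(err);
        });
      }
    });

    req.end();
  });
});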
Joe
  • Might be because of the garbage collector. Set a lower interval between cycles. – georoot Mar 03 '17 at 17:29
  • Remember to use the `--expose-gc` flag in node. – georoot Mar 03 '17 at 17:30
  • @georoot `--expose-gc` doesn't reveal any new information. I attached a couple of handlers to `process.on('uncaughtException')...` and `SIGTERM` but it looks like it is exiting with `0`. If I drop the `maxSockets` to 1, it will only process two rows before exiting as opposed to ~2500 with `maxSockets` at 20. – Joe Mar 03 '17 at 17:38
  • Also, anything higher than 20 on `maxSockets` will blow up with `Error: getaddrinfo ENOTFOUND`. – Joe Mar 03 '17 at 17:40
  • How about testing how many sockets are available before calling `http.request()`, and waiting if there are none available? Or does node automatically take care of this? – Lorenz Meyer Mar 03 '17 at 17:43
  • Do you realize that you are trying to launch 10,000 simultaneous http requests? `https.request()` is non-blocking so your `.forEach()` loop will try to run to completion starting every single http request before a single http request gets a chance to finish. You will also probably overwhelm the server. – jfriend00 Mar 03 '17 at 18:24
  • See [How to run thousands of http requests from nodejs](http://stackoverflow.com/questions/39141614/run-1000-requests-so-that-only-10-runs-at-a-time/39154813#39154813). – jfriend00 Mar 03 '17 at 18:31
  • @jfriend00 Ah. I thought that setting `maxSockets` would control the number of simultaneous requests. I will give one of these solutions a try. – Joe Mar 03 '17 at 19:18
  • `maxSockets` will control how many requests can actually be done at once, but it won't control how many your code is trying to do at once. – jfriend00 Mar 03 '17 at 19:27
  • Possible duplicate of [Run 1000 requests so that only 10 runs at a time](http://stackoverflow.com/questions/39141614/run-1000-requests-so-that-only-10-runs-at-a-time) – Lorenz Meyer Mar 03 '17 at 21:46
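
The comments above come down to one point: `maxSockets` caps open connections, but the `.forEach()` loop still creates every request up front. Below is a minimal sketch of one way to cap the number of requests in flight; it is not necessarily the approach taken in the linked answer. `runWithLimit`, `checkRecord`, and `processRecords` are hypothetical names introduced for illustration, the host and path are the placeholders from the question, and the limit of 20 simply mirrors the `maxSockets` value. The sketch assumes `limit` is at least 1.

```javascript
const https = require('https');
const fs = require('fs');

// Hypothetical helper (not part of Node's API): calls worker(item, index)
// for every item while keeping at most `limit` calls in flight.
// Resolves with one { ok, value } or { ok, error } entry per item.
function runWithLimit(items, limit, worker) {
  return new Promise((resolve) => {
    let next = 0;   // index of the next item to start
    let active = 0; // number of workers currently running
    const results = new Array(items.length);

    function launch() {
      if (next >= items.length && active === 0) return resolve(results);
      while (active < limit && next < items.length) {
        const i = next++;
        active++;
        Promise.resolve()
          .then(() => worker(items[i], i))
          .then(
            (value) => { results[i] = { ok: true, value }; },
            (error) => { results[i] = { ok: false, error }; }
          )
          .then(() => { active--; launch(); }); // a slot freed up, start more
      }
    }
    launch();
  });
}

// Hypothetical worker: resolves with the HTTP status code for one row.
function checkRecord(row) {
  return new Promise((resolve, reject) => {
    const req = https.request({
      host: 'some.host.com',          // placeholder from the question
      path: `/some/path/${row[4]}`,
      port: 443,
      method: 'GET'
    }, (res) => {
      res.resume(); // discard the body so the socket can be reused
      res.on('end', () => resolve(res.statusCode));
    });
    req.on('error', reject);
    req.end();
  });
}

// Hypothetical driver: at most 20 rows are being checked at any moment.
async function processRecords(rows) {
  const results = await runWithLimit(rows, 20, async (row) => {
    const status = await checkRecord(row);
    if (status === 200) {
      await fs.promises.appendFile('./results.csv', `${row}\n`);
    }
    return status;
  });
  const failed = results.filter((r) => !r.ok).length;
  console.log(`done: ${results.length} rows, ${failed} failed`);
}
```

Calling `processRecords(o)` from inside the `parse` callback would take the place of the `.forEach()` loop. Failures are recorded per row rather than thrown, so a single failed lookup does not stop the run.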

0 Answers