
I need to perform a million API calls in under an hour (the server can handle this much traffic). To do this, I'm using Node to run multiple requests in parallel, but when I try to run ~1000 concurrent requests I keep getting these errors:

EAI_AGAIN

{ [Error: getaddrinfo EAI_AGAIN google.com:80]
  code: 'EAI_AGAIN',
  errno: 'EAI_AGAIN',
  syscall: 'getaddrinfo',
  hostname: 'google.com',
  host: 'google.com',
  port: 80 }

ECONNRESET

{
    [Error: read ECONNRESET] code: 'ECONNRESET',
    errno: 'ECONNRESET',
    syscall: 'read'
}

How can I prevent this from happening without reducing the concurrency back to 500?

Here's a code sample that works perfectly when running 500 requests at a time, but fails when the limit is over 1000 (your limit may be different).

"use strict";
const http = require('http');
http.globalAgent.maxSockets = Infinity;
const async = require('async');
const request = require("request");

let success = 0;
let error = 0;

function iterateAsync() {
    let rows = [];
    for (let i = 0; i < 500; i++) {
        rows.push(i);
    }

    console.time("Requests");
    async.each(
        rows,
        (item, callback) => get(callback),
        (err) => {
            console.log("Failed: " + error);
            console.log("Success: " + success);
            console.timeEnd("Requests");
        });
}


function get(callback) {

    request("http://example.com", (err, response, body) => {
        if (err) {
            console.log(err);
            error++;
            return callback();
        }

        success++;
        callback();
    });
}

iterateAsync();

I added http.globalAgent.maxSockets = Infinity; even though it is the default value.

Screenshots of the timing output for 500 requests and for 1000 requests (images omitted).

Additional info:

I'm running the tests on Ubuntu 14.04 & 15.04, and I modified the default values of file-max, tcp_fin_timeout, tcp_tw_reuse and ip_local_port_range in /etc/sysctl.conf as described in this post (a sketch of the resulting entries follows the list):

  • file-max: 100000
  • tcp_fin_timeout: 15
  • tcp_tw_reuse: 1
  • ip_local_port_range: 10000 65000
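
A rough sketch of those settings as /etc/sysctl.conf entries (assuming the standard fs.file-max and net.ipv4.* key names; adjust to your kernel if they differ):

fs.file-max = 100000
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 10000 65000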

I added the following lines in /etc/security/limits.conf:

*     soft    nofile          100000
*     hard    nofile          100000

With these values, I'm still getting the errors.

I've read all the other similar posts on this topic.

Is there a way I can know the exact number of concurrent requests that my system can manage?

  • *The server can handle this much traffic.* Are you affiliated with Google in some way, or are you just assuming that? – Frédéric Hamidi Apr 10 '16 at 18:07
  • I'm not making calls to google... The sample is just an example so you can run it. And I have made 1M requests in an hour, but from three different servers, since I can't get more requests from one server without getting this error. – Marcos Casagrande Apr 10 '16 at 18:08
  • So you want us to DDoS Google instead? More seriously, I'm more worried about the connection timeout than about the other error -- something seems to be dropping connections on the floor (or blacklisting your client). – Frédéric Hamidi Apr 10 '16 at 18:09
  • @FrédéricHamidi 500 requests to Google is far, far away from a DDoS attack, and it's just a sample. I'll change google.com to example.com if you want to get picky. – Marcos Casagrande Apr 10 '16 at 18:11
  • EAI_AGAIN appears to be a "temporary failure in DNS lookup". So, perhaps you are swamping some DNS resource, local or otherwise. If your million connections are all to the same set of hosts, perhaps you can resolve their IP address once and just use the IP address from then on. – jfriend00 Apr 10 '16 at 18:12
  • @Marcos, 1000 requests from possibly dozens of machines (everyone on SO who tries to run your script) possibly several times (to get a better grasp on the problem) will probably trigger some defense on Google's side. I wouldn't want these users to get burned. – Frédéric Hamidi Apr 10 '16 at 18:13
  • @jfriend00 I did google that, but I can't find a way to fix it. – Marcos Casagrande Apr 10 '16 at 18:13
  • @FrédéricHamidi I got your point, I edited the question a minute ago. – Marcos Casagrande Apr 10 '16 at 18:14
  • @jfriend00 I did put the entry in the hosts file, as someone pointed out in another SO question, but it's still the same problem. And I can't strictly use the IP address alone, because without the host, the server returns 404. – Marcos Casagrande Apr 10 '16 at 18:19
  • What OS is this on? It smells like a local overload of the DNS resolver, even if using the hosts file. You could use the IP address for the connection and avoid all DNS, but still have the host name in the headers. A connection is only made to an IP address so if the server requires the host name, it must be getting it in an HTTP header. – jfriend00 Apr 10 '16 at 18:23
  • Ubuntu 14.04. I'll try what you're saying in 1 minute, and get back to you. – Marcos Casagrande Apr 10 '16 at 18:24
  • @jfriend00 Tried doing that, but then I get ECONNRESET error. `{ [Error: read ECONNRESET] code: 'ECONNRESET', errno: 'ECONNRESET', syscall: 'read' }` – Marcos Casagrande Apr 10 '16 at 18:42

1 Answer


Here's how I solved it, following @jfriend00's input.

I resolve the host's IP address using dns.resolve4 and then perform the requests against that IP. That way the error is gone, and I can perform more requests in less time.

"use strict";

const dns = require("dns");
const http = require('http');

http.globalAgent.maxSockets = Infinity;

const async = require('async');
const request = require("request");

let success = 0;
let error = 0;

function iterateAsync() {
    let rows = [];
    for (let i = 0; i < 500; i++) {
        rows.push(i);
    }

    console.time("Requests");

    dns.resolve4("example.com", (err, addresses) => { // Resolve the host once and obtain its IP addresses

        if (err) { // Handle DNS failure
            console.log(err);
            return;
        }

        async.each(
            rows,
            (item, callback) => get(`http://${addresses[0]}`, callback),
            (err) => {
                console.log("Failed: " + error);
                console.log("Success: " + success);
                console.timeEnd("Requests");
            });

    });


}

function get(host, callback) {

    request(host, (err, response, body) => {
        if (err) {
            console.log(err);
            error++;
            return callback();
        }

        success++;
        callback();
    });
}


//Start requests
iterateAsync();

Output:

Failed: 0
Success: 500
Requests: 6295.799ms
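
One follow-up based on the comment thread above: if the target server needs the original hostname (as noted in the comments, some servers return 404 when hit by IP alone), the Host header can be set explicitly while still connecting to the resolved IP. A minimal sketch, reusing the same dns and request modules:

dns.resolve4("example.com", (err, addresses) => {
    if (err) {
        console.log(err);
        return;
    }

    request({
        url: `http://${addresses[0]}`,    // connect to the resolved IP, skipping per-request DNS
        headers: { Host: "example.com" }  // keep the hostname so virtual hosting still works
    }, (err, response, body) => {
        // handle the response as in get() above
    });
});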
  • Hello Marcos, do you know a way to improve this even more? I'm getting 66 requests processed per second, but that's still too slow for the amount I need (~150M). Do you know how to increase that number? What else can I add? Parallelism? – Jumper Apr 29 '21 at 02:00