
I have 1000 records that need to hit an API endpoint that is rate limited. I want at most 5 calls to the URL in flight at any given time, so that I am not making 1000 requests simultaneously. How can I do this? I have the following:

var Promise = require("bluebird");
var geocoder = Promise.promisifyAll(require('geocoder'));
var fs = require('fs');
var async = require('async');
var parse = require('csv-parse/lib/sync');
var inputFile = './myaddresses.txt';
var file = fs.readFileSync(inputFile, "utf8");

var records = parse(file, {columns: true});
var promises = [];
// This starts every request immediately, so all 1000 are in flight at once
for (var i = 0; i < records.length; i++) {
    var placeName = records[i]['Place Name'];
    promises.push(geocoder.geocodeAsync(placeName));
}

Promise.all(promises).then(function(result) {
    result.forEach(function(geocodeResponse) {
        console.log(geocodeResponse);
    });
});
Rolando
  • What is the actual rate limit you have to stay under? Is it some number of requests/second? Or something else? – jfriend00 Jun 21 '17 at 03:25
  • I am not sure, this is Google Geocoding API. – Rolando Jun 21 '17 at 03:30
  • If you're trying to stay under the rate limit, you should do a little Google research to see what the rate limit is and how it's measured. Without that info, you're just doing a guess and test solution which will never be very optimized and may be inconsistent. – jfriend00 Jun 21 '17 at 03:33
  • It looks to me [here](https://developers.google.com/maps/documentation/geocoding/usage-limits) like it's 50 requests/sec, 2500 requests/day for std account. You can buy access to more. – jfriend00 Jun 21 '17 at 03:34

2 Answers

To limit the number of concurrent requests that are in-flight at once, I'd recommend using Bluebird's Promise.map() which offers a concurrency option. It will do all of the following for you:

  1. Iterate your array
  2. Limit the number of concurrent requests to whatever you set the concurrency option to
  3. Collect all the results in order in the final results array

Here's how you would use it:

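const Promise = require('bluebird');
const geocoder = Promise.promisifyAll(require('geocoder'));

// records is the array parsed from the CSV file in the question
Promise.map(records, r => {
    const placeName = r['Place Name'];
    return geocoder.geocodeAsync(placeName);
}, {concurrency: 5}).then(results => {
    // all results here, in the same order as records
}).catch(err => {
    // process error here
});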

Note: Rate limiting is not usually strictly the same as number of concurrent requests. Limiting the number of concurrent requests will make it more likely that you stay under a rate limit, but won't guarantee it. There are specific rate limiting modules that can manage to a rate limit more directly.


You can add a delay to each request using Bluebird's .delay().

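const Promise = require('bluebird');
const geocoder = Promise.promisifyAll(require('geocoder'));

Promise.map(records, r => {
    const placeName = r['Place Name'];
    // .delay(500) holds this concurrency slot for 500ms after the response arrives
    return geocoder.geocodeAsync(placeName).delay(500);
}, {concurrency: 5}).then(results => {
    // all results here
}).catch(err => {
    // process error here
});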

A classic algorithm for dealing with some types of rate limits is called the leaky bucket algorithm.


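If your limit is 50 requests/sec, then you can just make sure that the combination of concurrency and delay never allows more than 50/sec. Each concurrent slot starts at most one request per delay period, so the upper bound is the concurrency divided by the delay in seconds. With a concurrency of 5 and a 500ms delay, that is at most 5 / 0.5 = 10 requests/sec.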
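If you'd rather pace the requests directly than tune concurrency and delay, here is a minimal sketch of one way to do it with plain promises: space out the request starts so that they are at least 1000 / 50 = 20ms apart. This is only in the spirit of the leaky bucket idea, not a full implementation of it. The names createPacer and fakeGeocode are made up for this example, fakeGeocode is a stand-in for geocoder.geocodeAsync(), and since timers are not exact the spacing is approximate.

```javascript
// Sketch: space out request *starts* so roughly no more than `maxPerSecond`
// begin per second. `createPacer` and `fakeGeocode` are hypothetical names
// for illustration; they are not part of Bluebird or geocoder.
function createPacer(maxPerSecond) {
  const minGapMs = 1000 / maxPerSecond; // minimum spacing between starts
  let nextStart = 0;                    // earliest time the next call may start

  // Returns a promise that resolves when the caller is allowed to start.
  return function waitTurn() {
    const now = Date.now();
    const startAt = Math.max(now, nextStart);
    nextStart = startAt + minGapMs;
    return new Promise(resolve => setTimeout(resolve, startAt - now));
  };
}

// Stand-in for geocoder.geocodeAsync(); records when each call started.
const startTimes = [];
function fakeGeocode(placeName) {
  startTimes.push(Date.now());
  return Promise.resolve({ placeName: placeName });
}

const waitTurn = createPacer(50); // 50 requests/sec => 20ms between starts
const places = ['A', 'B', 'C', 'D', 'E'];

// Each request waits for its turn before it is sent.
const paced = Promise.all(
  places.map(name => waitTurn().then(() => fakeGeocode(name)))
);
```

This only limits how often requests start. It does not cap how many are in flight, so for slow responses you may still want a concurrency limit as well.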

jfriend00
  • Rate limiting appears to still be a problem, is there a way to say, add a delay by seconds after each 5? – Rolando Jun 21 '17 at 03:20
  • @Rolando - I've shown you how to add a delay to each request (see second code block), but as my answer says, concurrency and delay are not strictly how rate limits are usually measured. A low enough concurrency value with a long enough delay will likely get you under the rate limit value, but a more comprehensive solution would be to actually manage the number of requests to exactly avoid however the rate limit is being measured. If you can share exactly what the rate limit is, then we could help you more specifically for that. There are modules that can manage to a specific rate limit. – jfriend00 Jun 21 '17 at 03:23
  • @Rolando - FYI, a classic rate limiting algorithm is called the [leaky bucket algorithm](https://en.wikipedia.org/wiki/Leaky_bucket). – jfriend00 Jun 21 '17 at 03:31
  • @Rolando - I added some info on how to keep things below 50 requests/sec. – jfriend00 Jun 21 '17 at 03:35

Without a library, you can use a waterfall pattern: build a promise chain with reduce so that each call starts only after the previous one has resolved. The length passed to Array.from sets how many calls are made. In this example, run() stands in for the real request.

var promise = Array.from({ length: 5 }).reduce(function (acc) {
  return acc.then(function (res) {
    return run().then(function (result) {
      res.push(result);
      return res;
    });
  });
}, Promise.resolve([]));


var guid = 0;
function run() {
  guid++;
  var id = guid;
  return new Promise(resolve => {
    // resolve after either 0 or 1 second, chosen at random
    setTimeout(function () {
      console.log(id);
      resolve(id);
    }, (Math.random() * 1.5 | 0) * 1000);
  });
}
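
Note that the chain above runs its calls one after another. A minimal sketch of extending the same idea to keep up to 5 calls in flight: start several such chains and let each one claim the next unprocessed item when it finishes. The names mapWithLimit and fakeGeocode are made up for this example, and fakeGeocode only simulates a request with a timer, so swap in geocoder.geocodeAsync for real use.

```javascript
// Sketch: run `worker` over every item with at most `limit` calls in flight.
// `mapWithLimit` and `fakeGeocode` are hypothetical names for illustration.
function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item that no chain has claimed yet

  // One chain: claim an item, wait for it to finish, then claim another.
  function runChain() {
    if (next >= items.length) return Promise.resolve();
    const i = next++;
    return Promise.resolve()
      .then(() => worker(items[i], i))
      .then(result => {
        results[i] = result; // keep results in input order
        return runChain();
      });
  }

  const chains = [];
  for (let n = 0; n < Math.min(limit, items.length); n++) {
    chains.push(runChain());
  }
  return Promise.all(chains).then(() => results);
}

// Stand-in for a real request; tracks how many calls overlap.
let inFlight = 0;
let maxInFlight = 0;
function fakeGeocode(placeName) {
  inFlight++;
  maxInFlight = Math.max(maxInFlight, inFlight);
  return new Promise(resolve => {
    setTimeout(() => {
      inFlight--;
      resolve('geocoded ' + placeName);
    }, 10);
  });
}

const names = Array.from({ length: 12 }, (_, i) => 'place' + i);
const pooled = mapWithLimit(names, 5, fakeGeocode);
```

As with Promise.all, the returned promise rejects as soon as any single call fails, so add retry or error handling inside the worker if one bad address should not abort the whole batch.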
Rick