
I've got a method that returns a promise; internally, that method makes a call to an API which only allows 20 requests every minute. The problem is that I have a large array of objects (around 300) and I would like to make a call to the API for each one of them.

At the moment I have the following code:

    const bigArray = [.....];

    Promise.all(bigArray.map(apiFetch)).then((data) => {
      ...
    });

But it doesn't handle the timing constraint. I was hoping I could use something like _.chunk and _.debounce from lodash but I can't wrap my mind around it. Could anyone help me out?

Ignacio A. Rivas

2 Answers


If you can use the Bluebird promise library, it has a concurrency feature built in that lets you limit a group of async operations to at most N in flight at a time.

var Promise = require('bluebird');
const bigArray = [....];

Promise.map(bigArray, apiFetch, {concurrency: 20}).then(function(data) {
    // all done here
});

The nice thing about this interface is that it will keep 20 requests in flight. It will start up 20, then each time one finishes, it will start another. So this is potentially more efficient than sending 20, waiting for all of them to finish, sending 20 more, etc.

This also provides the results in the exact same order as bigArray so you can identify which result goes with which request.

You could, of course, code this yourself with generic promises using a counter, but since it is already built into the Bluebird library, I thought I'd recommend that way.

The Async library also has similar concurrency control, though it is not promise based.
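For instance, here's a minimal sketch with Async's mapLimit, assuming the same bigArray and promise-returning apiFetch from the question (adapted to Async's Node-style callbacks):

var async = require('async');

// mapLimit runs at most 20 iteratee calls at a time and returns
// results in the same order as bigArray
async.mapLimit(bigArray, 20, function(item, callback) {
    apiFetch(item).then(function(data) {
        callback(null, data);
    }, function(err) {
        callback(err);
    });
}, function(err, results) {
    // all done here (err is set if any request failed)
});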


Here's a hand-coded version using only ES6 promises that maintains result order and keeps 20 requests in flight at all times (until there aren't 20 left) for maximum throughput:

function pMap(array, fn, limit) {
    return new Promise(function(resolve, reject) {
        var index = 0, cnt = 0, stop = false, results = new Array(array.length);

        function run() {
            // launch requests until we hit the concurrency limit or run out of items
            while (!stop && index < array.length && cnt < limit) {
                (function(i) {
                    ++cnt;
                    ++index;
                    fn(array[i]).then(function(data) {
                        // store the result at its original index to preserve order
                        results[i] = data;
                        --cnt;
                        // see if we are done or should run more requests
                        if (cnt === 0 && index === array.length) {
                            resolve(results);
                        } else {
                            run();
                        }
                    }, function(err) {
                        // set stop flag so no more requests will be sent
                        stop = true;
                        --cnt;
                        reject(err);
                    });
                })(index);
            }
        }
        run();
    });
}

pMap(bigArray, apiFetch, 20).then(function(data) {
    // all done here
}, function(err) {
    // error here
});

Working demo here: http://jsfiddle.net/jfriend00/v98735uu/

jfriend00
  • @IgnacioARivas - Added a hand-coded version that maintains result order and keeps 20 requests in flight at all times and does not use an external library. – jfriend00 Oct 27 '15 at 22:21

You could send one block of 20 requests every minute, or space them out to one request every 3 seconds (the latter is probably preferred by the API owners).

function rateLimitedRequests(array, chunkSize) {
  var delay = 3000 * chunkSize;  // 3 seconds per request, i.e. 20 requests per minute
  var remaining = array.length;
  var promises = [];
  var addPromises = function(newPromises) {
    Array.prototype.push.apply(promises, newPromises);
    // once every item has been dispatched, resolve all promises together
    if ((remaining -= newPromises.length) === 0) {
      Promise.all(promises).then((data) => {
        ... // do your thing
      });
    }
  };
  (function request() {
    addPromises(array.splice(0, chunkSize).map(apiFetch));
    if (array.length) {
      setTimeout(request, delay);
    }
  })();
}

To call 1 every 3 seconds:

rateLimitedRequests(bigArray, 1);

Or 20 every minute:

rateLimitedRequests(bigArray, 20);

If you prefer to use _.chunk and _.throttle¹ (rather than _.debounce):

function rateLimitedRequests(array, chunkSize) {
  var delay = 3000 * chunkSize;  // 3 seconds per request, i.e. 20 requests per minute
  var remaining = array.length;
  var promises = [];
  var addPromises = function(newPromises) {
    Array.prototype.push.apply(promises, newPromises);
    // once every item has been dispatched, resolve all promises together
    if ((remaining -= newPromises.length) === 0) {
      Promise.all(promises).then((data) => {
        ... // do your thing
      });
    }
  };
  var chunks = _.chunk(array, chunkSize);
  var throttledFn = _.throttle(function() {
    // shift() keeps the chunks in their original order
    addPromises(chunks.shift().map(apiFetch));
  }, delay, {leading: true});
  // _.throttle collapses a synchronous burst of calls into at most two
  // invocations, so space the triggers out rather than calling it in a tight loop
  for (var i = 0; i < chunks.length; i++) {
    setTimeout(throttledFn, i * delay);
  }
}

¹ You probably want _.throttle since it executes each function call after a delay, whereas _.debounce groups multiple calls into one call. See this article linked from the docs.

Debounce: Think of it as "grouping multiple events into one". Imagine that you go home, enter the elevator, the doors are closing... and suddenly your neighbor appears in the hall and tries to jump on the elevator. Be polite and open the doors for him: you are debouncing the elevator departure. Consider that the same situation can happen again with a third person, and so on... probably delaying the departure several minutes.

Throttle: Think of it as a valve: it regulates the flow of executions. We can determine the maximum number of times a function can be called in a certain time. So in the elevator analogy: you are polite enough to let people in for 10 seconds, but once that delay passes, you must go!
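
As a quick illustrative sketch of that difference (the 3-second wait and 100 ms interval are arbitrary values, not part of the solution above):

function log(label) {
  return function() { console.log(label, new Date().toISOString()); };
}

var throttled = _.throttle(log('throttle'), 3000);  // runs at most once every 3 seconds
var debounced = _.debounce(log('debounce'), 3000);  // runs 3 seconds after the calls stop

// call both every 100 ms: the throttled one logs roughly every 3 seconds,
// while the debounced one never logs as long as the calls keep coming
setInterval(function() {
  throttled();
  debounced();
}, 100);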

arcyqwerty
  • Brilliant, that's what I was after. Thanks mate. – Ignacio A. Rivas Oct 27 '15 at 21:59
  • I just updated some extra code so that your promises work correctly, assuming you need to have ALL of the data to process at once. Otherwise the old code is still able to handle data as it comes in independently. – arcyqwerty Oct 27 '15 at 22:02
  • Note: I don't think this preserves the order of the data if that's important. It appears to push data into the output array as it arrives, without regard for the order of the original requests. – jfriend00 Oct 27 '15 at 22:05
  • You're correct, it would not preserve the order if the requests are async and happen to be returned in a different order. – arcyqwerty Oct 27 '15 at 22:06
  • Why are you using `Array.prototype.push.apply(data, newData)` instead of `data.push(newData)`? – jfriend00 Oct 27 '15 at 22:07
  • `newData` would be an array of the data items. If I were to call `data.push` it would push the entire array object onto `data` rather than the items contained in `newData` – arcyqwerty Oct 27 '15 at 22:08
  • Assuming `data = [1, 2, 3]` and `newData = [4, 5, 6]`, `data.push(newData)` would yield `[1,2,3, [4, 5, 6]]` rather than the intended `[1,2,3,4,5,6]` – arcyqwerty Oct 27 '15 at 22:08
  • OK, I see what you're doing with that. I didn't realize `newData` was an array. – jfriend00 Oct 27 '15 at 22:09
  • @jfriend00: Edited to fix Promise ordering. Can you verify? – arcyqwerty Oct 27 '15 at 22:26
  • Why are you using a timer? The timer seems completely arbitrary. You should be sending more requests when prior requests finish, not sending more requests on a timer. You can see how I do that in my answer. In looking at your first code some more, I don't see how it keeps there from being more than 20 requests in flight at the same time. It seems you're only guessing with a timer. – jfriend00 Oct 27 '15 at 22:28
  • On further examination, it does not look like the first code block works at all. It does not wait for requests to finish before sending more; it simply sends more on an arbitrary timer delay. – jfriend00 Oct 27 '15 at 22:35
  • It sends at 20qpm, it does not guarantee that long-running requests will be out of queue within that time. As I understand it, many APIs quota you based on the number of requests issued in a given time frame, not how quickly they are processed by the server. So, for example, sending a request every 3 seconds yields 20 requests a minute, regardless of how quickly they run. – arcyqwerty Oct 28 '15 at 02:31