I'm looping through an array and making an API call for each member using async/await. I then push each result into another array, which is returned.

// My current function
async requestForEach(repos) {
    const result = [];
    for (const repo of repos) {
        result.push(await this.doSomething(repo.name));
    }
    return result;
}

// doSomething()
const AWS = require('aws-sdk');
const codecommit = new AWS.CodeCommit();
async doSomething(repoName){
    return (await codecommit.listBranches({
        repositoryName: repoName // the SDK parameter is named repositoryName
    }).promise()).branches;
}

My issue is that I'm getting rate limited. If I catch and print the error, I get:

ThrottlingException: Rate exceeded {
  // Call stack here
  code: 'ThrottlingException',
  time: 2020-08-16T15:52:56.632Z,
  requestId: '****-****-****-****-****',
  statusCode: 400,
  retryable: true
}

Documentation for the API I'm using can be found here - https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/CodeCommit.html#listBranches-property

I looked into options, and the async library seemed to be the popular choice.

Using async.queue():

Tasks added to the queue are processed in parallel (up to the concurrency limit). If all workers are in progress, the task is queued until one becomes available. Once a worker completes a task, that task's callback is called.

// create a queue object with concurrency 2
var q = async.queue(function(task, callback) {
    console.log('hello ' + task.name);
    callback();
}, 2);

Obviously I can't get the value back from within the callback function, so how should I approach this problem?

newprogrammer
  • Do you need to make the calls in sequence? Or is it fine to make parallel calls? – Prathap Reddy Aug 16 '20 at 12:33
  • Parallel calls are fine. – newprogrammer Aug 16 '20 at 12:34
  • No, you don't need to use async.js (and if you still use it, make sure not to use callback style). Your sequential iteration is fine; all you need to do is add a delay when you get a `ThrottlingException`. – Bergi Aug 16 '20 at 14:33
  • Hey @Bergi, could you please elaborate on the `delay` part? I would love to hear a solution from experts like you. It will help us and others apply a better solution in similar situations. Is it like a `Promise` with `setTimeout`? Thanks in advance – Prathap Reddy Aug 16 '20 at 15:07
  • @PrathapReddy Yes, that's what I meant by delay. – Bergi Aug 16 '20 at 15:27
  • Does this mean that, in the same code, we just need to add `try/catch` around `this.doSomething()`, catch the `ThrottlingException`, add a `delay` in the `catch`, and then `continue` the execution? Pardon me if I interpreted this wrongly. @Bergi – Prathap Reddy Aug 16 '20 at 15:35
  • @PrathapReddy Yes. In sensible APIs, a `ThrottlingException` would even contain the time that one should wait before making the next request. (Though instead of continuing with the code, you'd want to retry the last request you made; not sure if you meant to have a loop around each request that `break`s on a result and `continue`s on an error.) – Bergi Aug 16 '20 at 15:37
  • Any chance you could put together a quick example please @Bergi? – newprogrammer Aug 16 '20 at 15:41
  • @newprogrammer Can you link the docs of the API you are using and show what its `ThrottlingExceptions` look like? Or share the definition of the `this.doSomething` method? Then maybe I can write a tailored answer. – Bergi Aug 16 '20 at 15:47
  • I've updated the original post - thanks @Bergi – newprogrammer Aug 16 '20 at 16:02
  • Thanks. I found conflicting information on whether the AWS SDK retries throttled requests on its own: https://forums.aws.amazon.com/thread.jspa?messageID=860993&tstart=0 https://github.com/aws/aws-sdk-js/pull/2895 https://github.com/aws/aws-sdk-js/issues/1749 https://stackoverflow.com/q/43611099/1048572 https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/Config.html#maxRetries-property – Bergi Aug 16 '20 at 16:31
  • @newprogrammer, I have updated my answer. Hope it helps you in other similar use cases. Thanks :) – Prathap Reddy Aug 17 '20 at 17:57

3 Answers

The sequential for...of loop looks good to me. You can add a fixed delay to each iteration to slow it down, but you can also simply retry a request later when it fails because of throttling. Note that this approach only works well when your app has a single source of requests (not multiple concurrent calls to requestForEach); otherwise you'd probably need global coordination.

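async doSomething(repoName) {
    while (true) {
        try {
            const data = await codecommit.listBranches({
                repositoryName: repoName // the SDK parameter is named repositoryName
            }).promise();
            return data.branches;
        } catch(err) {
            if (err.code === 'ThrottlingException') { // alternatively: if (err.retryable) {
                // use err.retryDelay when the error carries one, otherwise wait 1000 ms
                await delay(err.retryDelay ?? 1000);
                continue;
            } else {
                throw err;
            }
        }
    }
}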
function delay(time) {
    return new Promise(resolve => {
        setTimeout(resolve, time);
    });
}

Instead of the while (true) loop, a recursive approach might look nicer. Note that in production code you'll want a limit on the number of retries so that the loop can never run forever.
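One way to bound the retries is a small wrapper with a retry cap and exponential backoff. This is only a minimal sketch, not part of the AWS SDK: `withRetry`, `sleep`, `maxRetries`, and `baseDelay` are hypothetical names, and the doubling schedule is an assumption rather than anything the API prescribes.

```javascript
// Hypothetical helper: resolves after `ms` milliseconds.
function sleep(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
}

// Hypothetical helper: calls the async function `fn` and retries it while it
// fails with code 'ThrottlingException', at most `maxRetries` times.
// The wait doubles after each failed attempt (baseDelay, 2x, 4x, ...).
async function withRetry(fn, { maxRetries = 5, baseDelay = 1000 } = {}) {
    for (let attempt = 0; ; attempt++) {
        try {
            return await fn();
        } catch (err) {
            const throttled = err.code === 'ThrottlingException';
            // Give up on any other error, or once the retry budget is spent.
            if (!throttled || attempt >= maxRetries) throw err;
            await sleep(baseDelay * 2 ** attempt);
        }
    }
}
```

Inside requestForEach it could be used as `result.push(await withRetry(() => this.doSomething(repo.name)))`, which leaves doSomething itself free of retry logic.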

Bergi

Looks like you want parallelLimit.

It takes an optional callback which receives the results.

From the docs.

https://caolan.github.io/async/v3/docs.html#parallelLimit

callback (function): An optional callback to run once all the functions have completed successfully. This function gets a results array (or object) containing all the result arguments passed to the task callbacks. Invoked with (err, results).

Example:

// run 'my_task' 100 times, with parallel limit of 10

  var my_task = function(callback) { ... };
  var when_done = function(err, results) { ... };

  // create an array of tasks
  var async_queue = Array(100).fill(my_task);

  async.parallelLimit(async_queue, 10, when_done);

Taken from: how to use async.parallelLimit to maximize the amount of (paralle) running processes?

CountZero

You can use Promise.all, as below, to reduce the total wait time for your API calls:

async requestForEach(repos) {
  // start all requests at once; the question's repo objects expose `name`
  return Promise.all(repos.map(repo => this.doSomething(repo.name)));
}

Since you are hitting the rate limit because of the total number of calls, you can use a library such as es6-promise-pool to cap the number of concurrent requests (for example 5 or 10, depending on your requirements).
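If you would rather not pull in a library, the same idea can be sketched by hand. The helper below is hypothetical (`mapWithConcurrency` is not the es6-promise-pool API): it starts at most `limit` runners, each of which takes the next unclaimed item until none are left, and it returns the results in input order. Capping concurrency lowers the request rate but does not by itself guarantee staying under the service's limit, so it is best combined with retries.

```javascript
// Hypothetical helper: maps `items` through the async `worker`,
// keeping at most `limit` calls in flight and preserving input order.
async function mapWithConcurrency(items, limit, worker) {
    const results = new Array(items.length);
    let next = 0; // index of the next unclaimed item

    async function runner() {
        while (next < items.length) {
            const index = next++; // claimed synchronously, before any await
            results[index] = await worker(items[index], index);
        }
    }

    // Start up to `limit` runners and wait until all of them have drained the list.
    const count = Math.min(limit, items.length);
    await Promise.all(Array.from({ length: count }, () => runner()));
    return results;
}
```

In requestForEach this might look like `return mapWithConcurrency(repos, 5, repo => this.doSomething(repo.name));`. If any worker rejects, the returned promise rejects with that error, as with Promise.all.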

Then update this.doSomething to retry recursively, up to a MAX_RETRIES limit (which can be controlled through an environment variable), as below:

// MAX_RETRIES is assumed to be defined elsewhere, e.g. read from an environment variable
async doSomething(repoName, retries = 0) {
    try {
        const data = await codecommit.listBranches({
            repositoryName: repoName // the SDK parameter is named repositoryName
        }).promise();
        return data.branches;
    } catch(err) {
        if (err.code === 'ThrottlingException' && retries < MAX_RETRIES) {
            await delay(err.retryDelay ?? 1000); // As per @Bergi's answer
            return this.doSomething(repoName, retries + 1); // Recursive call; return its result
        } else {
            console.log('Issue with repo: ', repoName);
            throw err; // (Or) return ''; based on requirement
        }
    }
}


// Filter out the valid results at the end - Applicable only if you use return '';
const results = await requestForEach(repos);
const finalResults = results.filter(Boolean);

This approach might help you reduce the wait time in production compared with issuing every request in sequence.

Prathap Reddy
  • Forgive my ignorance but would this stop the rate limiting issue? – newprogrammer Aug 16 '20 at 12:43
  • I have updated the question with the error :). I need to either limit the rate of requests so I don't get the error in the first place, or pause and then continue somehow after the limit has been hit. I am drawn to your answer as it doesn't use external libraries. – newprogrammer Aug 16 '20 at 13:05
  • Thanks for updating the question with the error details. We might still hit the limit with `Promise.all` too. You can use `es6-promise-pool`, as suggested, to limit the concurrency. It's simple to use and serves the purpose here. – Prathap Reddy Aug 16 '20 at 13:06
  • The promise-pool library looks good but Bergi's solution fitted my needs best. Thanks! – newprogrammer Aug 16 '20 at 17:12
  • Thanks @Bergi, for the knowledge share. I have updated my answer. Hope it helps in some other similar use cases :) – Prathap Reddy Aug 17 '20 at 08:03