I am building a backend that pulls data from a third-party API. There are three main steps:

  1. Delete the existing db data (before any new data is inserted)
  2. Get a new dataset from the API
  3. Insert that data.

Each of these three steps must happen for a variety of datasets, e.g. clients, appointments, and products.

To handle this, I have three Promise.all calls, and each of these is passed individual async functions that handle the deleting, getting, and finally inserting of the data. I have this code working just for clients so far.

What I'm now trying to do is limit the API calls, as the API I am pulling data from accepts at most 200 calls per minute. To quickly test the rate limiting in code, I have set it to a maximum of 5 API calls per 10 seconds, so I can see whether it's working properly.

This is the code I have so far - note I have replaced the name of the system in the code with 'System'. I have not included all code as there's a lot of data that is being iterated through further down.

      let patientsCombinedData = [];
      let overallAPICallCount = 0;
      let maxAPICallsPerMinute = 5;
      let startTime, endTime, timeDiff, secondsElapsed;

      const queryString = `UPDATE System SET ${migration_status_column} = 'In Progress' WHERE uid = '${uid}'`;

      migrationDB.query(queryString, (err, res) => {
        async function deleteSystemData() {
          async function deleteSystemPatients() {
            return (result = await migrationDB.query("DELETE FROM System_patients WHERE id_System_account = ($1) AND migration_type = ($2)", [
              System_account_id,
              migrationType,
            ]));
          }

          await Promise.all([deleteSystemPatients()]).then(() => {
            startTime = new Date(); // Initialise timer before kicking off API calls

            async function sleep(ms) {
              return new Promise((resolve) => setTimeout(resolve, ms));
            }

            async function getSystemAPIData() {
              async function getSystemPatients() {
                endTime = new Date();
                timeDiff = endTime - startTime;
                timeDiff /= 1000;
                secondsElapsed = Math.round(timeDiff);

                if (secondsElapsed < 10) {
                  if (overallAPICallCount > maxAPICallsPerMinute) {

                    // Here I want to sleep for one second, then check again as the timer may have passed 10 seconds
                    getSystemPatients();
 
                  } else {
                    // Proceed with calls
                    dataInstance = await axios.get(`${patientsPage}`, {
                      headers: {
                        Authorization: completeBase64String,
                        Accept: "application/json",
                        "User-Agent": "TEST_API (email@email.com)",
                      },
                    });

                    dataInstance.data.patients.forEach((data) => {
                      patientsCombinedData.push(data);
                    });
                    overallAPICallCount++;
                    console.log(`Count is: ${overallAPICallCount}. Seconds are: ${secondsElapsed}. URL is: ${dataInstance.data.links.self}`);

                    if (dataInstance.data.links.next) {
                      patientsPage = dataInstance.data.links.next;
                      await getSystemPatients();
                    } else {
                      console.log("Finished Getting Clients.");
                      return;
                    }
                  }
                } else {
                  console.log(`Timer reset! Now proceed with API calls`);
                  startTime = new Date();
                  overallAPICallCount = 0;
                  getSystemPatients();
                }
              }

              await Promise.all([getSystemPatients()]).then((response) => {
                async function insertSystemData() {
                  async function insertSystemPatients() {
                    const SystemPatients = patientsCombinedData;

Just under where it says `if (secondsElapsed < 10)` is where I want to check every second whether the timer has passed 10 seconds. If it has, the timer and the count are reset, so I can start counting again over the next 10 seconds. Currently the recursive function runs so often that an error related to the call stack is displayed. I have tried adding a variety of async timer functions here, but every time the function returns, it causes the parent promise to finish executing.
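For illustration, here is a minimal sketch of the behaviour described above: wait, re-check the timer, and reset the count once the window has passed, using a loop rather than recursion so the call stack does not grow. The names `createWindowLimiter`, `waitForSlot`, `getAllPatients` and `fetchPage` are hypothetical and not part of the original code; `fetchPage(url)` is assumed to resolve to the response body, i.e. an object with `patients` and `links`.

```javascript
// Resolve after `ms` milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical helper: allows at most `maxCalls` calls per `windowMs`.
// Returns an async function that resolves once a call slot is available.
function createWindowLimiter(maxCalls, windowMs, pollMs = 1000) {
  let windowStart = Date.now();
  let callsInWindow = 0;

  return async function waitForSlot() {
    for (;;) {
      const now = Date.now();
      if (now - windowStart >= windowMs) {
        // The window has passed: reset the timer and the count.
        windowStart = now;
        callsInWindow = 0;
      }
      if (callsInWindow < maxCalls) {
        callsInWindow++; // claim a slot and let the caller proceed
        return;
      }
      await sleep(pollMs); // limit reached: wait, then check again
    }
  };
}

// Follows `links.next` in a loop instead of calling itself recursively.
// `fetchPage` is an assumed stand-in for the axios.get call.
async function getAllPatients(firstPage, fetchPage, waitForSlot) {
  const patients = [];
  let page = firstPage;
  while (page) {
    await waitForSlot();
    const data = await fetchPage(page);
    patients.push(...data.patients);
    page = data.links.next;
  }
  return patients;
}
```

With axios, `fetchPage` might be something like `(url) => axios.get(url, { headers }).then((res) => res.data)`, and the test settings from the question would be `createWindowLimiter(5, 10000)`. Because the function returns the collected array, the result passed to `.then` would no longer be undefined. Note that a fixed window like this can still allow a short burst around the moment the window resets.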

Hope that makes sense

Matt Berry
  • `function sleep` doesn't need `async` since a) returns a Promise and b) never uses `await` - but this isn't your issue – Bravo Mar 07 '22 at 04:14
  • Not a solution to your exact example but I wrote a class to handle just this. https://gist.github.com/nedjs/50c68bd70003e8b5dc91c03d7282d2a7 – ug_ Mar 07 '22 at 04:17
  • Here's a function to implement rate limiting on making calls to a specific target: [Proper async method for max requests per second](https://stackoverflow.com/questions/36730745/choose-proper-async-method-for-batch-processing-for-max-requests-sec/36736593#36736593) – jfriend00 Mar 07 '22 at 04:23
  • to be honest `Promise.all([deleteSystemPatients()]).then` is just `deleteSystemPatients().then` if `deleteSystemPatients` returns a Promise ... if it doesn't then there's no point to any of that line at all – Bravo Mar 07 '22 at 04:23
  • same goes for `Promise.all([getSystemPatients()]).then((response)` ... with the added note that `response` won't be anything since `getSystemPatients` doesn't ever return anything, ever - I think you perhaps misunderstand what `Promise.all` is used for – Bravo Mar 07 '22 at 04:26
  • I will have a look at the above two links soon, but just to clarify re the Promise.all, that is not the final code. I am going to have Promise.all([deleteSystemPatients(), deleteSystemAppointments(), deleteSystemProducts()]) etc, but at the moment I'm just up to the patients portion. Same goes for the get Promise.all – Matt Berry Mar 07 '22 at 04:29
  • Fair enough, but you aren't using the result in the first .then, and you are using a result in the second .then which will be undefined ... guess you have more code to write :p – Bravo Mar 07 '22 at 04:30
  • I'd strongly suggest you try and move this rate limit feature into its own thing instead of interwoven with your current code. This is a good generic problem that can be tested independently of all your database code. – Evert Mar 07 '22 at 04:32
  • @Bravo - There are many ways to approach rate limiting. The [code](https://stackoverflow.com/questions/36730745/choose-proper-async-method-for-batch-processing-for-max-requests-sec/36736593#36736593) I linked to actually measures when requests are sent and calculates when it's safe to send the next request based on the rate limit (requests/sec) you're trying to stay under. I'm not sure why you're dissing that approach with your sarcastic tone. It's a prebuilt and fully tested approach. Available if it suits the OP's needs. Nothing more. – jfriend00 Mar 07 '22 at 05:05
  • I better remove my comments before a moderator also chooses to misunderstand - there you go @jfriend00 **offensive comment** removed – Bravo Mar 07 '22 at 05:13
  • @evert I am a noob and am not sure how I'd move the rate limiting outside of the code that it's currently in. This is code to migrate from one system to another by pulling a large dataset. Will think more about this though, maybe I can come up with something. Thanks – Matt Berry Mar 07 '22 at 05:14
  • Perhaps a function that can take a list of functions, and will only call n in parallel? – Evert Mar 07 '22 at 05:49
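
The suggestion in the last comment could be sketched roughly as follows. This is only an illustration of the idea, not code from the thread, and `runWithConcurrency` is a hypothetical name. It caps how many calls are in flight at once, which is related to a calls-per-minute limit but is not the same thing.

```javascript
// Hypothetical helper: runs async task functions with at most `limit`
// in flight at once, and resolves to their results in input order.
async function runWithConcurrency(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0; // index of the next task to start

  // Each worker pulls the next unstarted task until none are left.
  async function worker() {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const workerCount = Math.min(limit, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Each entry in `tasks` is a function that returns a promise, for example `() => axios.get(url)`, so nothing starts until a worker picks it up.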

1 Answer

I ended up using the Bottleneck library, which made it very easy to implement rate limiting.

const Bottleneck = require("bottleneck/es5");

// Wait at least 350 ms between launching jobs (roughly 171 calls per minute).
const limiter = new Bottleneck({
  minTime: 350,
});

await limiter.schedule(() => getSystemPatients());
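
As a rough illustration of what a minimum-time setting does, the sketch below spaces out job starts in plain JavaScript. It is not Bottleneck's implementation, and `createSpacedScheduler` and `minTimeFor` are hypothetical names. It assumes each scheduled function makes a single API request; a function that fetches every page internally would count as one job.

```javascript
// Hypothetical helper: returns schedule(fn), which starts each fn at least
// `minTimeMs` after the previous one was started.
function createSpacedScheduler(minTimeMs) {
  let nextStart = 0; // earliest timestamp at which the next job may start

  return function schedule(fn) {
    const now = Date.now();
    const startAt = Math.max(now, nextStart);
    nextStart = startAt + minTimeMs; // reserve the following slot
    return new Promise((resolve) => setTimeout(resolve, startAt - now)).then(fn);
  };
}

// 200 calls per minute works out to one call every 60000 / 200 = 300 ms.
const minTimeFor = (callsPerMinute) => Math.ceil(60000 / callsPerMinute);
```

By this arithmetic the 200-calls-per-minute limit needs at least 300 ms between calls, so a value of 350 ms leaves some margin.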
Matt Berry