I am trying to download PDF files from a list of URLs and save them to my local disk. I need each file downloaded one at a time, with a delay of about 1 second in between. That is, after one file is downloaded, the script should wait a full second before downloading the next file:

  1. Wait for the file to be downloaded
  2. Wait for the file to be saved to disk
  3. Wait a full second
  4. Start that loop over again

The Code:

const fs = require('fs');
const fetch = require('node-fetch');

const arrayOfURL = ['www.link.com/file1', 'www.link.com/file2', 'www.link.com/file3'];

const wait = waitTime => new Promise(resolve => setTimeout(resolve, waitTime));

(async() => {

async function downloadAllFiles() {
     return new Promise((resolve, reject) => {
        (async() => {
          for (let i = 0; i < arrayOfURL.length; i++) {

            console.log(`File ${i} Download Begun`);

            const response = await fetch(arrayOfURL[i]);
            const writeStream = fs.createWriteStream(i + '.pdf');
            response.body.pipe(writeStream);

            async function write() {
                 return new Promise((resolve, reject) => {
                      writeStream.on('finish', function () {
                           resolve('complete');
                      });
                 });
            }

            await write();

            console.log(`----- DOWNLOAD COMPLETE -----`);

            await wait(1000 * i);

            if (i === 2) {
                 resolve('complete');
            }
        } 
      })();
  });
}

await downloadAllFiles();

})();

What I want the logs to show is this:

console.log(`File 0 Download Begun`);
console.log(`File Saved Locally`);
console.log(`----- DOWNLOAD COMPLETE -----`);
***Wait 1 second***

console.log(`File 1 Download Begun`);
console.log(`File Saved Locally`);
console.log(`----- DOWNLOAD COMPLETE -----`);
***Wait 1 second***

console.log(`File 2 Download Begun`);
console.log(`File Saved Locally`);
console.log(`----- DOWNLOAD COMPLETE -----`);
***Wait 1 second***

Instead the logs show this:

console.log(`File 0 Download Begun`);
console.log(`File 1 Download Begun`);
console.log(`File 2 Download Begun`);

console.log(`File Saved Locally`);
console.log(`----- DOWNLOAD COMPLETE -----`);

console.log(`File Saved Locally`);
console.log(`----- DOWNLOAD COMPLETE -----`);

console.log(`File Saved Locally`);
console.log(`----- DOWNLOAD COMPLETE -----`);
***Downloaded at the same time***

The delay is also not working: all files are downloaded at once, seemingly because i in await wait(1000 * i) is always 0.

Any help is appreciated.

MonkeyOnARock
  • This code the way you have it shown will not even run. You're trying to use an `await` inside a plain (non-async) function declared here `new Promise((resolve, reject) => { ...})`. That callback function is NOT `async` and thus your `await` will cause an error. So, what you're showing here is NOT what generated the logs you show because this won't even run. Please show your ACTUAL code that generated the logs. – jfriend00 Dec 30 '21 at 22:39
  • async iife before the loop has been added. – MonkeyOnARock Dec 30 '21 at 22:48
  • Note also that your description says you want to wait 1 second, but you're doing `await wait(1000 * i);` which is an increasing amount of wait for each iteration, not a constant 1 second wait. It seems to me that you can now remove that wait entirely since the asynchronous problems with the loop are now corrected. – jfriend00 Dec 30 '21 at 23:08
  • Yes, that is wrong. It should just be `await wait(1000)`. – MonkeyOnARock Dec 30 '21 at 23:13
  • @MonkeyOnARock Are you sure the AIIFE is around the loop, not inside the loop? Because it would work if it was outside the loop. – Bergi Dec 31 '21 at 00:39

1 Answer

The async IIFE is your problem. That IIFE returns a promise that you are doing nothing with, thus the for loop just keeps running because nothing is awaiting that promise that the IIFE returns.
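To illustrate the point, here is a minimal sketch that assumes the async IIFE sits inside the loop body, which is what the posted logs suggest. `fakeDownload`, `unawaited`, `awaited`, and `demo` are hypothetical names, and a short timer stands in for the real fetch and file write:

```javascript
// Hypothetical stand-in for a download: resolves after `ms` milliseconds.
const fakeDownload = ms => new Promise(resolve => setTimeout(resolve, ms));

// Broken pattern: the promise returned by the async IIFE is not awaited,
// so the for loop starts every iteration before any of them finishes.
function unawaited(log) {
    const pending = [];
    for (let i = 0; i < 3; i++) {
        const p = (async () => {
            log.push(`begin ${i}`);
            await fakeDownload(10);
            log.push(`end ${i}`);
        })();
        pending.push(p); // kept only so this demo can tell when everything is done
    }
    return Promise.all(pending);
}

// Fixed pattern: awaiting directly inside an async function suspends the loop
// until the current iteration has finished.
async function awaited(log) {
    for (let i = 0; i < 3; i++) {
        log.push(`begin ${i}`);
        await fakeDownload(10);
        log.push(`end ${i}`);
    }
}

async function demo() {
    const a = [];
    const b = [];
    await unawaited(a);
    await awaited(b);
    return { a, b };
}

demo().then(({ a, b }) => {
    console.log(a.join(', ')); // begin 0, begin 1, begin 2, end 0, end 1, end 2
    console.log(b.join(', ')); // begin 0, end 0, begin 1, end 1, begin 2, end 2
});
```

The first line of output has the same shape as the logs in the question: every "begin" appears before any "end".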

It would be better to fix your code by just removing both IIFEs and removing the new Promise() wrapper entirely like this:

const fs = require('fs');
const fetch = require('node-fetch');

const arrayOfURL = ['www.link.com/file1', 'www.link.com/file2', 'www.link.com/file3'];

const wait = waitTime => new Promise(resolve => setTimeout(resolve, waitTime));

async function downloadAllFiles() {
    for (let i = 0; i < arrayOfURL.length; i++) {

        console.log(`File ${i} Download Begun`);

        const response = await fetch(arrayOfURL[i]);
        const writeStream = fs.createWriteStream(i + '.pdf');
        response.body.pipe(writeStream);

        async function write() {
            return new Promise((resolve, reject) => {
                writeStream.on('finish', function() {
                    resolve('complete');
                });
                writeStream.on('error', reject);
            });
        }

        await write();

        console.log(`----- DOWNLOAD COMPLETE -----`);

        await wait(1000);
    }
}

downloadAllFiles().then(() => {
    console.log("done");
}).catch(err => {
    console.log(err);
});

And note, I also added an error handler for the writeStream.
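As a possible alternative to wiring up the finish and error events by hand, `pipeline` from Node's built-in `stream/promises` module returns a promise that resolves when the write completes and rejects if either stream errors. This is only a sketch, assuming Node 15 or later and that `response.body` is a Node readable stream (as it is in node-fetch v2). `saveStream` and `demo` are hypothetical names, and an in-memory stream stands in for the response body so the example runs without a network call:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { Readable } = require('stream');
const { pipeline } = require('stream/promises');

// Streams `source` into `filePath`. The returned promise resolves once the
// file has been fully written and rejects if either stream emits an error.
async function saveStream(source, filePath) {
    await pipeline(source, fs.createWriteStream(filePath));
}

// In the download loop, `source` would be response.body; here an in-memory
// readable stream is used instead (an assumption made for illustration).
async function demo() {
    const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'download-'));
    const filePath = path.join(dir, '0.pdf');
    await saveStream(Readable.from([Buffer.from('fake pdf bytes')]), filePath);
    return fs.readFileSync(filePath, 'utf8');
}

demo().then(text => console.log(text)); // fake pdf bytes
```

Inside the loop, this would reduce the write step to a single `await saveStream(response.body, i + '.pdf');`.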

When I substitute some real URLs and run this on my system, I get this output and there is a noticeable pause between downloads:

File 0 Download Begun
----- DOWNLOAD COMPLETE -----
File 1 Download Begun
----- DOWNLOAD COMPLETE -----
File 2 Download Begun
----- DOWNLOAD COMPLETE -----
done

async IIFEs can very easily get you in trouble. They return a promise (ALL async functions return a promise) so if you're not paying attention to that promise, then nobody waits for them to finish before continuing on with your code. I never use an async IIFE as there has always (for me) been a cleaner way to write my code.

Also, why are you pausing between loop iterations? That feels like an attempt to hack around some concurrency issue that should no longer be there, or that should be fixed in a more fundamental way.

jfriend00
  • First, thank you for your help. I was needlessly getting lost in the extra async iife's. As for the pauses: tbh I was just worried about overloading the server I'm downloading from if I were to ask for all links at once (my example uses 3 links, but sometimes there may be dozens or hundreds of links). – MonkeyOnARock Dec 30 '21 at 23:09
  • @MonkeyOnARock - You will still be doing only one read at a time from the server so you won't overload the server. If they implement some sort of rate limiting, then you should probably use a smarter algorithm based on what you learn from experiments with their rate limit. See [this `rateLimitMap()` solution](https://stackoverflow.com/questions/36730745/choose-proper-async-method-for-batch-processing-for-max-requests-sec/36736593#36736593) for doing batch processing of an array while staying under some number of requests per second. – jfriend00 Dec 30 '21 at 23:13