
I have a function that uses Axios to download a zip file and extract the file into a temporary directory. The process itself works as intended, but I'm having difficulty awaiting the final result before proceeding. I'll admit that I don't fully understand how to use promises, but that's what I need help learning.

Here is the complete code:

const fs = require('fs'); // needed for the fs.createWriteStream and fs.mkdir calls below
const axios = require('axios');
const StreamZip = require('node-stream-zip');

// Pipedream: steps.trigger.raw_event.body.result_set.download_links.json.all_pages
// Testing: https://api.countdownapi.com/download/results/04_NOVEMBER_2021/1900/Collection_Results_F4C0B671_51_All_Pages.zip
const all_pages = 'https://api.countdownapi.com/download/results/04_NOVEMBER_2021/1900/Collection_Results_F4C0B671_51_All_Pages.zip';
let fileName = 'all_pages.zip';

async function asyncFunc() {
    return await axios.get(all_pages, {responseType: "stream"})
        .then(res => {
            console.log("Waiting ...")

            if (res.status === 200) {
                const path = require("path");
                const SUB_FOLDER = "";
                fileName = fileName || all_pages.split("/").pop();

                const dir = path.resolve(__dirname, SUB_FOLDER, fileName);
                res.data.pipe(fs.createWriteStream(dir));
                res.data.on("end", () => {
                    console.log("Download Completed");

                    const zip = new StreamZip({
                        file: dir,
                        storeEntries: true
                    });
                    zip.on('error', function (err) {
                        console.error('[ERROR]', err);
                    });
                    zip.on('ready', function () {
                        console.log('All entries read: ' + zip.entriesCount);
                        // console.log(zip.entries());
                    });
                    zip.on('entry', function (entry) {
                        const pathname = path.resolve('./tmp', entry.name);
                        if (/\.\./.test(path.relative('./tmp', pathname))) {
                            console.warn("[zip warn]: ignoring maliciously crafted paths in zip file:", entry.name);
                            return;
                        }

                        if ('/' === entry.name[entry.name.length - 1]) {
                            console.log('[DIR]', entry.name);
                            return;
                        }

                        console.log('[FILE]', entry.name);
                        zip.stream(entry.name, function (err, stream) {
                            if (err) {
                                console.error('Error:', err.toString());
                                return;
                            }

                            stream.on('error', function (err) {
                                console.log('[ERROR]', err);
                            });

                            // example: print contents to screen
                            // stream.pipe(process.stdout);

                            // example: save contents to file
                            fs.mkdir(path.dirname(pathname), {recursive: true}, function () {
                                    stream.pipe(fs.createWriteStream(pathname));
                                }
                            );
                        });
                    });
                });
            } else {
                console.log(`ERROR >> ${res.status}`);
            }
        })
        .catch(err => {
            console.log("Error ", err);
        });
}

(async () => {
    try {
        await asyncFunc();
        console.log('Finished')
    } catch (error) {
        console.error(error);
    }
})();

As I said, the code itself works in that it'll download the zip file and extract the contents—however, my test console.log('Finished') fires just after the Axios get. Here are the results of the order of operations:

Waiting ...
Finished
Download Completed
[FILE] Collection_Results_F4C0B671_51_Page_1.json
[FILE] Collection_Results_F4C0B671_51_Page_2.json
[FILE] Collection_Results_F4C0B671_51_Page_3.json
[FILE] Collection_Results_F4C0B671_51_Page_4.json
[FILE] Collection_Results_F4C0B671_51_Page_5.json
[FILE] Collection_Results_F4C0B671_51_Page_6.json
[FILE] Collection_Results_F4C0B671_51_Page_7.json
All entries read: 7

I've tried reading other articles on Promises and similar questions, and I've tried many options without any luck.

halfer
Keith Petrillo
  • `.on(...)` also does things asynchronously – apple apple Nov 07 '21 at 16:27
  • In your `end` event handler, you switch from promise style to callback style. The promise fulfills as soon as the `then` callback returns - synchronously. Look into how you can promisify those zip streams. – Bergi Nov 07 '21 at 16:51
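
The behavior described in these comments can be reproduced without Axios or a zip file. The following is a minimal sketch, assuming a hypothetical `fakeDownload` emitter that stands in for the response stream. The promise returned by `.then()` fulfills as soon as its callback returns, so the handler registered with `.on(...)` runs later.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical stand-in for a stream: emits 'end' on a later tick.
function fakeDownload() {
    const emitter = new EventEmitter();
    setTimeout(() => emitter.emit('end'), 10);
    return emitter;
}

// Mirrors the structure of the question: the 'end' handler is registered
// inside the then-callback, but nothing waits for it to run.
async function notAwaited(log) {
    return Promise.resolve(fakeDownload()).then(stream => {
        log.push('Waiting ...');
        stream.on('end', () => log.push('Download Completed'));
        // The callback returns here, so the promise fulfills immediately.
    });
}

// Resolves with the messages in the order they were recorded.
async function demo() {
    const log = [];
    await notAwaited(log);
    log.push('Finished');
    // Give the pending 'end' event time to fire before reporting.
    await new Promise(resolve => setTimeout(resolve, 50));
    return log;
}

demo().then(log => console.log(log.join('\n')));
```

In this sketch "Finished" is recorded before "Download Completed", the same ordering as in the question's output.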

2 Answers


A major advantage of using async/await is that you can avoid deeply nested, difficult-to-read code such as yours. It makes much more sense to break this code into functional units. Rather than thinking about all this code as "must be together", think "works better when apart".

So the entry point can call axios, then use .then() to start the file download, .then() to start the unzipping, and .then() to start the function that writes the streams to disk.

You have created a dilemma by using the callback version of StreamZip. It would simplify things a lot if you used the Promise version of the API.
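
As a rough illustration of the Promise version, here is a minimal sketch based on the async API described in the node-stream-zip README (`StreamZip.async`, `extract`, `close`). The function name `extractAll` and its parameters are assumptions made for this example, not part of the library.

```javascript
const fs = require('node:fs');

// Extracts every entry of zipPath into targetDir and resolves with the number
// of extracted entries. The function name and parameters are illustrative.
async function extractAll(zipPath, targetDir) {
    // Loaded lazily so this file can be loaded without the package installed.
    const StreamZip = require('node-stream-zip');

    // Create the target directory before extracting into it.
    await fs.promises.mkdir(targetDir, { recursive: true });

    const zip = new StreamZip.async({ file: zipPath });
    try {
        // Passing null as the entry extracts the whole archive.
        return await zip.extract(null, targetDir);
    } finally {
        await zip.close();
    }
}
```

Inside an async function, `const count = await extractAll(dir, './tmp');` would then only continue once extraction has completed, so a `console.log('Finished')` placed after it runs last.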

Something like the following makes the order of operations easier to reason about.

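  // download, unzip, and writeFile are placeholders for functions that each return a promise
  (async () => {
    try {
      console.log('Starting')
      // Without await, 'Finished' would log immediately and catch would never see a rejection
      await axios.get(all_pages, {responseType: "stream"})
        .then(download)
        .then(unzip)
        .then(writeFile)
      console.log('Finished')
    } catch (error) {
      console.error(error);
    }
  })();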
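
The `download` step in the chain above is left undefined. One way to write it, as a minimal sketch, is with `pipeline` from `node:stream/promises`, so that the promise only fulfills once the file is fully written. The helper name `saveStream` is hypothetical, and `Readable.from` stands in for `res.data` so the sketch runs without a network call.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const { Readable } = require('node:stream');
const { pipeline } = require('node:stream/promises');

// Hypothetical helper: writes a readable stream to filePath and resolves with
// the path only after the write stream has finished (or rejects on error).
async function saveStream(readable, filePath) {
    await pipeline(readable, fs.createWriteStream(filePath));
    return filePath;
}

// Stand-in for the Axios response stream (res.data) so no network is needed.
async function demo() {
    const dir = await fs.promises.mkdtemp(path.join(os.tmpdir(), 'download-'));
    const target = path.join(dir, 'all_pages.zip');
    const saved = await saveStream(Readable.from(['fake ', 'zip ', 'bytes']), target);
    return fs.promises.readFile(saved, 'utf8');
}

demo().then(contents => console.log('Download Completed:', contents));
```

With Axios, `download` could then be something like `res => saveStream(res.data, dir)`, which passes the saved path on to the next `.then()`.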
Randy Casburn
  • Does `StreamZip` have a promise version of the API? Notice also that the OP writes not only a single file, but multiple ones in parallel. – Bergi Nov 07 '21 at 17:17
  • Yes on [promise version](https://github.com/antelle/node-stream-zip#async-api). Unzip and write could be done in one step with `.extract()` method of the API. – Randy Casburn Nov 07 '21 at 17:22

If you want the Finished statement to show up after all the entries are read, why not just add it to this section of the code?

                    zip.on('ready', function () {
                        console.log('All entries read: ' + zip.entriesCount);
                        // console.log(zip.entries());
                        // ADD THE FINISHED STATEMENT HERE
                    });

Edit

Based on the docs, you can do the following once the stream ends.

// Assumes `zip` was created with the async API: new StreamZip.async({ file: ... })
const stm = await zip.stream('path/inside/zip.txt');
stm.on('end', () => {
    zip.close();
    // FINISHED AT THIS POINT ?
})

This is another place where you can say you are done streaming (Finished). Depending on the usage you may not have to close the zip here.
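
If the calling code itself has to wait for that point, the `end` event can be turned into a promise. Below is a minimal sketch using `once` from `node:events`. The `fakeStream` emitter and the `waitForEnd` helper are hypothetical stand-ins for illustration, not part of node-stream-zip.

```javascript
const { EventEmitter, once } = require('node:events');

// Hypothetical stand-in for the stream returned by zip.stream(...).
function fakeStream() {
    const emitter = new EventEmitter();
    setTimeout(() => emitter.emit('end'), 10);
    return emitter;
}

// Resolves only after 'end' has fired; rejects if 'error' is emitted first.
async function waitForEnd(stream) {
    await once(stream, 'end');
}

// Resolves with the messages in the order they were recorded.
async function demo() {
    const log = [];
    const stm = fakeStream();
    stm.on('end', () => log.push('stream ended'));
    await waitForEnd(stm);
    log.push('Finished');
    return log;
}

demo().then(log => console.log(log.join('\n')));
```

With a real readable stream, `end` fires only after the data has been consumed, for example by piping it to a file, so the wait should be combined with whatever consumes the stream.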

Zephyr
  • That doesn't solve my problem in that the code block itself continues. I'm using this on Pipedream so there are other things continuing after my function. – Keith Petrillo Nov 07 '21 at 17:57
  • So when does the finished statement need to show up? After everything is executed on ready? Or after everything is executed after on end? The section being awaited will not wait for the execution of the code blocks inside StreamZip. – Zephyr Nov 07 '21 at 18:08