I'm using the code from this question to archive files with node-archiver and transfer them to S3. My specific task requires me to download a large number of files from URLs, zip them into one archive, and upload that archive to S3.
I'm using the got JavaScript library to download the files:
for (const file of files) {
    const { resultFileName, fileUrl } = getFileNameAndUrl(file);
    if (!fileUrl)
        continue;

    // Stream each remote file straight into the zip archive.
    const downloadStream = got.stream(fileUrl, {
        retry: {
            limit: 5
        }
    });
    archive.append(downloadStream, { name: resultFileName });
}
The rest of the code is pretty much the same as in the original question. The issue is that the script doesn't work well with a huge number of files: at some point it simply finishes execution without having processed them all.
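For reference, the upload side looks roughly like this — just a minimal sketch of what I believe the linked question does, assuming the aws-sdk v2 S3.upload API (which accepts a readable stream as Body); the bucket and key names are placeholders:

const { PassThrough } = require('stream');
const archiver = require('archiver');
const AWS = require('aws-sdk');

const s3 = new AWS.S3();

// The zip output is piped into a PassThrough stream,
// which S3.upload consumes as the object body.
const archive = archiver('zip');
const uploadStream = new PassThrough();
archive.pipe(uploadStream);

const uploadPromise = s3
    .upload({
        Bucket: 'my-bucket',   // placeholder
        Key: 'files.zip',      // placeholder
        Body: uploadStream,
    })
    .promise();

// ... the for-loop above appends the downloaded files ...

archive.finalize();
uploadPromise.then(() => console.log('upload complete'));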
In a perfect world, I want this script to download the files, append them to the archive, and transfer the archive to S3 using pipes, and ideally to download them in batches with limited concurrency (something like Promise.map with the concurrency option in Bluebird). I just don't understand how to do this with streams, since I don't have much experience with them.
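In case it helps frame an answer, this is the rough shape I have in mind — only a sketch under my own assumptions (getFileNameAndUrl is my helper from above, and I'm guessing that waiting on each got stream with finished from stream/promises is a reasonable way to keep only one batch of downloads in flight at a time):

const { finished } = require('stream/promises');

// Sketch: append downloads to the archive in batches of `concurrency`,
// waiting until each batch's streams have been fully consumed before
// starting the next batch (archiver drains appended entries one at a time).
async function appendInBatches(archive, files, concurrency = 5) {
    for (let i = 0; i < files.length; i += concurrency) {
        const batch = files.slice(i, i + concurrency);
        await Promise.all(batch.map(async (file) => {
            const { resultFileName, fileUrl } = getFileNameAndUrl(file);
            if (!fileUrl) return;

            const downloadStream = got.stream(fileUrl, { retry: { limit: 5 } });
            archive.append(downloadStream, { name: resultFileName });

            // Resolves once this download has been read to the end
            // (or rejects if the request fails).
            await finished(downloadStream);
        }));
    }
}

I'm not sure whether waiting on finished(downloadStream) is the right back-pressure signal here, or how an error on a single download should be handled without breaking the whole archive — that is exactly the part I don't understand.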