I'm looking at using JSZip to create a zip file on the fly from various downstream files. Here is pseudocode for what I'm trying to do:

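function* handler () {
    const ids = this.request.body.ids;

    const zip = new JSZip();
    // Fetch each file in turn and add it to the archive.
    for (let i = 0; i < ids.length; i++) {
        const r = yield request.get('/some/remote/service/' + ids[i]);
        zip.file(ids[i], r.body);
    }
    // Stream the generated archive back to the client.
    this.body = zip.generateNodeStream();
}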

But presumably this will require the contents of all the files to be in memory at once, and won't start streaming until all the files are downloaded.

I realize I could optimize download time by doing something like:

const allFiles = yield ids.map((id) => request.get('/some/remote/service/' + id));

for (let i = 0; i < ids.length; i++) zip.file(ids[i], allFiles[i].body);

But mainly I'm hoping for a way to only hold one file in memory at a time and stream the result through zip back to the client. Is that possible with JSZip?

Kevin
  • Are you trying to stream files into a `.zip` file downloaded at client-side? – guest271314 Aug 29 '16 at 19:06
  • It's a little more complicated than what I've represented here, but basically on the server side there might be N files behind a particular id, and the client should receive a single zip file containing all of them. On the backend data store they aren't pre-zipped, so I'll be doing it on the fly. – Kevin Aug 29 '16 at 19:16
  • So to answer your question: yes, the client is giving an id and expecting a zip file back. – Kevin Aug 29 '16 at 19:22
  • See http://stackoverflow.com/questions/37176397/multiple-download-links-to-one-zip-file-before-download-javascript – guest271314 Aug 29 '16 at 19:26
  • I'm not sure if ZIP is a streaming format (where you can concatenate data to the output stream), which would be a requirement for what you're trying to do. – robertklep Aug 29 '16 at 19:52
  • The JSZip docs seem to indicate that it ...kind of is. See https://stuk.github.io/jszip/documentation/api_jszip/generate_async.html documentation on "streamFiles". In particular, ": in a zip file, the size and the crc32 of the content are placed before the actual content : to write it we must process the whole file.", but it also says "When this options is true, we stream the file and use data descriptors at the end of the entry. This option uses less memory but some program might not support data descriptors (and won’t accept the generated zip file)." – Kevin Aug 29 '16 at 21:33

1 Answer

I'm hoping for a way to only hold one file in memory at a time and stream the result through zip back to the client. Is that possible with JSZip?

Not without drawbacks. JSZip doesn't accept (yet) a generator or a function as the content of a file. It expects the full content of the file, a promise of it, or a stream. This means you cannot (yet) trigger the API calls at the last moment, when the content is actually needed.

When you add a stream (with zip.file("filename.txt", myStream)), JSZip will pause it and wait for something to use it (a call to zip.generateNodeStream() or a getter for example).

So what you can do (in the current v3.1.2) to limit memory usage is to add all resources as streams and call zip.generateNodeStream({streamFiles:true}). The main downside is that you open all the connections to your web services up front, so you may hit a timeout (let's say on the last file) while the first files are still streaming (or worse, depending on how .pause() is handled by the input stream).
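As a rough sketch of that approach, the helper below adds one stream per id and then asks JSZip for a streamed archive. The function name zipFromStreams and the openStream callback are hypothetical names introduced for illustration; openStream is assumed to return a Node.js Readable for a given id (for example the response stream of an HTTP request). Only zip.file() and zip.generateNodeStream() are JSZip calls.

```javascript
// Sketch only: `zipFromStreams` and `openStream` are hypothetical names.
// `zip` is expected to be a JSZip instance (v3.1 or later).
function zipFromStreams(zip, ids, openStream) {
    for (const id of ids) {
        // Each stream is handed to JSZip now, but JSZip pauses it until
        // the archive is actually generated.
        zip.file(String(id), openStream(id));
    }
    // streamFiles: true avoids buffering a whole entry to compute its
    // size and crc32 before writing it.
    return zip.generateNodeStream({ type: 'nodebuffer', streamFiles: true });
}

// Possible use in the handler from the question (assumes `jszip` is installed):
//   const JSZip = require('jszip');
//   this.body = zipFromStreams(new JSZip(), ids, openStream);
```

Note that every openStream call happens up front, so the timeout concern described above still applies.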

Regarding streamFiles and the comments on the question: the zip format is well suited for streaming. As said in the documentation, the streamFiles option decides where the size/crc32 of a file inside the zip file is written: before the data (but we need to temporarily hold the file in memory to compute them) or after the data (stream everything, put the computed values at the end).
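To make the "before or after the data" distinction concrete, here is a minimal sketch of the two byte layouts as I understand them from the ZIP format itself. It is not JSZip's internal code; the function names are made up for illustration, and the timestamp fields are simply left at zero.

```javascript
// Bit 3 of the general purpose flags: crc32 and sizes follow the data.
const FLAG_DATA_DESCRIPTOR = 0x0008;

// Local file header for one entry (hypothetical helper, stored/uncompressed).
// When `streamed` is true, crc32 and sizes are written as zero here.
function localFileHeader(name, streamed, crc32, size) {
    const nameBytes = Buffer.from(name, 'utf8');
    const header = Buffer.alloc(30);
    header.writeUInt32LE(0x04034b50, 0);                 // header signature
    header.writeUInt16LE(20, 4);                         // version needed (2.0)
    header.writeUInt16LE(streamed ? FLAG_DATA_DESCRIPTOR : 0, 6);
    header.writeUInt16LE(0, 8);                          // method: stored
    header.writeUInt16LE(0, 10);                         // mod time (zero in this sketch)
    header.writeUInt16LE(0, 12);                         // mod date (zero in this sketch)
    header.writeUInt32LE(streamed ? 0 : crc32, 14);      // crc32
    header.writeUInt32LE(streamed ? 0 : size, 18);       // compressed size
    header.writeUInt32LE(streamed ? 0 : size, 22);       // uncompressed size
    header.writeUInt16LE(nameBytes.length, 26);          // file name length
    header.writeUInt16LE(0, 28);                         // extra field length
    return Buffer.concat([header, nameBytes]);
}

// Data descriptor written after the entry's data when bit 3 is set.
function dataDescriptor(crc32, size) {
    const d = Buffer.alloc(16);
    d.writeUInt32LE(0x08074b50, 0);                      // optional signature
    d.writeUInt32LE(crc32, 4);
    d.writeUInt32LE(size, 8);                            // compressed size
    d.writeUInt32LE(size, 12);                           // uncompressed size
    return d;
}
```

With streamed set to false the writer must know crc32 and size before emitting the header, which is why the whole entry has to be held in memory first; with streamed set to true the header goes out immediately and the values arrive in the descriptor.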

David Duponchel