I'm trying to download a report from an API, but the data comes back as a zip archive that contains a folder, which in turn holds a dozen or so gzipped JSON files. For clarity, the structure looks like this:
report.zip/
├── reportID/ <- this is a regular folder
│ ├── reportID_123.json.gz
│ ├── reportID_456.json.gz
│ ├── reportID_789.json.gz
│ └── reportID_159.json.gz
I'm trying to unzip the outer archive, then decompress each individual file, and finally loop through, read the contents of each JSON file, and add them all to a single object. But I'm having two issues.
The first is that while the first part of the code below works, unzipping the outer archive and collecting the name of each gzipped JSON file, and the second part does decompress them, every decompressed file turns out to be exactly the same as the one before it (which isn't the case when I do the whole thing manually).
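For context, the top of the file looks roughly like this (the API call that downloads the archive to tempFilePath is omitted):

const AdmZip = require('adm-zip');
const fs = require('fs');
const os = require('os');
const path = require('path');
const zlib = require('zlib');

const unzip = zlib.createGunzip(); // gunzip stream, created once and reused in the loop below

And here's the code itself: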
// Unzip the outer archive into the temp output folder and collect the entry names
var zip = new AdmZip(tempFilePath);
var zipEntries = zip.getEntries(); // an array of ZipEntry records

const allZips = [];
const tempOutputPath = path.join(os.tmpdir(), 'output');
zip.extractAllTo(/*target path*/ tempOutputPath, /*overwrite*/ true);

zipEntries.forEach(function (zipEntry) {
  allZips.push(zipEntry.entryName);
});
console.log(allZips);

// Gunzip each extracted .json.gz file, then read the resulting JSON
const allData = [];
for (var i = 0; i <= allZips.length; i++) {
  const zippedFileName = path.join(tempOutputPath, allZips[i]);
  const finalOutputName = path.join(tempOutputPath, allZips[i].replace('.gz', ''));
  console.log(zippedFileName);

  const inp = fs.createReadStream(zippedFileName);
  const out = fs.createWriteStream(finalOutputName);
  inp.pipe(unzip).pipe(out);
  console.log('File piped successfully');
  console.log(finalOutputName);

  let rawData = fs.readFileSync(finalOutputName);
  let data = rawData.toString();
  console.log(data);
  allData.push(data);
}
The second problem is that even while looping through, it only actually extracts the data from some of the files, seemingly at random, considering the code and the files are identical apart from their names. It might have something to do with the fact that, as soon as the loop finishes, I get the following errors, despite the end of the loop also being the end of my code:
(node:15772) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 end listeners added. Use emitter.setMaxListeners() to increase limit
(node:15772) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 unpipe listeners added. Use emitter.setMaxListeners() to increase limit
(node:15772) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 drain listeners added. Use emitter.setMaxListeners() to increase limit
(node:15772) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 error listeners added. Use emitter.setMaxListeners() to increase limit
(node:15772) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 close listeners added. Use emitter.setMaxListeners() to increase limit
(node:15772) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 finish listeners added. Use emitter.setMaxListeners() to increase limit
(node:15772) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 data listeners added. Use emitter.setMaxListeners() to increase limit
(node:15772) UnhandledPromiseRejectionWarning: TypeError [ERR_INVALID_ARG_TYPE]: The "path" argument must be of type string. Received type undefined
One last thing in case it's relevant: this all seems to work better (although still not perfectly) when I hardcode the file names instead of using temporary file paths. Unfortunately I have to use temp file paths, as this is for a Cloud Function that doesn't allow writing to regular paths.
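To be concrete, both paths live under os.tmpdir(), roughly like this (the archive's file name here is just a placeholder):

const tempFilePath = path.join(os.tmpdir(), 'report.zip'); // downloaded archive (placeholder name)
const tempOutputPath = path.join(os.tmpdir(), 'output');   // extraction target used in the code above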