I am trying to run what I expect is a very common use case:
I need to download a gzip file (of complex JSON datasets) from Amazon S3, and decompress(gunzip) it in Javascript. I have everything working correctly except the final 'inflate' step.
I am using Amazon Gateway, and have confirmed that the Gateway is properly transferring the compressed file (used Curl and 7-zip to verify the resulting data is coming out of the API). Unfortunately, when I try to inflate the data in Javascript with Pako, I am getting errors.
Here is my code (note: response.data is the binary data transferred from AWS):
apigClient.dataGet(params, {}, {})
.then( (response) => {
console.log(response); //shows response including header and data
const result = pako.inflate(new Uint8Array(response.data), { to: 'string' });
// ERROR HERE: 'buffer error'
}).catch ( (itemGetError) => {
console.log(itemGetError);
});
Also tried a version to do it splitting the binary data input into an array by adding the following before the inflate:
const charData = response.data.split('').map(function(x){return x.charCodeAt(0); });
const binData = new Uint8Array(charData);
const result = pako.inflate(binData, { to: 'string' });
//ERROR: incorrect header check
I suspect I have some sort of issue with the encoding of the data and I am not getting it into the proper format for Uint8Array to be meaningful.
Can anyone point me in the right direction to get this working?
For clarity:
- As the code above is listed, I get a buffer error. If I drop the Uint8Array, and just try to process 'result.data' I get the error: 'incorrect header check', which is what makes me suspect that it is the encoding/format of my data which is the issue.
The original file was compressed in Java using GZIPOutputStream with UTF-8 and then stored as a static file (i.e. randomname.gz).
The file is transferred through the AWS Gateway as binary, so it is exactly the same coming out as the original file, so 'curl --output filename.gz {URLtoS3Gateway}' === downloaded file from S3.
I had the same basic issue when I used the gateway to encode the binary data as 'base64', but did not try a whole lot around that effort, as it seems easier to work with the "real" binary data than to add the base64 encode/decode in the middle. If that is a needed step, I can add it back in.
I have also tried some of the example processing found halfway through this issue: https://github.com/nodeca/pako/issues/15, but that didn't help (I might be misunderstanding the binary format v. array v base64).