I have run into an issue using API Gateway as a proxy to S3 (for custom authentication): it does not handle binary data well, which is a known issue.
I'm usually uploading either .gz or .Z (Unix compress utility) files. As far as I understand it, the binary data is corrupted by an encoding step somewhere inside API Gateway, and I can't find a way to decode it back to the original bytes.
Original leading bytes: \x1f\x8b\x08\x08\xb99\xbeW\x00\x03
After passing through API GW: ��9�W� (followed by the filename and the rest of the data)
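For what it's worth, the mangling looks consistent with the body being decoded as UTF-8 text somewhere along the way. Here is a small sketch of that hypothesis (the mechanism is my guess, not anything confirmed by AWS):

```python
# Sketch of the suspected corruption: interpreting raw gzip bytes as UTF-8 text.
original = b"\x1f\x8b\x08\x08\xb99\xbeW\x00\x03"

# Bytes that are not valid UTF-8 get replaced with U+FFFD ('�'), which is
# exactly the character showing up in the proxied object. The replacement
# throws away the original byte values, so it cannot be reversed.
mangled = original.decode("utf-8", errors="replace")
print(mangled)                               # '\x1f�\x08\x08�9�W\x00\x03'
print(mangled.encode("utf-8") == original)   # False: information is lost
```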
One way of getting around this is to set the Content-Encoding header of the PUT request to API GW to 'gzip'. This seems to force API GW to decompress the file before forwarding it to S3, so the uncompressed data arrives intact.
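In case it helps, this is roughly how I'm sending the file (a minimal sketch using Python's requests library; the endpoint URL, bucket, and key are placeholders):

```python
import requests

# Placeholder endpoint for the API GW -> S3 proxy; substitute your own stage/path.
API_URL = "https://abc123.execute-api.us-east-1.amazonaws.com/prod/my-bucket/data.gz"

with open("data.gz", "rb") as f:
    body = f.read()

# With Content-Encoding: gzip set, API GW appears to decompress the payload
# before forwarding it, so S3 ends up with the uncompressed data intact.
resp = requests.put(
    API_URL,
    data=body,
    headers={
        "Content-Encoding": "gzip",
        "Content-Type": "application/octet-stream",
    },
)
print(resp.status_code, resp.text)
```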
The same approach does not work for .Z files produced by the Unix compress utility, for which the corresponding Content-Encoding value would be 'compress'.
Does anyone have any insight into what is happening to the data that might shed some light on my issue? And does anyone know of a workaround that preserves the binary data on its way through API GW (or a way to decode it once it's in S3)?
Obviously I could just access the S3 API directly (or have API GW return a pre-signed URL for accessing the S3 API), but there are a few reasons why I don't want to do that.
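(For context, the pre-signed URL approach I'd rather avoid would look something like the following with boto3; the bucket and key names are placeholders:)

```python
import boto3

s3 = boto3.client("s3")

# API GW (or a Lambda behind it) would hand this URL back to the client,
# which then PUTs the raw bytes straight to S3, bypassing the proxy entirely.
url = s3.generate_presigned_url(
    "put_object",
    Params={"Bucket": "my-bucket", "Key": "uploads/data.gz"},
    ExpiresIn=300,  # seconds until the URL expires
)
print(url)
```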
I should mention that I don't know much about character encodings, so apologies if some of my questions have obvious answers.