
I have a device I need to download a file from. In certain cases, the file is served with an incorrect Content-Encoding. In particular, it may be sent with Content-Encoding: gzip when it is not gzipped or compressed in any way.

So, when the file is gzipped, it's simple to get the content using a basic ajax GET:

$.ajax({
    url: 'http://' + IP + '/test.txt',
    type: 'GET'
})
.done(function(data) {
    alert(data);
});

But this fails, as you might expect, when the content-encoding is wrong.

To be clear, I'm not looking for a solution to bypass the ERR_CONTENT_DECODING_FAILED when simply navigating to the given url in a browser. I want to be able to load, for instance, a csv, into a string in javascript for further parsing.

Can I GET the file, and force it to skip attempting decoding, or override the content-encoding of the response, or some such?

femtoRgon
  • I would suggest trying to remove gzip from the Accept-Encoding header, but it looks like browsers won't let you do that for some reason. Only thing I can think of is a proxy where you have some server code make that request, which should give you more flexibility in how to build the request and process the response. – Joe Enos Mar 31 '15 at 18:54
  • Reasonable thought with the `Accept-Encoding`, but already gave it a go, and confirmed that it will *always* send back `Content-Encoding: gzip`, regardless. – femtoRgon Mar 31 '15 at 19:16
  • My only other thought then would be to either yell at (or buy cookies for) the people responsible for sending the wrong response header, and get them to fix it. – Joe Enos Mar 31 '15 at 19:28
  • Could always just create a server side solution to handle grabbing the text and return it. – EvilZebra Mar 31 '15 at 19:49
  • Is a try-catch too simple to use here? I think that would work, and catch the expected error when the encoding is incorrect. – Johnathan Ralls Apr 21 '15 at 15:32
  • might be able to use YQL to pull in the CSV, if it doesn't get choked up like the browser does... – dandavis Apr 23 '15 at 07:16

2 Answers


This is simply not possible to do via client-side JavaScript, per the WHATWG's XHR spec, which makes use of the fetch operation from the WHATWG Fetch Standard.

Client-side scripts can only read the response object supplied by the browser environment. The Fetch Standard defines how the browser environment must build a response object's body attribute in step 2 of the fetch operation (note especially substeps 2 through 4):

  1. Whenever one or more bytes are transmitted, let bytes be the transmitted bytes and run these subsubsteps:

    1. Increase response's body's transmitted with bytes' length.

    2. Let codings be the result of parsing Content-Encoding in response's header list.

    3. Set bytes to the result of handling content codings given codings and bytes.

    4. Push bytes to response's body.

Where the action handling content codings is:

To handle content codings given codings and bytes, run these substeps:

  1. If codings are not supported, return bytes.

  2. Return the result of decoding bytes with the given codings as explained in HTTP.

From this definition, we can see that a response object never exposes encoded bytes in its body property. Before bytes can be added to the body, they must first be decoded. A client script never has access to what the spec calls "transmitted bytes" (i.e., the actual encoded bytes sent over the wire).

Decoding is determined exclusively by the Content-Encoding header. There is no mechanism by which client-side JavaScript can manipulate the response headers of a response object, so Content-Encoding must be whatever the server originally sent.
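To make the failure mode concrete, here is a rough Node.js sketch of the "handle content codings" steps quoted above. It is not the browser's actual implementation: `handleContentCodings` and the `DECODERS` table are hypothetical names, and only gzip and deflate are treated as supported. It shows that correctly labeled gzip bytes decode, while plain bytes labeled as gzip make the decoder throw, which is the step a client script has no way to skip.

```javascript
const zlib = require('zlib');

// Codings this sketch treats as "supported" (a real browser supports more).
const DECODERS = new Map([
  ['gzip', (bytes) => zlib.gunzipSync(bytes)],
  ['deflate', (bytes) => zlib.inflateSync(bytes)],
]);

// Hypothetical helper mirroring the spec's "handle content codings" steps.
function handleContentCodings(codings, bytes) {
  // Step 1: if codings are not supported, return bytes unchanged.
  if (!codings.every((coding) => DECODERS.has(coding))) return bytes;
  // Step 2: decode. Content-Encoding lists codings in the order they were
  // applied, so they are undone in reverse order.
  return codings.reduceRight((acc, coding) => DECODERS.get(coding)(acc), bytes);
}

const plain = Buffer.from('a,b,c\n1,2,3\n');
const gzipped = zlib.gzipSync(plain);

// Correctly labeled: the decoded CSV text comes back.
console.log(handleContentCodings(['gzip'], gzipped).toString());

// Mislabeled: the body is plain text but the header claims gzip.
try {
  handleContentCodings(['gzip'], plain);
} catch (err) {
  console.log('decode failed:', err.message);
}
```

In the browser this decoding happens before any script sees the body, so there is no equivalent of the `try`/`catch` above available to page code.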

What your server is doing is wrong. Your only options are:

  1. Fix the behavior of the server.

  2. Run the HTTP response through a proxy that fixes the Content-Encoding response header before it reaches your client.
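For option 2, a minimal Node.js sketch of such a proxy might look like the following. Everything here is an assumption for illustration: the names `looksGzipped`, `fixHeaders`, and `createProxy`, the device address, and the port are all made up, and the gzip check relies only on the two magic bytes (0x1f 0x8b) that start every gzip stream. Node's `http` client does not decode response bodies, so the proxy sees the bytes exactly as the device sent them.

```javascript
const http = require('http');

// gzip streams always begin with the two magic bytes 0x1f 0x8b.
function looksGzipped(body) {
  return body.length >= 2 && body[0] === 0x1f && body[1] === 0x8b;
}

// Copy the device's headers, dropping a "Content-Encoding: gzip" claim
// that the body does not back up. Node lower-cases incoming header names.
function fixHeaders(headers, body) {
  const fixed = { ...headers };
  if (fixed['content-encoding'] === 'gzip' && !looksGzipped(body)) {
    delete fixed['content-encoding'];
  }
  // The proxy buffers the whole body, so send it with an explicit length.
  delete fixed['transfer-encoding'];
  fixed['content-length'] = String(body.length);
  return fixed;
}

// Hypothetical proxy: forwards each request path to the device as a GET
// and relays the raw bytes with corrected headers.
function createProxy(deviceHost) {
  return http.createServer((req, res) => {
    http.get({ host: deviceHost, path: req.url }, (deviceRes) => {
      const chunks = [];
      deviceRes.on('data', (chunk) => chunks.push(chunk));
      deviceRes.on('end', () => {
        const body = Buffer.concat(chunks);
        res.writeHead(deviceRes.statusCode, fixHeaders(deviceRes.headers, body));
        res.end(body);
      });
    }).on('error', () => {
      res.writeHead(502);
      res.end('device unreachable');
    });
  });
}

// Usage (address and port are placeholders):
// createProxy('192.168.0.10').listen(8080);
```

The page would then request `/test.txt` from the proxy instead of the device. If the proxy is not on the same origin as the page, it would also need to send CORS headers, which this sketch leaves out.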

apsillers

In a modern browser-based environment, you can't alter Accept-Encoding, thanks to the header restrictions browsers impose on XMLHttpRequest (Accept-Encoding is a forbidden header name that scripts cannot set):

Link to Google's explanation

For your brain-dead device, the best workaround is a server-side proxy that fetches the content and ignores the incorrect encoding, and then returns the results with a sane set of headers.
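One way the "ignore the incorrect encoding" step could look inside such a proxy, sketched in Node.js: inspect the raw bytes instead of trusting the device's header, decompress only if the body really is gzip, and build a fresh header set. `normalizeDeviceResponse` is a hypothetical helper, and the `text/csv` default is an assumption based on the CSV mentioned in the question.

```javascript
const zlib = require('zlib');

// Hypothetical helper: turn the raw bytes received from the device into a
// body plus headers that describe it honestly.
function normalizeDeviceResponse(rawBody, contentType = 'text/csv; charset=utf-8') {
  // Trust the bytes, not the device's Content-Encoding header:
  // every gzip stream starts with the magic bytes 0x1f 0x8b.
  const isGzip = rawBody.length >= 2 && rawBody[0] === 0x1f && rawBody[1] === 0x8b;
  const body = isGzip ? zlib.gunzipSync(rawBody) : rawBody;
  return {
    headers: {
      'content-type': contentType,            // assumed; adjust per file type
      'content-length': String(body.length),  // no Content-Encoding at all
    },
    body,
  };
}

// Both a genuinely gzipped file and a mislabeled plain file come out the same.
const csv = Buffer.from('a,b,c\n1,2,3\n');
console.log(normalizeDeviceResponse(zlib.gzipSync(csv)).body.toString());
console.log(normalizeDeviceResponse(csv).body.toString());
```

Because the proxy always returns uncompressed bytes with no Content-Encoding header, the original `$.ajax` call works unchanged once it points at the proxy.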

McCroskey