This page on Mozilla Developer Network, which is usually not too bad in quality, states:

* matches any content encoding not already listed in the header. This is the default value if the header is not present. It doesn't mean that any algorithm is supported; merely that no preference is expressed.

Now I found that Elasticsearch goes ahead and sends gzip when I tell it Accept-Encoding: * but plain data when I leave out the header.

It seems to me that this means that both sentences are wrong:

This is the default value if the header is not present.

In that case the behavior should be identical whether Accept-Encoding: * or no header at all is given.

It doesn't mean that any algorithm is supported; merely that no preference is expressed.

It seems that to Elasticsearch it means exactly that: It's fine to send gzip.

Am I misunderstanding what MDN means? Is the information on that page simply wrong (it has an Edit button, after all)? Or is Elasticsearch doing something it's not supposed to do?

AndreKR

1 Answer

And what is the wrong behaviour here?

Edit: the exact expected behaviour is defined in RFC 7231, section 5.3.4 (https://www.rfc-editor.org/rfc/rfc7231#section-5.3.4), which replaces the now-obsolete RFC 2616, section 14.3 (https://www.rfc-editor.org/rfc/rfc2616#section-14.3).

My understanding is that if you (the HTTP client) tell Elasticsearch that you accept any content encoding, the server is free to choose whichever encoding it prefers for its response (plain text or gzip, for example). The client then checks the Content-Encoding response header to decode the data correctly.
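To illustrate the client side of this, here is a minimal sketch of decoding a response body according to its Content-Encoding header. The function name `decode_body` is my own, it only handles gzip and deflate, and it is not taken from any particular HTTP library:

```python
import gzip
import zlib
from typing import Optional


def decode_body(body: bytes, content_encoding: Optional[str]) -> bytes:
    """Undo the content codings named in a Content-Encoding header value."""
    if not content_encoding:
        # No header (or an empty one): the body was sent without any coding.
        return body
    codings = [c.strip().lower() for c in content_encoding.split(",") if c.strip()]
    # Codings are listed in the order they were applied, so undo them in reverse.
    for coding in reversed(codings):
        if coding in ("gzip", "x-gzip"):
            body = gzip.decompress(body)
        elif coding == "deflate":
            # In HTTP, "deflate" means the zlib format.
            body = zlib.decompress(body)
        elif coding == "identity":
            continue
        else:
            raise ValueError(f"unsupported content coding: {coding}")
    return body
```

A client that sends `Accept-Encoding: *` would in principle need a branch like this for every coding the server might pick, which is why the `ValueError` case matters.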

Looking at the two sentences more closely:

This is the default value if the header is not present.

If the Accept-Encoding header is not present, that is equivalent to sending Accept-Encoding: *, which means that the server may use any content encoding it wishes. It does not mean that the server must always use the same encoding scheme; it means the server is free to choose the one it wants.

It doesn't mean that any algorithm is supported; merely that no preference is expressed.

This sentence applies to the client (not the server). When using *, the client just says to the server "oh, whatever encoding you will use, that's fine by me. Feel free to use any you want."
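The role of `*` as a fallback for codings that are not listed explicitly can be sketched as follows. This is my simplified reading of RFC 7231, section 5.3.4, not code from Elasticsearch or any other server; the helper names are made up, and a malformed q-value would simply raise a `ValueError`:

```python
from typing import Dict, Optional


def parse_accept_encoding(header: str) -> Dict[str, float]:
    """Map each listed coding (or "*") to its q-value, defaulting to 1.0."""
    weights: Dict[str, float] = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        coding, _, params = item.partition(";")
        params = params.strip()
        q = 1.0
        if params.lower().startswith("q="):
            q = float(params[2:])
        weights[coding.strip().lower()] = q
    return weights


def is_acceptable(coding: str, header: Optional[str]) -> bool:
    """Decide whether the client accepts `coding` for the response."""
    if header is None:
        # No Accept-Encoding header at all: any coding is acceptable.
        return True
    weights = parse_accept_encoding(header)
    coding = coding.lower()
    if coding in weights:
        # An explicit entry always wins over the wildcard.
        return weights[coding] > 0
    if "*" in weights:
        # "*" covers every coding that is not listed explicitly.
        return weights["*"] > 0
    # Not listed and no wildcard: only the unencoded form remains acceptable.
    return coding == "identity"
```

With this reading, `is_acceptable("gzip", None)` and `is_acceptable("gzip", "*")` are both true, which matches the claim that the two requests allow the same set of encodings.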

In both cases (no Accept-Encoding header, or Accept-Encoding: *), plain text, gzip or any other encoding scheme is legitimate. As for the Elasticsearch implementation, my guess is the following:

  • As the server, if I receive no Accept-Encoding header I could assume that the client does not even know about content encoding. It is safer to use plain text.
  • As the server, if I receive an Accept-Encoding: * header, that means the client knows about content encoding and is really willing to accept anything. Well, gzip is a good choice to save bandwidth, and it is well supported.
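Expressed as code, the guess above would look roughly like this. To be clear, this is a hypothetical sketch of the reasoning, not the actual Elasticsearch implementation; the function name is invented and q-values are ignored for brevity:

```python
from typing import Optional


def guess_response_encoding(accept_encoding: Optional[str]) -> str:
    """Pick a response coding the way the guess above describes."""
    if accept_encoding is None:
        # No header: the client may not know about content codings at all,
        # so the safe choice is to send the body unencoded.
        return "identity"
    # Keep only the coding names; weights such as ";q=0.5" are dropped here.
    offered = {
        item.split(";")[0].strip().lower()
        for item in accept_encoding.split(",")
        if item.strip()
    }
    if "gzip" in offered or "*" in offered:
        # The client asked for gzip or said it accepts anything.
        return "gzip"
    return "identity"
```

So `guess_response_encoding(None)` gives `"identity"` while `guess_response_encoding("*")` gives `"gzip"`, which is the difference observed in the question.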

Note that this is largely my own interpretation: only the original Elasticsearch developers could say for certain.

If you support only a limited set of content encodings, you should not use *. Instead, explicitly list the encodings you support.
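One way to keep that list honest is to derive the header from the decoders the client actually has. A small sketch, where `DECODERS` and `accept_encoding_header` are illustrative names of my own:

```python
import gzip
import zlib
from typing import Callable, Dict

# The codings this client can really decode, in order of preference.
DECODERS: Dict[str, Callable[[bytes], bytes]] = {
    "gzip": gzip.decompress,
    "deflate": zlib.decompress,
}


def accept_encoding_header(decoders: Dict[str, Callable[[bytes], bytes]] = DECODERS) -> str:
    """Build an Accept-Encoding value that lists only the supported codings."""
    return ", ".join(decoders)


# Example request headers a client could send.
headers = {"Accept-Encoding": accept_encoding_header()}
```

Adding a new decoder to the dictionary then automatically advertises it to the server, and nothing is advertised that cannot be decoded.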

Emmanuel Guiton
  • According to that page a) the asterisk does *not* mean "I accept any content encoding" and b) the behavior should be the same as when I don't give the header at all. – AndreKR Feb 07 '17 at 07:54
  • a) Any content encoding that is not yet listed in the header: that is a fallback, like a default case in a switch. If you did not list any content encoding explicitly, then "any other content encoding" does mean any content encoding. b) I haven't read that the behaviour should be the same in both cases. That seems to be up to the server, which `selects one of the proposals, uses it and informs the client of its choice with the Content-Encoding response header` – Emmanuel Guiton Feb 07 '17 at 07:59
  • You can edit your answer, by the way, which is usually better than a long comment. Anyway, I don't quite get what you mean, can you edit your answer and refer to the two sentences separately? – AndreKR Feb 07 '17 at 08:42
  • So, as a server, when there is no Accept-Encoding header, I can just send gzip? Is there any server that actually does that? – AndreKR Feb 07 '17 at 09:22
  • That is my understanding of the Mozilla Developer Network page you provided. Note that the server must set the `Content-Encoding` header, so the client is not left without any means of understanding the response. As for implementations, have a look at Apache's [mod_deflate](https://httpd.apache.org/docs/current/mod/mod_deflate.html): it enables gzip compression before sending the data to the client, though I have not found out exactly how it behaves when no `Accept-Encoding` header is set. By the way, I added a link to RFC 2616 in the answer; you should have a look at it. – Emmanuel Guiton Feb 07 '17 at 10:41
  • @EmmanuelGuiton While your answer is correct (save for `Content-Encoding`, where you probably mean `Accept-Encoding`), RFC 2616 has been obsolete for several years now. Please refer to [RFC 7231, sec. 5.3.4](https://tools.ietf.org/html/rfc7231#section-5.3.4) in this case. – DaSourcerer Feb 07 '17 at 13:25
  • Oops. You're right. I fixed that and I added the right RFC. Thanks. – Emmanuel Guiton Feb 07 '17 at 14:00
  • Cool. I can upvote your answer with clear conscience then ☺ – DaSourcerer Feb 07 '17 at 15:30