
I came across this interesting header:

Content-Type: charset=utf-8

Set HTTP header to UTF-8 using PHP

The answerer says that this syntax is defined by RFC 2616, but I am not seeing it in the provided link. Is this valid syntax, and if so where specifically is this defined?

Zombo

2 Answers


The production in RFC 2616 for the Content-Type header is this:

Content-Type   = "Content-Type" ":" media-type

And the media-type production is this:

media-type     = type "/" subtype *( ";" parameter )
type           = token
subtype        = token

That says that while the parameter part (e.g., charset=utf-8) is optional, the type "/" subtype part is not; that is, a media type must have a type followed by a slash followed by a subtype.

So Content-Type: charset=utf-8 isn’t valid syntax per that grammar, and no other normative or authoritative spec defines it as valid either.

RFC 2616 is actually obsoleted by RFC 7231 and several other RFCs (the current HTTP RFCs).

But the corresponding parts of RFC 7231 define essentially the same productions for this case. The production for the value of the Content-Type header is this:

Content-Type = media-type

And the media-type production is this:

media-type = type "/" subtype *( OWS ";" OWS parameter )
type       = token
subtype    = token

And no other spec obsoletes or supersedes that part—RFC 7231 remains authoritative on this.
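
As a rough illustration, the media-type production quoted above can be approximated with a regular expression. This is only a sketch under stated assumptions: isValidMediaType is a hypothetical helper name (not part of any library), token follows the tchar set from RFC 7230, and the quoted-string form of a parameter value is matched more loosely than the RFC defines it.

```javascript
// Sketch of a syntax check for: media-type = type "/" subtype *( OWS ";" OWS parameter )
// token = 1*tchar, with the tchar set taken from RFC 7230.
const TOKEN = "[!#$%&'*+.^_`|~0-9A-Za-z-]+";
// Loose approximation of quoted-string (assumption: any escaped character is accepted).
const QUOTED = '"(?:[^"\\\\]|\\\\.)*"';
// parameter = token "=" ( token / quoted-string )
const PARAM = `${TOKEN}=(?:${TOKEN}|${QUOTED})`;
// OWS is zero or more spaces or horizontal tabs.
const MEDIA_TYPE = new RegExp(`^${TOKEN}/${TOKEN}(?:[ \\t]*;[ \\t]*${PARAM})*$`);

// Hypothetical helper: true only if the value has type "/" subtype,
// optionally followed by parameters.
function isValidMediaType(value) {
  return MEDIA_TYPE.test(value);
}

console.log(isValidMediaType('text/html; charset=utf-8')); // true
console.log(isValidMediaType('charset=utf-8'));            // false: no type "/" subtype
console.log(isValidMediaType('image; charset=utf-8'));     // false: no subtype
```

Both rejected values fail for the reason the grammar gives: parameters may be omitted, but type "/" subtype may not.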


Most programming languages have good media-type parsing libraries that can be used for syntax checking. For example, with the content-type package for Node.js:

npm install content-type
node -e "var ct = require('content-type'); ct.parse('charset=utf-8')"
=> TypeError: invalid media type
node -e "var ct = require('content-type'); ct.parse('image; charset=utf-8')"
=> TypeError: invalid media type
sideshowbarker
  • Thanks. I did some testing, and while the type is certainly required, it appears the subtype is not required – Zombo Feb 02 '17 at 05:12
  • What did you test with? The subtype is required per the HTTP specs at least. – sideshowbarker Feb 02 '17 at 05:15
  • I am using the program MHonArc – Zombo Feb 02 '17 at 05:16
  • OK so I guess then MHonArc doesn’t require the subtype. But it’s certainly not safe to assume other tools don’t. Any tool that implements a content-type parser that conforms to the RFCs will likely fail to parse anything that doesn’t have a subtype (for one example, see the edit to my answer). – sideshowbarker Feb 02 '17 at 05:23

No, I cannot find such a Content-Type value defined anywhere in RFC 2616 or RFC 7231.

And it doesn't even work in Chrome.

(I tried xhr.setRequestHeader('Content-type','charset=utf-8'). When I then called xhr.send(), no Content-Type header was sent.)

cshu