The character set of a www form POST body is always ASCII because of the embedded percent-encoding, so a charset declaration for application/x-www-form-urlencoded is unnecessary. In fact, specifying a charset for this MIME type is invalid.
So to get from the raw bytes:
0x6b65793d76254333254134254333254241254333254142
to the characters:
key=v%C3%A4%C3%BA%C3%AB
virtually any encoding will work the same, because of ASCII compatibility.
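As a quick check of that claim, here is a minimal sketch that decodes the hex dump above with a few ASCII-compatible encodings. The particular list of encodings is my own choice for illustration.

```python
# The raw request body from the hex dump above.
raw = bytes.fromhex("6b65793d76254333254134254333254241254333254142")

# Decode the same bytes with several ASCII-compatible encodings.
decoded = {
    enc: raw.decode(enc)
    for enc in ("ascii", "utf-8", "iso-8859-1", "cp1252")
}

# Every encoding yields the same, still percent-encoded, string.
for enc, text in decoded.items():
    print(enc, text)
```

All four results are identical because the body only contains bytes below 0x80. An encoding that is not ASCII-compatible (UTF-16, for example) would not give the same result.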
You may notice the data is still encoded. The charset parameter of a request Content-Type only applies to the bytes actually sent ("converting a sequence of octets into a sequence of characters", as they say in the specs), not to the mechanism that turns key=v%C3%A4%C3%BA%C3%AB into key=väúë, which actually involves converting characters into other characters.
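To make the second step concrete, here is a small sketch using Python's urllib.parse.parse_qs, which takes the percent-decoding charset as its own encoding argument, independent of how the body bytes were turned into a string. The variable names are illustrative only.

```python
from urllib.parse import parse_qs

# The body after the (trivial) bytes-to-characters step: pure ASCII.
body = "key=v%C3%A4%C3%BA%C3%AB"

# The encoding argument controls how the percent-decoded octets
# are interpreted, which is the step the Content-Type charset
# does not cover.
as_utf8 = parse_qs(body, encoding="utf-8")
as_latin1 = parse_qs(body, encoding="iso-8859-1")

print(as_utf8)    # the intended value
print(as_latin1)  # same octets, wrong charset: mojibake
```

The same ASCII string gives two different values depending on the charset assumed for the percent-encoded octets, which is exactly the information the request has no standard place to declare.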
The application/x-www-form-urlencoded "specification" in HTML 4 is pretty useless, but HTML5 actually tries. The ultimate default encoding for the percent-encoding is UTF-8, and the encoding name is transferred in the magic _charset_ parameter if available.
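A rough sketch of how a server could honor the _charset_ field, assuming the form actually included one. The function name parse_with_charset_field and the decision to ignore unknown encoding names are my own assumptions, not something the spec prescribes.

```python
import codecs
from urllib.parse import parse_qsl

def parse_with_charset_field(body: str, default: str = "utf-8"):
    """Hypothetical helper: decode a form body using its _charset_ field."""
    # First pass with ISO-8859-1, which never fails, just to find _charset_.
    probe = dict(parse_qsl(body, keep_blank_values=True, encoding="iso-8859-1"))
    encoding = probe.get("_charset_") or default
    try:
        codecs.lookup(encoding)  # ignore names Python does not know
    except LookupError:
        encoding = default
    # Second pass with the declared (or default) encoding.
    return parse_qsl(body, keep_blank_values=True,
                     encoding=encoding, errors="replace")

print(parse_with_charset_field("_charset_=windows-1252&key=v%E4"))
print(parse_with_charset_field("key=v%C3%A4"))
```

The two-pass approach works because the _charset_ value itself is a plain ASCII encoding name, so it reads the same whichever charset is used for the probe.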
So yeah, there still isn't a good, widely used formal way to declare the character encoding for the embedded percent-encoding (and charset in the Content-Type is just invalid, wrong and misunderstood). In practice I would just use UTF-8 and, as it's a very strict scheme, fall back to ISO-8859-1 when decoding fails. ISO-8859-1 maps every byte to a character, so you can always get the original bytes back from it.
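The UTF-8-then-ISO-8859-1 fallback could look something like this sketch. decode_form is a hypothetical name, and treating the body as strict ASCII first is an assumption that follows from the reasoning at the top of this answer.

```python
from urllib.parse import parse_qsl

def decode_form(body: bytes):
    """Hypothetical helper: strict UTF-8 first, ISO-8859-1 as fallback."""
    # A well-formed urlencoded body is pure ASCII.
    text = body.decode("ascii")
    try:
        # errors="strict" makes invalid UTF-8 raise instead of
        # silently inserting replacement characters.
        return parse_qsl(text, keep_blank_values=True,
                         encoding="utf-8", errors="strict")
    except UnicodeDecodeError:
        # ISO-8859-1 accepts every byte value, so this cannot fail,
        # and the original octets can be recovered by re-encoding.
        return parse_qsl(text, keep_blank_values=True,
                         encoding="iso-8859-1")

print(decode_form(b"key=v%C3%A4%C3%BA%C3%AB"))  # valid UTF-8
print(decode_form(b"key=v%E4%FA%EB"))           # not UTF-8, falls back
```

The fallback is safe to apply blindly because text that was really ISO-8859-1 and contains non-ASCII characters almost never happens to be valid UTF-8, while the reverse mistake would go unnoticed.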
For JSON, using any encoding other than UTF-8/16/32 is invalid, with UTF-8 being assumed everywhere. For XML, you can read the Content-Type header, fall back to the encoding attribute of the XML declaration, and ultimately you have to fall back to UTF-8 and declare the document invalid if it doesn't decode.
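For comparison, here is a short sketch of how Python's standard library behaves for both formats when handed raw bytes. It does not cover the Content-Type header step for XML, only the encoding declaration and the UTF-8 default; the sample documents are made up for illustration.

```python
import json
import xml.etree.ElementTree as ET

doc = '{"key": "väúë"}'

# json.loads accepts bytes and detects UTF-8, UTF-16 or UTF-32 itself.
json_results = [json.loads(doc.encode(enc))
                for enc in ("utf-8", "utf-16", "utf-32")]

# Any other encoding is rejected (UnicodeDecodeError is a ValueError).
try:
    json.loads(doc.encode("iso-8859-1"))
    json_latin1_ok = True
except ValueError:
    json_latin1_ok = False

# XML: the parser honors the encoding declaration in the document...
declared = ET.fromstring(
    b'<?xml version="1.0" encoding="ISO-8859-1"?><key>v\xe4\xfa\xeb</key>')

# ...and assumes UTF-8 when there is none, failing if that is wrong.
try:
    ET.fromstring(b"<key>v\xe4\xfa\xeb</key>")
    xml_undeclared_ok = True
except ET.ParseError:
    xml_undeclared_ok = False

print(json_results, json_latin1_ok, declared.text, xml_undeclared_ok)
```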