I see a chicken-and-egg problem when an HTML meta tag specifies the charset as, say, UTF-16: how do we decode the entire HTTP request in the first place if we don't already know it is UTF-16 data? I believe the HTTP headers need to handle this, so that by the time we read metadata such as charset="utf-16" in the HTML we already know the data is UTF-16.
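To make the problem concrete, here is a small Python sketch of my own (an illustration only, not what any browser actually runs). It encodes the same meta tag as UTF-8 and as UTF-16 little-endian without a BOM, then looks for the ASCII bytes `charset=` in each. The variable names are just mine.

```python
# The same declaration, encoded two different ways.
tag = '<meta charset="utf-16">'

as_utf8 = tag.encode("utf-8")       # ASCII-compatible: one byte per character here
as_utf16 = tag.encode("utf-16-le")  # two bytes per character, no BOM

# A naive ASCII-level scan finds the declaration in the UTF-8 bytes...
found_in_utf8 = b"charset=" in as_utf8

# ...but not in the UTF-16 bytes, where every character is followed by a zero byte.
found_in_utf16 = b"charset=" in as_utf16

print(found_in_utf8, found_in_utf16)
print(as_utf16[:8])
```

So a reader that scans the bytes as ASCII can find the tag in the UTF-8 case, but in the UTF-16 case it would seemingly need to know the encoding before it could even see the tag, which is exactly my confusion.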
Besides, going one level higher: is header information such as the request headers always passed in ASCII as a standard?
I mean, at some level we have to agree on something; you can't put the information needed to decode the data inside that same data as metadata. Can anyone clarify this? I am confused by the idea of specifying, as metadata inside the data itself, the very thing that is needed to interpret all of the data.
In general, how can any form of encoding work if we don't have an agreed-upon standard language/encoding in which to convey the data about the data itself?
For example, I am told that Apache uses ISO-8859-1 as its default. So would every client need to assume that for the HTTP headers, and then interpret the actual content as UTF-8 if UTF-8 is what the Content-Type specifies?
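Here is roughly the two-step process I have in mind, as a Python sketch. The raw response is made up by hand, and two things are my assumptions rather than facts I have confirmed: that the header block can always be read as ISO-8859-1 (which is ASCII-compatible and maps every byte to a character), and that ISO-8859-1 is the fallback when no charset is given.

```python
# Hypothetical raw response, built by hand for illustration.
body_text = "naïve café"
raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"\r\n"
) + body_text.encode("utf-8")

# Step 1: split the header block from the body at the first blank line.
head, _, body = raw.partition(b"\r\n\r\n")

# Assumption: headers are read as ISO-8859-1. This decode cannot fail,
# because every byte value maps to some character in that encoding.
header_lines = head.decode("iso-8859-1").split("\r\n")[1:]  # skip the status line

headers = {}
for line in header_lines:
    name, _, value = line.partition(":")
    headers[name.strip().lower()] = value.strip()

# Step 2: pull the charset parameter out of Content-Type.
charset = "iso-8859-1"  # assumed fallback when no charset is declared
for param in headers.get("content-type", "").split(";")[1:]:
    key, _, val = param.partition("=")
    if key.strip().lower() == "charset":
        charset = val.strip().strip('"')

# Only now is the body decoded, using the encoding the headers announced.
decoded = body.decode(charset)
print(charset, decoded)
```

If this is how it works, the bootstrap encoding for the headers is fixed by convention and only the body encoding is negotiable. Is that right?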
A closely related question: "What character encoding should I use for a HTTP header?"