While reading a ServletInputStream, my team was doing something like this:
br = new BufferedReader(new InputStreamReader(servletInputStream));
Unsurprisingly, my code analyzer flagged this: no encoding is specified, so the reader relies on whatever the platform default encoding happens to be.
The first step would be to try to get the encoding from the request:
String encoding = request.getCharacterEncoding();
if (encoding != null) {
    br = new BufferedReader(new InputStreamReader(servletInputStream, encoding));
}
However, as this related answer told me, most browsers don't send the encoding, which leaves the encoding null in the code above. In that case, how on earth am I supposed to know what the encoding is?
Do browsers not send the encoding because:
- There is a universally agreed default encoding for HTTP requests that applies when none is specified? (If so, what is it, and which standard says it should always be used?) Or,
- There is some other way to determine the encoding? (If so, what is it? Surely not just trying different encodings and seeing whether you get garbage or something parseable?)
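To make the null case concrete, here is a minimal sketch of what I have in mind for now. The class and method names (RequestReaders, resolveCharset, openReader) are made up for illustration, and the UTF-8 fallback is purely a placeholder assumption on my part, not something I have found in any standard; which value belongs there is exactly what I am asking. The declaredEncoding argument stands in for the result of request.getCharacterEncoding().

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RequestReaders {

    // Placeholder fallback: an assumption of this sketch, not a documented default.
    static final Charset FALLBACK = StandardCharsets.UTF_8;

    // Returns the declared charset if it is present and supported, else the fallback.
    static Charset resolveCharset(String declaredEncoding) {
        if (declaredEncoding == null) {
            return FALLBACK;
        }
        try {
            return Charset.forName(declaredEncoding);
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported charset names.
            return FALLBACK;
        }
    }

    // Wraps the stream in a reader that always has an explicit charset,
    // so the platform default is never used implicitly.
    static BufferedReader openReader(InputStream in, String declaredEncoding) {
        return new BufferedReader(new InputStreamReader(in, resolveCharset(declaredEncoding)));
    }

    public static void main(String[] args) throws IOException {
        // Simulates a request body whose sender declared no encoding.
        byte[] body = "na\u00efve".getBytes(StandardCharsets.UTF_8);
        try (BufferedReader br = openReader(new ByteArrayInputStream(body), null)) {
            System.out.println(br.readLine());
        }
    }
}
```

In the servlet this would be called as openReader(servletInputStream, request.getCharacterEncoding()). It keeps the analyzer quiet because the charset is always explicit, but it only moves the guess into one named constant rather than answering the question.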