What encoding should I use for an HTTP servlet input stream if none was specified?

Question

While reading a ServletInputStream my team was doing something like this:

br = new BufferedReader(new InputStreamReader(servletInputStream));

Unsurprisingly, this raised a red flag in my code analyzer: the encoding is not specified, so the reader relies on whatever the system default encoding happens to be.

The first step would be to try to get the encoding from the request:

String encoding = request.getCharacterEncoding();
if (encoding != null) {
  // The charset name belongs to InputStreamReader, not BufferedReader
  br = new BufferedReader(new InputStreamReader(servletInputStream, encoding));
}

However, as this related answer told me, most browsers don't send the encoding, which will cause the encoding to be null above. In that case, how on earth am I supposed to know what the encoding is?

Do browsers not send the encoding because:

1. There is a universally agreed default encoding for HTTP requests, which is used if none was specified? (If so, what is it, and where is the standard that says it should always be used?)
2. There is some other way to determine what the encoding is? (If so, what is it? Surely not just trying different encodings and seeing whether you get garbage or something parseable?)

@andrucz I am a server....I don't see the HTML content. I only see the request which came in, which includes a request body (and headers, URI etc). — Adam Burley, Feb 08 '16 at 19:40

score 3 · Answer 1 · answered Feb 08 '16 at 20:01

Since it's a dynamic web application, you'd be expected to have some control over the encoding in which clients post the request. Usually it's UTF-8.
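As a rough sketch of that advice (not the answerer's own code): use the encoding the request declares when there is one, and otherwise fall back to a default the application has agreed on with its clients. The class name `RequestBodyReader` and its helper methods are hypothetical, and UTF-8 as the fallback is an assumption taken from the "usually" above. In a servlet, the `declaredEncoding` argument would be the value of `request.getCharacterEncoding()` and `body` would be the `ServletInputStream`.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;

public class RequestBodyReader {

    // Assumed application-wide default; pick whatever your clients actually send.
    static final Charset FALLBACK = StandardCharsets.UTF_8;

    // declaredEncoding is the (possibly null) value from request.getCharacterEncoding().
    static Charset resolveCharset(String declaredEncoding) {
        if (declaredEncoding == null || declaredEncoding.isEmpty()) {
            return FALLBACK;
        }
        try {
            return Charset.forName(declaredEncoding);
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            // The client declared a charset this JVM does not know; use the fallback.
            return FALLBACK;
        }
    }

    // Wraps the raw body stream in a reader with an explicit charset,
    // so the platform default encoding is never used.
    static BufferedReader openReader(InputStream body, String declaredEncoding) {
        return new BufferedReader(
                new InputStreamReader(body, resolveCharset(declaredEncoding)));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a request body sent without a declared charset.
        byte[] bytes = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        try (BufferedReader br = openReader(new ByteArrayInputStream(bytes), null)) {
            System.out.println(br.readLine());
        }
    }
}
```

Passing a `Charset` rather than a charset name to `InputStreamReader` also avoids the checked `UnsupportedEncodingException`. Whether to silently fall back on an unknown charset name, as this sketch does, or reject the request is a design choice for the application.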

Ravindra HV
