If I pass a UTF-16 encoded file to the following code, will I get an UnsupportedEncodingException?

    try {
        BufferedReader br = new BufferedReader(new InputStreamReader(in, Charset.forName("UTF-8")));
        String ip;
        while ((ip = br.readLine()) != null){
            //do something
        }
    } catch (UnsupportedEncodingException use) { 
        //when can I expect an exception?
    }

I have tried this with a UTF-16 file, but I do not get any exception. The reader still tries to decode all the characters, which causes it to read more lines than expected. For example, for a sample file with 3 lines the reader returns 5 lines, 2 of which are empty.

Stuart Marks
Syed Ali
  • `UnsupportedEncodingException` is thrown only if the encoding is not **supported**. `UTF-8` and `UTF-16` are both supported and valid encodings. – Anirudha May 09 '14 at 11:24
  • If you want to detect encoding _errors_ you cannot use the "standard" Java classes; you have to go through a `CharsetDecoder`. See also the `CodingErrorAction` class: the default for all classes is `CodingErrorAction.REPLACE`, not `REPORT` – fge May 09 '14 at 11:29
  • Note that, curiously enough, _no class_ in the JDK apart from `CharsetDecoder` allows you to detect encoding errors... Not even a `Reader` class. If you want that you'd have to create your own `Reader` implementation! That kind of sucks. – fge May 09 '14 at 11:33
  • thanks a lot guys and yes +1 for figuring out that I am looking for CharsetDecoder – Syed Ali May 09 '14 at 11:39

1 Answer

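UnsupportedEncodingException is thrown only when the charset name you pass is not supported; it has nothing to do with the content of the stream. It is declared by the APIs that take the charset name as a String, such as the InputStreamReader(InputStream, String) constructor. Charset.forName(), which your code uses, throws the unchecked UnsupportedCharsetException for an unknown name instead. Neither BufferedReader nor InputStreamReader throws either exception because of the bytes they read, so the catch block in your snippet should never run.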
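
To make the distinction concrete, here is a small sketch. The class name CharsetNameDemo, its two helper methods, and the made-up charset name "no-such-charset" are only for illustration. It shows which exception each way of naming a charset raises when the name is unknown.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical demo class; not part of the question's code.
class CharsetNameDemo {

    // Looks the name up with Charset.forName(), as the question's code does.
    static String viaForName(String name) {
        try {
            Charset.forName(name);
            return "ok";
        } catch (UnsupportedCharsetException e) {
            // Unchecked: the compiler does not force you to catch it.
            return "UnsupportedCharsetException";
        }
    }

    // Passes the name as a String to the InputStreamReader constructor.
    static String viaStringConstructor(String name) {
        try {
            // Empty in-memory stream: the content is irrelevant to this check.
            new InputStreamReader(new ByteArrayInputStream(new byte[0]), name);
            return "ok";
        } catch (UnsupportedEncodingException e) {
            // Checked: declared by the String-based constructor.
            return "UnsupportedEncodingException";
        }
    }

    public static void main(String[] args) {
        System.out.println(viaForName("UTF-16"));
        System.out.println(viaForName("no-such-charset"));
        System.out.println(viaStringConstructor("UTF-8"));
        System.out.println(viaStringConstructor("no-such-charset"));
    }
}
```

In both cases the outcome depends only on the name, never on the bytes in the stream.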

icza
  • I can accept this in 10min. So, what should I do to find out if I have got the correctly encoded stream? – Syed Ali May 09 '14 at 11:28
  • Detecting file content encoding: http://stackoverflow.com/questions/499010/java-how-to-determine-the-correct-charset-encoding-of-a-stream – icza May 09 '14 at 11:30
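
As a rough illustration of the CharsetDecoder approach mentioned in the comments on the question, the sketch below builds a UTF-8 decoder that reports malformed input instead of silently replacing it, and hands that decoder to InputStreamReader. The class name StrictUtf8Check and the method isValidUtf8 are hypothetical names chosen for this example, and the sample text is made up.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; a sketch, not a complete encoding detector.
class StrictUtf8Check {

    // Returns true if the whole stream decodes as UTF-8 without errors.
    static boolean isValidUtf8(InputStream in) {
        // REPORT makes the decoder throw instead of substituting U+FFFD.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try (BufferedReader br = new BufferedReader(new InputStreamReader(in, decoder))) {
            while (br.readLine() != null) {
                // A real caller would process each line here.
            }
            return true;
        } catch (CharacterCodingException e) {
            // Raised (as MalformedInputException) on invalid byte sequences.
            return false;
        } catch (IOException e) {
            // Any other I/O failure is unrelated to the encoding.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "one\r\ntwo\r\nthree";
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] utf16 = text.getBytes(StandardCharsets.UTF_16); // starts with a byte order mark
        System.out.println(isValidUtf8(new ByteArrayInputStream(utf8)));  // prints true
        System.out.println(isValidUtf8(new ByteArrayInputStream(utf16))); // prints false
    }
}
```

One caveat: this only catches byte sequences that are illegal in UTF-8. The UTF-16 sample above is rejected because of its byte order mark, but UTF-16 text without a byte order mark that contains only ASCII characters is made of bytes that are also legal UTF-8 (each character plus a NUL byte), so it would pass this check. Strict decoding complements real encoding detection; it does not replace it.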