> I can't use the InputStreamReader option, because the file has to be read with Latin1.

And

> I have this text file which might contain some unsupported characters in the Latin1 character set ...

You have contradictory requirements here.
Either the file is LATIN-1 (and there are no "unsupported characters") or it is not LATIN-1. If it is not LATIN-1, you should be trying to find out what character set / encoding it really is, and use that one instead of LATIN-1 to read the file.
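One rough way to test a guess about the encoding is to decode the raw bytes strictly and see whether the decoder complains. The sketch below is only an illustration: the class name `CharsetCheck` and the helper `decodesCleanly` are made up for this answer. Note that Java's ISO-8859-1 decoder accepts every byte value, so a check like this can rule out UTF-8 but can never rule out LATIN-1.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class CharsetCheck {
    // Hypothetical helper: true if the bytes decode in the given charset
    // without any malformed or unmappable input.
    static boolean decodesCleanly(byte[] data, Charset charset) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "caf" followed by 0xE9, which is e-acute in LATIN-1 but an
        // incomplete multi-byte sequence in UTF-8.
        byte[] sample = {'c', 'a', 'f', (byte) 0xE9};
        System.out.println("valid UTF-8:   " + decodesCleanly(sample, StandardCharsets.UTF_8));
        System.out.println("valid LATIN-1: " + decodesCleanly(sample, StandardCharsets.ISO_8859_1));
    }
}
```

A clean decode only shows that the bytes are legal in that charset, not that the charset is the right one, so treat the result as a hint.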
As other answers / comments have explained, you can either change the JVM's default character set, or specify a character set explicitly when you open the Reader.
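For the second option, here is a minimal sketch of passing the character set explicitly to an `InputStreamReader`. The class name `ExplicitCharsetRead` and the helper `readAll` are invented for illustration, and an in-memory byte array stands in for your file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetRead {
    // Hypothetical helper: reads the whole stream, decoding it with the
    // charset given by the caller rather than the JVM default.
    static String readAll(InputStream in, Charset charset) {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Stand-in for file contents: "caf" plus 0xE9 (e-acute in LATIN-1).
        byte[] latin1Bytes = {'c', 'a', 'f', (byte) 0xE9};
        String text = readAll(new ByteArrayInputStream(latin1Bytes),
                              StandardCharsets.ISO_8859_1);
        System.out.println(text.equals("caf\u00e9"));
    }
}
```

For a real file you would pass something like `Files.newInputStream(path)` in place of the `ByteArrayInputStream`. Done this way, the JVM default never comes into it.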
> I'm having trouble setting the default character set of my JVM.

Please explain what you are trying and what problems you are having.
> (and was a bit afraid of messing it up!)

COWARD! :-)
FWIW - if you try to read a data stream in (say) LATIN-1 and the data stream is not actually in LATIN-1, then you can expect the following:
- Characters that encode the same in LATIN-1 and the actual character set will be passed unharmed.
- Characters that don't encode the same will either be replaced by a character that means "unknown character" (e.g. a question mark), or will be garbled. Which of these happens depends on whether the byte or byte sequence at issue encodes a valid (but wrong) character, or no character at all.
The net result will be partially garbled text. The garbling may or may not be reversible, depending on exactly what the real character set and characters are. But it is best to avoid "going there" ... by using the RIGHT character set to decode in the first instance.
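To make both outcomes concrete, here is a small sketch that decodes the same text with the wrong character set in each direction. The class and method names are made up for this example, and UTF-8 is only an assumed "real" encoding; your file's actual encoding may be something else.

```java
import java.nio.charset.StandardCharsets;

public class MismatchDemo {
    // UTF-8 bytes decoded as LATIN-1: every byte maps to some character,
    // so the result is garbled rather than replaced.
    static String utf8ReadAsLatin1(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8),
                          StandardCharsets.ISO_8859_1);
    }

    // LATIN-1 bytes decoded as UTF-8: bytes that are not valid UTF-8
    // become the replacement character U+FFFD.
    static String latin1ReadAsUtf8(String text) {
        return new String(text.getBytes(StandardCharsets.ISO_8859_1),
                          StandardCharsets.UTF_8);
    }

    // Undoes utf8ReadAsLatin1. This works only because the LATIN-1
    // decoding kept every original byte.
    static String repairLatin1Misread(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1),
                          StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with e-acute

        String garbled = utf8ReadAsLatin1(original);
        System.out.println(garbled.length());                 // 5: the accented letter became two characters
        System.out.println(repairLatin1Misread(garbled).equals(original));

        String replaced = latin1ReadAsUtf8(original);
        System.out.println(replaced.indexOf('\uFFFD') >= 0);  // the original byte is gone
    }
}
```

The first case can be undone because no information was thrown away. The second cannot, because the original byte was replaced during decoding.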