I am utterly confused by the answers I have seen on Stack Overflow and by the Java docs:
- https://docs.oracle.com/javase/specs/jls/se7/html/jls-3.html#jls-3.1
- What is the character encoding of String in Java?
- https://docs.oracle.com/javase/7/docs/api/java/nio/charset/Charset.html
While the docs and the Stack Overflow question linked above all seem to say that UTF-16 is the native character encoding of Java, there is another claim that it depends on the JVM/OS. For example, the Charset documentation linked above says:
Every instance of the Java virtual machine has a default charset, which may or may not be one of the standard charsets. The default charset is determined during virtual-machine startup and typically depends upon the locale and charset being used by the underlying operating system.
Then, in another section of the same page, it says:
The native character encoding of the Java programming language is UTF-16.
I am finding it difficult to reconcile these apparently contradictory statements:
- one says the charset depends on the OS
- the other (I infer) says that, regardless of the OS, UTF-16 is the charset for Java (this is also what all of the links I have mentioned above say)
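To make the second statement concrete, here is a minimal sketch, assuming "native character encoding" refers to how a String exposes its text through `char` values (UTF-16 code units) independently of the OS. The class and method names are made up for illustration; only the `String` methods are real API.

```java
final class Utf16Demo {
    // Number of UTF-16 code units (Java chars) in the string.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points, counting a surrogate pair as one.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+1F600 lies outside the Basic Multilingual Plane, so UTF-16
        // represents it as a surrogate pair of two chars.
        String grin = "\uD83D\uDE00";
        System.out.println(codeUnits(grin));  // prints 2
        System.out.println(codePoints(grin)); // prints 1
    }
}
```

If that assumption holds, these two numbers should be the same on every platform, whatever `Charset.defaultCharset()` reports.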
Also, when I run the following code:
package org.sheel.classes;

import java.nio.charset.Charset;

public class Test {
    public static void main(String[] args) {
        // Prints the default charset of the running JVM.
        System.out.println(Charset.defaultCharset());
    }
}
...in an online editor I see UTF-8, but on my local system I see windows-1252.
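For comparison, here is a small sketch of where the default charset seems to matter, assuming it only comes into play when a String is converted to or from bytes. `DefaultCharsetDemo` and `byteLength` are hypothetical names; ISO-8859-1 stands in for a single-byte Western charset because it is one of the standard charsets that every JVM must provide.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class DefaultCharsetDemo {
    // Number of bytes produced when the string is encoded with the given charset.
    static int byteLength(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        String s = "\u00E9"; // the single character 'é'

        // The in-memory length is counted in chars, not bytes.
        System.out.println(s.length());                                // prints 1

        // The byte count depends on the charset chosen for encoding.
        System.out.println(byteLength(s, StandardCharsets.UTF_8));      // prints 2
        System.out.println(byteLength(s, StandardCharsets.ISO_8859_1)); // prints 1
        System.out.println(byteLength(s, StandardCharsets.UTF_16BE));   // prints 2

        // This line varies by machine because it uses the JVM default.
        Charset def = Charset.defaultCharset();
        System.out.println(def + ": " + byteLength(s, def));
    }
}
```

Only the last line would be expected to differ between the online editor and the local Windows machine; the others name the charset explicitly.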
Lastly, there is a JDK Enhancement Proposal (JEP) that proposes changing the default charset to UTF-8.
Could there be an explanation for this confusion?