9

How can I get the charset encoding of the console (cmd.exe on Windows, a Linux shell, or the Eclipse console output)? `java.nio.charset.Charset.defaultCharset()` seems to apply only to input/output files, not to the console.

tmp
  • BTW: the correct way to write to the console is via `System.console()` (which knows the encoding), however this does not work with Eclipse as it has a missing console emulation. – eckes Apr 30 '15 at 00:25

4 Answers

4

There is no standardized way to get that information from the system. Usually it will be the platform default encoding, but as you've noticed, that's not necessarily the case (it's not documented, as far as I know).

You could go the ugly route and use reflection to find out which encoding Java uses. The following code is entirely unportable and has only been verified to work on one specific version of OpenJDK. It is an experiment and not meant for production:

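import java.io.OutputStreamWriter;
import java.io.PrintStream;
import java.lang.reflect.Field;

// "charOut" is a private implementation detail of PrintStream; other JDKs may rename it or block access to it.
final Class<? extends PrintStream> stdOutClass = System.out.getClass();
final Field charOutField = stdOutClass.getDeclaredField("charOut");
charOutField.setAccessible(true);
OutputStreamWriter o = (OutputStreamWriter) charOutField.get(System.out);
System.out.println(o.getEncoding());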

This prints UTF8 [sic] on my system, which isn't surprising, as I'm using a UTF-8 locale on a Linux machine.
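
On newer JDKs the private-field lookup above may be avoidable: Java 17 added a public `Console.charset()` method. The following is a minimal sketch, not a definitive implementation. The class name `ConsoleCharsetProbe` and its `detect()` method are hypothetical, and the method is looked up by name so that the code still compiles on older JDKs. It assumes that falling back to `Charset.defaultCharset()` is acceptable when no console is attached (for example inside Eclipse) or when the method is missing.

```java
import java.io.Console;
import java.lang.reflect.Method;
import java.nio.charset.Charset;

// Hypothetical helper; the class and method names are illustrative only.
class ConsoleCharsetProbe {

    /**
     * Returns the charset reported by Console.charset() when a console is
     * attached and the method exists (Java 17+). Otherwise falls back to
     * Charset.defaultCharset(), which is an assumption of this sketch.
     */
    static Charset detect() {
        Console console = System.console();
        if (console != null) {
            try {
                // Public method, looked up by name so this compiles before Java 17.
                Method charset = Console.class.getMethod("charset");
                Object result = charset.invoke(console);
                if (result instanceof Charset) {
                    return (Charset) result;
                }
            } catch (ReflectiveOperationException e) {
                // Method missing (older JDK) or not invocable: use the fallback.
            }
        }
        return Charset.defaultCharset();
    }

    public static void main(String[] args) {
        System.out.println("Console attached: " + (System.console() != null));
        System.out.println("Detected charset: " + detect());
    }
}
```

Unlike the `charOut` approach, this touches only public API, so it should not need `setAccessible(true)`.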

Joachim Sauer
  • Actually System.out gives you the IO encoding, but you can use a similar approach to read "cs" from System.console(): http://hg.openjdk.java.net/code-tools/jmh/file/0797222e066e/jmh-core/src/main/java/org/openjdk/jmh/util/Utils.java#l179 which is initialized by the launcher (for Windows it gives you the ANSI/UNICODE CS, I think for Linux it gives you the default charset, same as IO Charset) – eckes Apr 30 '15 at 00:14
  • @Joachim Sauer This works on Windows 7 HU, too where there's no correct way to set up the console (by using chcp). (Eg. "ping" only displays readable text with cp852, other commands like "help" work with both cp852 and cp1250, "wevtutil.exe qe System" only works with cp1250.) – T-Gergely Feb 24 '16 at 12:30
  • Amazingly enough, on the US version of Windows 10, this reports `Cp437`. Some things never change. – cayhorstmann May 07 '16 at 22:53
2

Since JDK 1.1, you can construct an `OutputStreamWriter` around `System.out` and call `getEncoding()`.

OutputStreamWriter osw = new OutputStreamWriter(System.out);
System.out.println(osw.getEncoding());
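
To see what this call actually reports, the sketch below compares the writer's implicit encoding with `Charset.defaultCharset()` and with an explicitly chosen charset. The class name `WriterEncodingCheck` and its helper methods are hypothetical. It assumes that the (possibly historical) name returned by `getEncoding()` can be resolved with `Charset.forName`, which holds for the standard charsets.

```java
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the class and method names are illustrative only.
class WriterEncodingCheck {

    /** Charset an OutputStreamWriter picks when none is passed in. */
    static Charset implicitWriterCharset() {
        // The writer is deliberately not closed: closing it would close System.out.
        OutputStreamWriter osw = new OutputStreamWriter(System.out);
        // getEncoding() may return a historical name; forName resolves it.
        return Charset.forName(osw.getEncoding());
    }

    /** Charset reported when one is passed explicitly. */
    static Charset explicitWriterCharset(Charset cs) {
        OutputStreamWriter osw = new OutputStreamWriter(System.out, cs);
        return Charset.forName(osw.getEncoding());
    }

    public static void main(String[] args) {
        System.out.println("implicit writer charset: " + implicitWriterCharset());
        System.out.println("Charset.defaultCharset(): " + Charset.defaultCharset());
        System.out.println("explicit writer charset: "
                + explicitWriterCharset(StandardCharsets.ISO_8859_1));
    }
}
```

If the first two lines of output match, the writer is reporting the platform default rather than anything specific to the console.
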
ubzack
  • 1
    If you don't explicitly provide a charset, osw.getEncoding() should just give you the platform default one. See for instance: https://github.com/dmlloyd/openjdk/blob/342a565a2da8abd69c4ab85e285bb5f03b48b2c9/src/java.base/share/classes/sun/nio/cs/StreamEncoder.java#L56 – Arjan Tijms Apr 05 '18 at 21:16
  • If I am not mistaken this will effectively just return `Charset.defaultCharset()` which may have been set via property `file.encoding`. – Robert Nov 19 '20 at 09:24
1

In general: you'd have to ask the shell what charset it is currently using to display characters.

This is a guess, not knowledge: there is no standard way in Java because (I assume) there is no standard way for consoles to report the charset they actually use. We'll have to detect the operating system or console provider (Eclipse, ...) and use their specific functionality to get the name of the actual charset.
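
As a rough sketch of what "asking the shell" could look like: on Windows, run `chcp` and take the code page number from its output; elsewhere, read the codeset from the POSIX locale variables `LC_ALL`, `LC_CTYPE` and `LANG`. The class `ShellCharsetHints` and all of its method names are hypothetical. The sketch assumes that the `chcp` output ends with the code page number (the surrounding text is localized) and that the child process sees the same console as the JVM, which is not guaranteed when no console is attached.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the class and method names are illustrative only.
class ShellCharsetHints {

    // Last run of digits in the text, optionally followed by non-digits.
    private static final Pattern TRAILING_DIGITS = Pattern.compile("(\\d+)\\D*$");

    /** Extracts the code page number from chcp output such as "Active code page: 437". */
    static Optional<String> codePageFromChcp(String chcpOutput) {
        Matcher m = TRAILING_DIGITS.matcher(chcpOutput.trim());
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** Extracts the codeset from a POSIX locale such as "en_US.UTF-8" or "de_DE.ISO-8859-15@euro". */
    static Optional<String> codesetFromLocale(String locale) {
        if (locale == null) {
            return Optional.empty();
        }
        int dot = locale.indexOf('.');
        if (dot < 0) {
            return Optional.empty(); // e.g. "C" or "en_US" carries no codeset
        }
        String codeset = locale.substring(dot + 1);
        int at = codeset.indexOf('@');
        if (at >= 0) {
            codeset = codeset.substring(0, at); // drop a modifier such as "@euro"
        }
        return codeset.isEmpty() ? Optional.empty() : Optional.of(codeset);
    }

    /** Checks LC_ALL, then LC_CTYPE, then LANG; the first non-empty variable decides. */
    static Optional<String> codesetFromEnvironment() {
        for (String name : new String[] {"LC_ALL", "LC_CTYPE", "LANG"}) {
            String value = System.getenv(name);
            if (value != null && !value.isEmpty()) {
                return codesetFromLocale(value);
            }
        }
        return Optional.empty();
    }

    /** Runs "cmd.exe /c chcp" and parses its output (Windows only). */
    static Optional<String> codePageFromWindowsShell() throws IOException, InterruptedException {
        Process p = new ProcessBuilder("cmd.exe", "/c", "chcp")
                .redirectErrorStream(true)
                .start();
        String output;
        try (InputStream in = p.getInputStream()) {
            // Only the ASCII digits matter, so a lossy decode of the rest is fine.
            output = new String(in.readAllBytes(), StandardCharsets.US_ASCII);
        }
        p.waitFor();
        return codePageFromChcp(output);
    }

    public static void main(String[] args) throws Exception {
        String os = System.getProperty("os.name", "").toLowerCase(Locale.ROOT);
        Optional<String> hint = os.contains("win")
                ? codePageFromWindowsShell().map(cp -> "code page " + cp)
                : codesetFromEnvironment();
        System.out.println(hint.orElse("no hint available"));
    }
}
```

The result is only a hint: a terminal emulator can display a different charset than the one the locale variables name, and an IDE console may ignore both.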

Andreas Dolk
  • 4
    Actually Java detects the console encoding for System.console(), but there is no exported functionality to retrieve it. JMH uses a hack to get at it with reflection: http://hg.openjdk.java.net/code-tools/jmh/file/0797222e066e/jmh-core/src/main/java/org/openjdk/jmh/util/Utils.java#l179 – eckes Apr 30 '15 at 00:13
0

You probably want to ask your IDE, assuming you're using one. If you're not, then it's whatever your shell is using. If you happen to be using Eclipse, it will be the same as your project's text file encoding, which you can find or change by right-clicking the project and choosing Properties, then Resource -> Text file encoding.