47

What is the purpose of the system property sun.jnu.encoding? Various fragments on the web set or report it, but I can't find a definition.

Martin Geisler

4 Answers

31

See http://happygiraffe.net/blog/2009/09/24/java-platform-encoding/

The linked post describes how the sun.jnu.encoding property selects the encoding used when parsing values passed on the command line, something that setting the file.encoding property does not influence.

Geert-Jan Hut
asdfasdg
  • Archive link: https://web.archive.org/web/20200202094903/http://happygiraffe.net/blog/2009/09/24/java-platform-encoding/ – Maxime Kjaer Sep 01 '20 at 09:51
16

I did some investigation on encoding, and as per my analysis:

  1. sun.jnu.encoding affects how file names are encoded when files are created (on Unix, it possibly gets set from LANG before Java starts)
  2. file.encoding affects the encoding of a file's content
siddagrl
  • 3
    sun.jnu.encoding is referenced in several places in the OpenJDK 8 sources. It does seem related to the encoding of pathnames. From my experiments, setting it with `-D` was ineffective; I had to actually configure the right locale for the whole system and process, as in https://stackoverflow.com/a/38553499/12916. – Jesse Glick Apr 04 '18 at 15:23
8

I tried to centralize all the information provided in the answers here and elsewhere on the Web, in order to make the most complete answer possible.

As other comments have noted, there are actually two properties that affect the chosen encoding on the JVM:

  • sun.jnu.encoding, also known as the "platform encoding" or "JNU encoding", is an undocumented, internal property that holds the name of the encoding to use for interacting with the platform (e.g. file paths and conversions between JNI C strings and Java Strings; possibly also command-line arguments, main class names and environment variables, but I wasn't able to verify this claim).

    On macOS, this is always UTF-8; on Linux, it is always the same as file.encoding (unless file.encoding is overridden, in which case I do not know what happens); on Windows, it can vary.

  • file.encoding, also known as the "default charset" or "user encoding", is mainly used to determine the charset for encoding/decoding file contents. This is the charset that java.nio.charset.Charset.defaultCharset() returns. Note that many JDK APIs use the value of file.encoding as their default encoding, but it can be overridden per call by passing an explicit Charset or charset name to the JDK method.
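To illustrate the last point about passing an explicit Charset, here is a minimal sketch. The class name `ExplicitCharsetDemo` and its helper methods are hypothetical, made up for this example; only `String.getBytes(Charset)` and `StandardCharsets` come from the JDK. The explicit-charset results do not depend on file.encoding, whereas the no-argument `getBytes()` call does.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows that an explicit Charset takes
// precedence over the JVM default charset.
final class ExplicitCharsetDemo {

    // Always encodes as UTF-8, whatever the default charset is.
    static byte[] encodeUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Always encodes as ISO-8859-1, whatever the default charset is.
    static byte[] encodeLatin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // LATIN SMALL LETTER E WITH ACUTE

        // Depends on the default charset, so the length can vary by setup.
        System.out.println("default (" + Charset.defaultCharset() + "): "
                + text.getBytes().length + " byte(s)");

        // Independent of the default charset.
        System.out.println("UTF-8: " + encodeUtf8(text).length + " byte(s)");        // 2
        System.out.println("ISO-8859-1: " + encodeLatin1(text).length + " byte(s)"); // 1
    }
}
```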

These properties are determined dynamically when the JVM starts (though this is not the case for GraalVM Native Image, which sets them at build time as of this writing).
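If you want to see what your own JVM picked at startup, a small sketch like the following prints both properties next to the default charset. The class name `EncodingReport` is made up for this example. Since sun.jnu.encoding is an internal property, it may be absent (printed as `null`) on some JVMs.

```java
import java.nio.charset.Charset;

// Hypothetical helper: reports the encoding-related values of the running JVM.
final class EncodingReport {

    // Builds a three-line report; a missing property shows up as "null".
    static String describe() {
        return "sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding") + "\n"
             + "file.encoding    = " + System.getProperty("file.encoding") + "\n"
             + "defaultCharset() = " + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it under different locale settings (for example, different values of LANG on Unix) is one way to compare the results across configurations.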

Finally, as this draft JEP states:

The value of these system properties can be overridden on the command line although doing so has never been supported.

Maxime Kjaer
  • I wasn't able to find what the "jnu" in `sun.jnu.encoding` is, but if anyone knows, I'm curious about it! – Maxime Kjaer Sep 01 '20 at 12:54
  • 5
    `JNU_` is the prefix of internally used functions or macros in the JVM native code. According to the only (old) document I found, “JNU” stands for “JNI utilities”. – Holger Sep 16 '20 at 12:42
2

I believe that this value represents the system encoding, which may be different from the user encoding ("file.encoding") on some platforms. The "sun" prefix makes me suspect that this is an implementation detail specific to the Sun JRE (a quick look at an IBM 1.4 VM shows an "ibm.system.encoding" system property). I have no idea how this might be used internally, though I'm sure a quick look through the source would yield some clues.

McDowell