I have a question about Charset.forName(String charsetName). Is there a list of charsetNames I can refer to? For example, for UTF-8, we use "utf8" for the charsetName. What about WINDOWS-1252, GB18030, etc.?
- http://docs.oracle.com/javase/6/docs/technotes/guides/intl/encoding.doc.html and the latest http://download.java.net/jdk8/docs/technotes/guides/intl/encoding.doc.html – nullpotent Sep 23 '12 at 23:39
- Also there is a good discussion at http://stackoverflow.com/questions/1684040/java-why-charset-names-are-not-constants – Guido Simone Sep 23 '12 at 23:44
4 Answers
Charset      Description
US-ASCII     Seven-bit ASCII, a.k.a. ISO646-US, a.k.a. the Basic Latin block of the Unicode character set
ISO-8859-1   ISO Latin Alphabet No. 1, a.k.a. ISO-LATIN-1
UTF-8        Eight-bit UCS Transformation Format
UTF-16BE     Sixteen-bit UCS Transformation Format, big-endian byte order
UTF-16LE     Sixteen-bit UCS Transformation Format, little-endian byte order
UTF-16       Sixteen-bit UCS Transformation Format, byte order identified by an optional byte-order mark
Reference: http://docs.oracle.com/javase/7/docs/api/java/nio/charset/Charset.html
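As a rough illustration of the table above, the sketch below looks up each of the six names with Charset.forName and encodes a sample string. The class name RequiredCharsets, the helper encodedLength, and the sample text are made up for this example; they are not part of the quoted documentation.

```java
import java.nio.charset.Charset;

public class RequiredCharsets {
    // The six names from the table above; every Java implementation must support them.
    static final String[] NAMES = {
        "US-ASCII", "ISO-8859-1", "UTF-8", "UTF-16BE", "UTF-16LE", "UTF-16"
    };

    // Hypothetical helper: number of bytes produced when the text is
    // encoded with the charset registered under the given name.
    static int encodedLength(String charsetName, String text) {
        Charset cs = Charset.forName(charsetName);
        return text.getBytes(cs).length;
    }

    public static void main(String[] args) {
        String sample = "h\u00e9llo"; // "héllo": four ASCII letters plus one accented letter
        for (String name : NAMES) {
            System.out.println(name + " -> " + encodedLength(name, sample) + " bytes");
        }
    }
}
```

The byte counts differ per charset because the accented letter takes one byte in ISO-8859-1, two in UTF-8, and every character takes two bytes in the UTF-16 variants.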

The charsets available in Java are platform dependent; the StandardCharsets class defines constants for only six of them.
To see the full set of charset names, look at the IANA character set registry. Check the Preferred MIME Name and Aliases columns.
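To see how names and aliases behave in practice, here is a small sketch that resolves a few of them with Charset.forName and prints the canonical name the JVM reports. The class AliasLookup and the helper canonicalName are hypothetical. Lookups are guarded with Charset.isSupported because anything beyond the six standard charsets may be missing on a given JVM.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

public class AliasLookup {
    // Hypothetical helper: returns the canonical name for a charset name or alias,
    // or null if the name is syntactically illegal or unknown to this JVM.
    static String canonicalName(String nameOrAlias) {
        try {
            if (!Charset.isSupported(nameOrAlias)) {
                return null; // legal name, but no such charset on this JVM
            }
        } catch (IllegalCharsetNameException e) {
            return null; // name contains characters that charset names may not use
        }
        return Charset.forName(nameOrAlias).name();
    }

    public static void main(String[] args) {
        String[] candidates = {
            "utf8", "UTF-8", "windows-1252", "cp1252", "GB18030", "no-such-charset"
        };
        for (String candidate : candidates) {
            System.out.println(candidate + " -> " + canonicalName(candidate));
        }
    }
}
```

Charset names are not case-sensitive, so "utf8" and "UTF8" resolve to the same charset.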

To list all character sets installed in your JVM, you might use the following code snippet (Java SE 8 or higher):
import java.nio.charset.Charset;
import java.util.SortedMap;
SortedMap<String, Charset> map = Charset.availableCharsets();
map.keySet().forEach(System.out::println);
On my system, this lists around 170 character sets.
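Since the keys of availableCharsets() are canonical names only, a variation on the snippet above can also print the aliases that Charset.forName accepts for each charset. This is just a sketch; CharsetAliases and aliasTable are made-up names.

```java
import java.nio.charset.Charset;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class CharsetAliases {
    // Hypothetical helper: maps each canonical charset name on this JVM to its aliases.
    static Map<String, Set<String>> aliasTable() {
        Map<String, Set<String>> table = new TreeMap<>();
        for (Charset cs : Charset.availableCharsets().values()) {
            table.put(cs.name(), cs.aliases());
        }
        return table;
    }

    public static void main(String[] args) {
        // Prints one line per charset: canonical name followed by its alias set.
        aliasTable().forEach((name, aliases) -> System.out.println(name + " " + aliases));
    }
}
```

The output depends on the JVM and platform, so the number of lines and the aliases shown will vary.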

The Java Charset library is required to accept just a few basic encodings: ASCII, Latin-1 (ISO-8859-1), and a handful of UTF variants that you can see listed in this answer. That's a pretty useless list for practical purposes unless your scope is limited to Latin-1. In reality, Java classes can handle a large number of encodings, which you can read about on the Supported Encodings page. Quoting from it:
The java.io.InputStreamReader, java.io.OutputStreamWriter, java.lang.String classes, and classes in the java.nio.charset package can convert between Unicode and a number of other character encodings. The supported encodings vary between different implementations of Java SE 8. The class description for java.nio.charset.Charset lists the encodings that any implementation of Java SE 8 is required to support. JDK 8 for all platforms (Solaris, Linux, and Microsoft Windows) and JRE 8 for Solaris and Linux support all encodings shown on this page. JRE 8 for Microsoft Windows may be installed as a complete international version or as a European languages version. [...]
The rest of the page consists of an extensive table of encoding names and synonyms, which is what the OP was after all those years ago...