I have learned that Windows uses UTF-16LE on x86/x64 systems. What about Linux? Which Unicode encoding does it use: UTF-16LE or UTF-32?
- What makes you think Linux favors any particular encoding? Are you asking whether common Linux distributions assume that configuration files are encoded using a particular encoding or whether syscalls assume that inputs are strings of code-points encoded using a particular encoding? – Mike Samuel Apr 07 '12 at 03:50
- Why do you mention the processor architecture? Are you under the impression that the architecture for which you compile Linux affects the encoding beyond endianness? – Mike Samuel Apr 07 '12 at 03:52
- @Mike Samuel: I am asking which encoding syscalls assume. – Jichao Apr 07 '12 at 03:52
- Somewhat related to [Why is it that UTF-8 encoding is used when interacting with a Unix or Linux Environment](http://stackoverflow.com/questions/164430/why-is-it-that-utf-8-encoding-is-used-when-interacting-with-a-unix-linux-environ/). – Jonathan Leffler Apr 07 '12 at 03:59
- [UTF-32](http://www.unicode.org/faq//utf_bom.html) comes in BE and LE forms too. – Jonathan Leffler Apr 07 '12 at 04:09
2 Answers
http://www.xsquawkbox.net/xpsdk/mediawiki/Unicode says

> Linux
>
> On Linux, UTF8 is the 'native' encoding for all strings, and is the format accepted by system routines like `fopen()`.

so Linux is like Plan 9 in that respect, and "boost::filesystem and Unicode under Linux and Windows" notes

> It looks to me like `boost::filesystem` under Linux does not provide a wide character string in `path::native()`, despite `boost::filesystem::path` having been initialized with a wide string.

which would rule out UTF-16 and UTF-32, since all variants of those require wide character support: their encoded form can contain NUL bytes inside a string, which NUL-terminated byte-string interfaces cannot carry.
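
To make the NUL-byte point concrete, here is a small Python sketch (not from any of the linked pages). It encodes one sample file name three ways and checks for zero bytes. The helper `contains_nul` and the sample name `naïve.txt` are made up for illustration.

```python
def contains_nul(text: str, encoding: str) -> bool:
    """Return True if encoding `text` yields at least one zero byte."""
    return b"\x00" in text.encode(encoding)


# Hypothetical file name, chosen only because it mixes ASCII and non-ASCII.
sample = "naïve.txt"

for enc in ("utf-8", "utf-16-le", "utf-32-le"):
    encoded = sample.encode(enc)
    # UTF-8 only emits a zero byte for U+0000 itself, so ordinary text is
    # safe to hand to APIs that stop at the first NUL. UTF-16 and UTF-32
    # pad every ASCII character with zero bytes.
    print(f"{enc:10} {len(encoded):3} bytes, NUL inside: {contains_nul(sample, enc)}")
```

Only the UTF-8 form is free of zero bytes, so it is the only one of the three that can pass unchanged through a `char *` interface such as `fopen()`.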

Mike Samuel
- @melab, [lwn](https://lwn.net/Articles/71472/) tends to agree that the kernel is charset-agnostic and treats paths as NUL-terminated byte arrays. One ignores userspace conventions at one's peril, though. – Mike Samuel Sep 19 '17 at 21:25
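
The "paths are byte arrays" behaviour can be observed from userspace. The sketch below is in Python and assumes a typical Linux filesystem; the helpers `roundtrip_name` and `rejects_embedded_nul` are hypothetical names for this example. Note that the `ValueError` comes from Python's own check before the syscall is made, not from the kernel. Python refuses such paths because the C-level string it would pass is NUL-terminated.

```python
import os
import tempfile


def roundtrip_name(raw_name: bytes) -> bool:
    """Create a file named `raw_name` in a fresh temporary directory and
    report whether a bytes-mode directory listing returns the same bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        directory = os.fsencode(tmp)           # work with bytes paths throughout
        path = os.path.join(directory, raw_name)
        with open(path, "wb"):
            pass                               # an empty file is enough
        return raw_name in os.listdir(directory)


def rejects_embedded_nul(raw_path: bytes) -> bool:
    """Return True if Python refuses the path because it contains a NUL byte."""
    try:
        os.stat(raw_path)
    except ValueError:                         # raised before any syscall is made
        return True
    except OSError:                            # e.g. the file simply does not exist
        return False
    return False


utf8_name = "héllo.txt".encode("utf-8")        # no zero bytes
utf16_name = "a".encode("utf-16-le")           # b"a\x00"

print(roundtrip_name(utf8_name))
print(rejects_embedded_nul(utf16_name))
```

The file name is stored and returned as the bytes that were passed in. Treating those bytes as UTF-8 is the userspace convention the comment warns about, not something this sketch shows the kernel enforcing.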