
The objective is to print "Uni Würzburg" using C++.

The code I am using:

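#include <stdio.h>

int main() {
    // Literal "ü": the stored bytes depend on the source and execution encodings
    char str0[21] = "Uni Würzburg";
    printf("%s\n", str0);
    // Explicit UTF-8 bytes for "ü": 0xC3 0xBC
    char str1[21] = {85,110,105,32,87,'\xc3','\xbc',114,122,98,117,114,103, 0};
    printf("%s\n", str1);
    // Single byte 0x81, which is "ü" in the OEM code pages 437 and 850
    char str2[20] = "Uni W\x81rzburg";
    printf("%s\n", str2);
    // Same single byte 0x81, spelled out as an array
    char str3[20] = {85,110,105,32,87,'\x81',114,122,98,117,114,103, 0};
    printf("%s\n", str3);
    return 0;
}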

I got the bytes \xc3\xbc by creating a "ü" string and printing its individual characters.

Output on two different Macs (using both CLion and in bash using g++ test.c -o test):

Uni Würzburg
Uni Würzburg
Uni W�rzburg
Uni W�rzburg

Output on Windows (CLion):

Uni W├╝rzburg
Uni W├╝rzburg
Uni Würzburg
Uni Würzburg

CLion editor and project encodings are in all cases set to UTF-8 and the locale of bash is:

LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=

Why is this happening? Most importantly: What would be a platform independent solution?

Hans
  • Here is the answer you are looking for: https://stackoverflow.com/a/402918/5470596 – YSC Nov 28 '17 at 11:57

1 Answer


There are Unicode string literals (available since C++11) that can be used to ensure that your string is encoded as UTF-8:

u8"my_string"

On Linux, your ordinary string literals will normally already be UTF-8.

On Windows, it depends on your code page. With MSVC you can also supply an additional compiler flag: /source-charset:utf-8
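One way to notice such a mismatch at build time rather than on screen is a compile-time check that an ordinary literal "ü" ends up as the UTF-8 bytes 0xC3 0xBC in the program. This is only a sketch: the names kProbe and kLiteralsAreUtf8 are made up for illustration, and it assumes C++11 or later.

```cpp
// Ordinary (non-u8) literal: its bytes depend on how the compiler decodes
// the source file and which execution character set it targets.
constexpr char kProbe[] = "ü";

// UTF-8 encodes U+00FC as 0xC3 0xBC, so the array holds 2 bytes plus '\0'.
constexpr bool kLiteralsAreUtf8 =
    sizeof(kProbe) == 3 &&
    static_cast<unsigned char>(kProbe[0]) == 0xC3 &&
    static_cast<unsigned char>(kProbe[1]) == 0xBC;

// Stops the build if the literal was stored in some other encoding.
static_assert(kLiteralsAreUtf8,
              "ordinary string literals are not UTF-8; check the compiler's charset flags");
```

The check only looks at the bytes stored in the binary. It says nothing about how a console will display them.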

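Note that even if your strings are encoded as UTF-8, cout (or printf) on Windows will still show garbled output when the console uses a non-Unicode code page. The ├╝ in the question is consistent with this: the UTF-8 bytes 0xC3 0xBC are displayed as two characters of an OEM code page such as 437 or 850, in which ü is the single byte 0x81.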
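A possible way around the console problem is to keep the string as UTF-8 everywhere and ask the Windows console to switch its output code page before printing. The sketch below uses the Win32 call SetConsoleOutputCP behind an #ifdef, so the same file still builds on Linux and macOS. The helper names enable_utf8_console and print_city are made up for illustration.

```cpp
#include <cstdio>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: asks the Windows console to interpret output as UTF-8.
// Returns true on success; on other platforms there is nothing to do.
inline bool enable_utf8_console() {
#ifdef _WIN32
    // CP_UTF8 is code page 65001; SetConsoleOutputCP returns nonzero on success.
    return SetConsoleOutputCP(CP_UTF8) != 0;
#else
    return true;
#endif
}

// Prints "Uni Würzburg" from explicit UTF-8 bytes (0xC3 0xBC is 'ü'),
// so the result does not depend on the source file's encoding.
inline int print_city() {
    enable_utf8_console();  // result ignored here; a real program may want to check it
    return std::printf("%s\n", "Uni W\xc3\xbcrzburg");
}
```

On Windows the call can fail, for example when no console is attached, and the console font still has to contain the glyph, so treat this as a starting point and not a guarantee.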

Douman Ashiya