I want to print blå using UTF-8, but I do not know how to do it. In UTF-8, b is 62, l is 6c, and å is c3 a5. I am not sure what to do with the å character, since it takes two bytes. Here is my code:
#include <stdio.h>
int main(void) {
    char myChar1 = 0x62; //b
    char myChar2 = 0x6C; //l
    char myChar3 = ?? //å
    printf("%c", myChar1);
    printf("%c", myChar2);
    printf("%c", myChar3);
    return 0;
}
I also tried this:
#include <stdio.h>
#define SIZE 100
int main(void) {
    char myWord[SIZE] = "\x62\x6c\xc3\xa5\x00";
    printf("%s", myWord);
    return 0;
}
However, the output was:
blå
Finally, I tried this:
#include <stdio.h>
#include <locale.h>
#define SIZE 100
int main(void) {
    setlocale(LC_ALL, ".UTF8");
    char myWord[SIZE] = "\x62\x6c\xc3\xa5\x00";
    printf("%s", myWord);
    return 0;
}
Same output as before.
I am not sure I fully understand Unicode. If I understand it correctly, UTF-16 and UTF-32 use wide code units: 2 bytes for UTF-16 (so a character takes 2 or 4 bytes) and 4 bytes for UTF-32. UTF-8, on the other hand, uses single-byte units, and the number of bytes per character varies (1-4 bytes). I know the first 128 characters require 1 byte, the remaining Latin-1 characters (such as å) require 2 bytes, and so on. Since UTF-8 does not require wide characters, I should not need the wchar functions in my code. Therefore, I do not see why my second and third programs do not work. The only solution I can think of is to use setmode to change the encoding of stdin and stdout, although I am not sure that would work, and I am not sure how to implement it.
Summary:
Why doesn't my code work?
I am on Windows, using VS Code with MinGW32 as the compiler.