I'm passing GCC a UTF-32 string and it's complaining about an invalid multibyte or wide character.
I tested this in Clang, and I got the same error message.
I originally wrote the statement with MSVC, where it worked fine.
Here's the assert statement.
assert(utf_string_copy_utf32(&string, U"¿Cómo estás?") == 0);
Here's the declaration.
int utf_string_copy_utf32(struct utf_string * a, const char32_t * b);
Here's the compile command:
cc -Wall -Wextra -Werror -Wfatal-errors -g -I ../include -fexec-charset=UTF-32 string-test.c libutf.a -o string-test
Am I to assume that GCC can only recognize Unicode characters by the escape sequences?
Or am I misunderstanding how GCC and Clang recognize these characters?
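For reference, this is the escaped form I have in mind. It is only a minimal sketch: the string is spelled with universal character names, so the source bytes are plain ASCII and the result should not depend on how the file is saved (assuming an ASCII-compatible encoding). The names escaped_greeting and utf32_length are made up for illustration and are not part of the library I'm using.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>
#include <uchar.h>

/* Hypothetical name, not part of the library in question.
   Universal character names keep the source bytes pure ASCII:
   \u00BF is the inverted question mark, \u00F3 is o-acute,
   \u00E1 is a-acute. */
const char32_t escaped_greeting[] = U"\u00BFC\u00F3mo est\u00E1s?";

/* 12 code points plus the terminating zero. */
static_assert(sizeof(escaped_greeting) / sizeof(escaped_greeting[0]) == 13,
              "expected 12 UTF-32 code units plus a terminator");

/* Hypothetical helper: count char32_t code units up to the terminating zero. */
size_t utf32_length(const char32_t *s) {
    size_t len = 0;
    while (s[len] != 0) {
        len++;
    }
    return len;
}
```

With this spelling, utf32_length(escaped_greeting) counts one code unit per character, and the compiler never has to decode any non-ASCII bytes from the source file.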
Edit 1
Here's the error message.
string-test.c: In function ‘test_copy’:
string-test.c:46:61: error: converting to execution character set: Invalid or incomplete multibyte or wide character
assert(utf_string_copy_utf32(&string, U"�C�mo est�s?") == 0);
Edit 2
I'm even more confused now that I've tried to recreate the bug in a smaller example.
#include <uchar.h>
#include <stdlib.h>
#include <stdio.h>

/* Count bytes (UTF-8 code units) up to the terminating zero. */
static size_t test_utf8(const char * in){
    size_t len;
    for (len = 0; in[len]; len++);
    return len;
}

/* Count UTF-32 code units up to the terminating zero. */
static size_t test_utf32(const char32_t * in){
    size_t len;
    for (len = 0; in[len]; len++);
    return len;
}

int main(void){
    size_t len;
    len = test_utf8(u8"¿Cómo estás?");
    printf("utf-8 length: %zu\n", len);   /* %zu is the size_t format */
    len = test_utf32(U"¿Cómo estás?");
    printf("utf-32 length: %zu\n", len);
    return 0;
}
This prints:
utf-8 length: 15
utf-32 length: 12
This reaffirms the way I originally thought it worked.
So I guess that means there's a problem somewhere in the library code that I'm using. But I still have no idea what's going on.