I am developing a platform-independent library in ANSI C (no C++, and no non-standard libraries such as the MS CRT or glibc, ...).
After some searching, I found that one of the best ways to handle internationalization in ANSI C is to use the UTF-8 encoding.
In UTF-8:
- strlen(s): always counts the number of bytes.
- mbstowcs(NULL,s,0): counts the number of characters (provided the current locale's multibyte encoding is UTF-8).
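To make the byte/character distinction concrete, here is a minimal sketch that counts characters without depending on the locale. utf8_count is a hypothetical helper name (not a standard function), and it assumes the input is valid, NUL-terminated UTF-8.

```c
#include <stddef.h>

/* Hypothetical helper: counts the characters (code points) in a
   NUL-terminated UTF-8 string by counting every byte that is NOT a
   continuation byte (continuation bytes look like 10xxxxxx).
   Assumes the input is valid UTF-8. */
size_t utf8_count(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}
```

For the string "h\xC3\xA9llo" ("héllo", where 'é' takes two bytes), strlen gives 6 while utf8_count gives 5.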
But I run into a problem when I want random access to the elements (characters) of a UTF-8 string.
In ASCII encoding:
char get_char(char* ascii_str, int n)
{
    /* It is very FAST. */
    return ascii_str[n];
}
In UTF-16/32 encoding (ignoring UTF-16 surrogate pairs):
wchar_t get_char(wchar_t* wstr, int n)
{
    /* It is very FAST. */
    return wstr[n];
}
And here is my problem with UTF-8 encoding:
/* What should the return type be?
   A UTF-8 character can occupy 8, 16, 24, or 32 bits. */
/*?*/ get_char(char* utf8str, int n)
{
    /* I can find the Nth character by scanning the string with a for loop,
       but that is too slow.
       What is the best way? */
}
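For reference, the linear scan I have in mind looks roughly like the sketch below. utf8_char_at is a hypothetical name, the input is assumed to be valid UTF-8, and returning the decoded code point as an unsigned long (at least 32 bits in ANSI C) is only one possible choice of return type, not necessarily the right one. Each lookup costs O(n), which is the slowness I would like to avoid.

```c
#include <stddef.h>

/* Hypothetical helper: returns the code point of the n-th character
   (0-based) of a NUL-terminated UTF-8 string, or 0xFFFFFFFF if the
   string has fewer than n + 1 characters.
   Assumes the input is valid UTF-8; no validation is performed. */
unsigned long utf8_char_at(const char *utf8str, size_t n)
{
    const unsigned char *p = (const unsigned char *)utf8str;
    unsigned long cp;
    int extra;

    /* Skip n characters: step past each lead byte, then past any
       continuation bytes (10xxxxxx) that belong to it. */
    while (n > 0 && *p != 0) {
        ++p;
        while ((*p & 0xC0) == 0x80)
            ++p;
        --n;
    }
    if (*p == 0)
        return 0xFFFFFFFFUL; /* index out of range */

    /* The lead byte tells how many continuation bytes follow. */
    if (*p < 0x80)      { cp = *p;        extra = 0; }
    else if (*p < 0xE0) { cp = *p & 0x1F; extra = 1; }
    else if (*p < 0xF0) { cp = *p & 0x0F; extra = 2; }
    else                { cp = *p & 0x07; extra = 3; }

    /* Each continuation byte contributes its low 6 bits. */
    while (extra-- > 0) {
        ++p;
        cp = (cp << 6) | (unsigned long)(*p & 0x3F);
    }
    return cp;
}
```

With the string "h\xC3\xA9llo", index 1 yields 0xE9 (U+00E9, 'é') and index 2 yields 'l', even though 'l' starts at byte offset 3.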
Thanks.