Can someone please explain why the first function works but the second doesn't?
unsigned int utf8_count(char* in)
{
    unsigned int i = 0, c = 0;
    while (in[i])
    {
        if ((in[i] & 0xc0) != 0x80) /* not a UTF-8 continuation byte */
            c++;
        i++;
    }
    return c;
}
unsigned int utf8_count(char* in, unsigned int in_size)
{
    unsigned int i = 0, c = 0;
    while (i < in_size)
    {
        if ((in[i] & 0xc0) != 0x80)
            c++;
        i++;
    }
    return c;
}
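For reference, here is a minimal sketch of how the two can be compared side by side; the functions are renamed utf8_count_z and utf8_count_n (hypothetical names, since C doesn't allow two functions named utf8_count in one translation unit), and it assumes the source file is saved as UTF-8 so the string literal actually contains UTF-8 bytes:

#include <stdio.h>
#include <string.h>

/* same as the first function above, renamed so both can coexist */
unsigned int utf8_count_z(const char* in)
{
    unsigned int i = 0, c = 0;
    while (in[i])
    {
        if ((in[i] & 0xc0) != 0x80)
            c++;
        i++;
    }
    return c;
}

/* same as the second function above, renamed */
unsigned int utf8_count_n(const char* in, unsigned int in_size)
{
    unsigned int i = 0, c = 0;
    while (i < in_size)
    {
        if ((in[i] & 0xc0) != 0x80)
            c++;
        i++;
    }
    return c;
}

int main(void)
{
    /* assumes this source file is saved as UTF-8 */
    const char* s = "ゴールデンタイムラバー/スキマスイッチ";
    size_t n = strlen(s); /* actual byte length of the literal */
    printf("bytes:           %zu\n", n);
    printf("zero-terminated: %u\n", utf8_count_z(s));
    printf("explicit size:   %u\n", utf8_count_n(s, (unsigned int)n));
    return 0;
}

When in_size equals strlen(in), both loops visit exactly the same bytes, so they can only disagree if the size passed in doesn't match the real byte length of the data.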
I understand what (in[i] & 0xc0) != 0x80 does, but I don't understand why looping on i < in_size gives a different result than looping on in[i].
Example string: ゴールデンタイムラバー/スキマスイッチ
57 bytes, 19 characters.
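To make the counting concrete: ゴ is U+30B4, which UTF-8 encodes as the three bytes E3 82 B4 (11100011 10000010 10110100); the last two match the 10xxxxxx continuation pattern that the (in[i] & 0xc0) == 0x80 test detects, so the character contributes 1 to the count, not 3. Below is a minimal sketch that dumps each byte of the string and marks which ones the function would count (again assuming the file is saved as UTF-8):

#include <stdio.h>

int main(void)
{
    /* assumes this source file is saved as UTF-8 */
    const unsigned char* p =
        (const unsigned char*)"ゴールデンタイムラバー/スキマスイッチ";
    for (; *p; p++)
        printf("%02X %s\n", *p,
               (*p & 0xc0) == 0x80 ? "continuation (skipped)"
                                   : "lead/ASCII (counted)");
    return 0;
}

If no byte in the dump ever matches the continuation pattern, every byte is counted and the result equals the byte length; a return value of 57 from a 57-byte input suggests exactly that.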
Why does utf8_count(in, 57) return 57 and not 19?
The binary representation of the example string: