I'm trying to use ICU's StringCharacterIterator to copy (and possibly alter) characters from a source string to a destination string. However, I am getting unexpected results and am unsure why.
I would expect the final line of output of this program to be dog, but instead I get og￿ (the first character is missing and an extra character, code point 65535, appears at the end).
#include <iostream>
#include <string>
#include <icu4c/unicode/schriter.h>

int main()
{
    UnicodeString dog = UnicodeString::fromUTF8("dog");
    StringCharacterIterator chars(dog);
    UnicodeString copy;

    while (chars.hasNext())
        copy.append(chars.next32());

    for (int i = 0; i < copy.countChar32(); i++)
    {
        int32_t charNumber = copy.char32At(i);
        std::cout << charNumber << "\n";
    }

    std::string stdString;
    copy.toUTF8String(stdString);
    std::cout << stdString;
}
Program Output
111
103
65535
og￿
Unicode table
111 - latin small letter o
103 - latin small letter g
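
One possible explanation, sketched without ICU so it can be compiled on its own: as far as I can tell from the ICU documentation, next32() advances first and then returns the character at the new position (returning the DONE sentinel, 65535, once it runs off the end), whereas next32PostInc() returns the current character and then advances. ToyIterator below is a hypothetical stand-in written for illustration, not ICU's class, and the helper names copyWithNext32 and copyWithPostInc are made up as well. It only mimics that documented contract to show how the loop above would drop the first character and append 65535.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Same numeric value as the sentinel seen in the program output (65535).
constexpr std::int32_t kDone = 0xFFFF;

// Hypothetical stand-in for an ICU-style iterator. This is NOT ICU code;
// it only imitates the pre-increment / post-increment behavior described above.
class ToyIterator {
public:
    explicit ToyIterator(std::u32string text) : text_(std::move(text)) {}

    // True while the current position is still inside the text.
    bool hasNext() const { return pos_ < text_.size(); }

    // Pre-increment: advance first, then return the character at the new position.
    std::int32_t next32() {
        if (pos_ < text_.size()) ++pos_;
        return pos_ < text_.size() ? static_cast<std::int32_t>(text_[pos_]) : kDone;
    }

    // Post-increment: return the current character, then advance.
    std::int32_t next32PostInc() {
        return pos_ < text_.size() ? static_cast<std::int32_t>(text_[pos_++]) : kDone;
    }

private:
    std::u32string text_;
    std::size_t pos_ = 0;
};

// Mirrors the loop in the question: hasNext() combined with next32().
std::vector<std::int32_t> copyWithNext32(const std::u32string& s) {
    ToyIterator it(s);
    std::vector<std::int32_t> out;
    while (it.hasNext()) out.push_back(it.next32());
    return out;
}

// Same loop, but using the post-increment accessor.
std::vector<std::int32_t> copyWithPostInc(const std::u32string& s) {
    ToyIterator it(s);
    std::vector<std::int32_t> out;
    while (it.hasNext()) out.push_back(it.next32PostInc());
    return out;
}
```

With the input U"dog", copyWithNext32 yields 111, 103, 65535 (matching the output above), while copyWithPostInc yields 100, 111, 103. If the real iterator behaves the same way, replacing chars.next32() with chars.next32PostInc() in the original loop may be all that is needed, though I have only checked this against the toy class.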