I want to determine whether the last character in a buffer, defined as the bytes between begin and end, is English or Japanese. I read about UTF-8, where (as I understand it) Japanese characters are two bytes long and always have a 1 in the high bit of the high byte, whereas the low byte can have either 1 or 0 in its high bit.
I am trying to return the integer 2 for a Japanese character (2 bytes), 1 for an English character, and 0 if the data in the buffer is malformed.
The signature is: public static int NumChars(byte begin, byte end). Can you point me in the right direction? I am confused about how to approach this. I was thinking of using XOR to check whether the MSB of the high byte is 1 and, if so, return 2, but I am not sure I have understood this correctly.
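To make the question concrete, here is a rough sketch in Java of one possible approach, taking the rule exactly as stated above: a byte with the high bit clear at a character boundary is a 1-byte character, and a byte with the high bit set starts a 2-byte character whose second byte can be anything. Note that this rule looks closer to a double-byte encoding such as Shift-JIS than to real UTF-8, where kana and kanji take three bytes. The names LastCharWidth, highBitSet and lastCharWidth are hypothetical, and so are the byte[] buffer with an exclusive end index and the assumption that begin sits on a character boundary; none of that comes from the original signature. The high bit is tested with an AND mask rather than XOR. The idea is to count the bytes with the high bit set that sit directly before the last byte: an odd count means the last byte is the second half of a 2-byte character.

```java
// Hypothetical helper, not the original NumChars; a sketch under the
// assumptions stated in the paragraph above.
final class LastCharWidth {
    private LastCharWidth() {}

    // Java bytes are signed, so mask with 0x80 to test the top bit.
    static boolean highBitSet(byte b) {
        return (b & 0x80) != 0;
    }

    // Returns 2 if the last character in buf[begin, end) is a 2-byte
    // character, 1 if it is a 1-byte character, and 0 if the range is
    // empty, out of bounds, or ends in a lead byte with no second byte.
    // Assumes (not stated in the question) that begin is a character boundary.
    static int lastCharWidth(byte[] buf, int begin, int end) {
        if (buf == null || begin < 0 || end > buf.length || begin >= end) {
            return 0;
        }
        int last = end - 1;

        // Count consecutive high-bit bytes immediately before the last byte.
        // The byte before this run (or the start of the range) ends a
        // character, so the run itself must pair up as lead + second byte.
        int run = 0;
        for (int i = last - 1; i >= begin && highBitSet(buf[i]); i--) {
            run++;
        }

        if (run % 2 == 1) {
            // The byte just before the last one is an unpaired lead byte,
            // so the last byte is its second half.
            return 2;
        }
        // The last byte starts a new character: it is either a complete
        // 1-byte character or a lead byte that is missing its second byte.
        return highBitSet(buf[last]) ? 0 : 1;
    }
}
```

For example, under these assumptions the two bytes 0x82 0xA0 would give 2, the single byte 0x41 would give 1, and 0x41 followed by 0x82 would give 0 because the buffer ends in the middle of a 2-byte character. Scanning backward only touches the tail of the buffer; a forward scan from begin would also work but reads every byte.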