I need to split a Unicode string into its individual characters as a reader would see them.
For example, in Tamil:
"கமலி" => 'க', 'ம', 'லி'
I am able to get at the UTF-16 values of the string, but producing the characters as they are read has become a problem.
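    // Requires: using System.Diagnostics; using System.Text;
    byte[] stringBytes = Encoding.Unicode.GetBytes("கமலி");
    char[] stringChars = Encoding.Unicode.GetChars(stringBytes);
    // Each char is one UTF-16 code unit, so the vowel sign is printed on its own.
    foreach (var crt in stringChars)
    {
        Trace.WriteLine(crt);
    }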
This gives the following result:
'க'=>0x0b95
'ம'=>0x0bae
'ல'=>0x0bb2
'ி'=>0x0bbf
So the problem is how to extract 'லி' as it is, as a single 'லி', without splitting it into 'ல' and 'ி'.
In Indian languages it is natural to treat a consonant together with its vowel sign as a single character, but parsing the string char by char in C# makes this difficult.
All I need is for the string to be split into 3 characters.
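
One possible direction, sketched here on the assumption that "character" means a Unicode text element (a base character plus any combining marks attached to it): `System.Globalization.StringInfo.GetTextElementEnumerator` walks a string one text element at a time instead of one `char` at a time. The `TextElementSplitter` class and its `Split` method are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper for illustration; not part of the original code.
public static class TextElementSplitter
{
    // Splits a string into text elements, keeping combining marks
    // (such as the Tamil vowel sign U+0BBF) attached to their base character.
    public static List<string> Split(string text)
    {
        var elements = new List<string>();
        TextElementEnumerator enumerator = StringInfo.GetTextElementEnumerator(text);
        while (enumerator.MoveNext())
        {
            elements.Add(enumerator.GetTextElement());
        }
        return elements;
    }
}
```

With this sketch, `TextElementSplitter.Split("கமலி")` should return three strings, with 'லி' kept together as the last one, because the vowel sign is a combining mark and stays with the preceding 'ல'.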