8

I am trying to assign Unicode text to a string, but the string "Привет" comes back as "ÐŸÑ€Ð¸Ð²ÐµÑ‚". I need "Привет". I am converting with the following function:

public string Convert(string str)
{
    byte[] utf8Bytes = Encoding.UTF8.GetBytes(str);
    str = Encoding.UTF8.GetString(utf8Bytes);
    return str;
}

What can I do to solve this problem so that it returns "Привет"?

manoj
  • 1
    See http://stackoverflow.com/questions/4184190/c-unicode-to-string-conversion – Mihai8 Jan 28 '13 at 10:45
  • 2
    What are you doing with the result of the method? I'm quite sure the problem is there, or from the input. Your code is actually returning exactly what you pass as input – Steve B Jan 28 '13 at 10:46
  • 1
    Isn't `Encoding.UTF8.GetBytes` followed by `GetString` essentially a noop? – Rawling Jan 28 '13 at 10:48
  • UTF8 is different from unicode? Your resulting string will not be unicode as you are converting to utf8. What is your input string and expected output string? – Peter H Jan 28 '13 at 10:51
  • This code seems to work for me as it is. Where are you reading the resultant value ? – Ravi Y Jan 28 '13 at 10:51
  • @PeterH: Yes the result WILL be unicode. He's converting from UTF16 to UTF8 and back to UTF16. Essentially, the function does nothing but return str. – Matthew Watson Jan 28 '13 at 10:57

2 Answers

9

П is Unicode character 0x041F, and its UTF-8 encoding is 0xD0 0x9F, which shows up as "ÐŸ" when those two bytes are displayed as single-byte characters.

Since the function only returns its input parameter, as the commenters already discussed, I conclude that your original input string actually contains UTF-8 data, and you want to convert it into a native .NET string.
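To make that reasoning concrete, here is a minimal sketch. The class and method names (`RoundTripDemo`, `RoundTrip`, `Misread`) are made up for illustration, and ISO-8859-1 is only an assumed stand-in for whatever single-byte encoding was applied to the original input. It shows that a UTF-8 encode followed by a UTF-8 decode hands back the same string, while decoding the same bytes with a single-byte encoding yields two characters per Cyrillic letter.

```csharp
using System;
using System.Text;

public static class RoundTripDemo
{
    // Same logic as the question's Convert: encode to UTF-8, then decode as UTF-8.
    public static string RoundTrip(string s)
    {
        return Encoding.UTF8.GetString(Encoding.UTF8.GetBytes(s));
    }

    // The suspected mistake: UTF-8 bytes decoded with a single-byte encoding.
    // ISO-8859-1 is an assumption; the real input may have used another code page.
    public static string Misread(string s)
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        return latin1.GetString(Encoding.UTF8.GetBytes(s));
    }

    public static void Main()
    {
        string text = "\u041F\u0440\u0438\u0432\u0435\u0442"; // "Привет"

        // UTF-8 bytes of the first letter, U+041F
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes("\u041F"))); // D0-9F

        // The round trip changes nothing
        Console.WriteLine(RoundTrip(text) == text); // True

        // Six letters become twelve characters when each byte is read on its own
        Console.WriteLine(Misread(text).Length); // 12
    }
}
```

So the fix belongs at the point where the bytes first become a string, not in a later conversion step.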

Where does the original string come from?

Instead of reading the input into a C# string, change your code to read a byte[], and then call Encoding.UTF8.GetString(inputUtf8ByteArray).
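A minimal sketch of that suggestion, assuming the input comes from a file that holds UTF-8 bytes. The class name `ReadUtf8Demo`, the helper `ReadUtf8`, and the temporary file are illustrative only; the sample writes its own test file so that it can run standalone.

```csharp
using System;
using System.IO;
using System.Text;

public static class ReadUtf8Demo
{
    // Read the raw bytes first, then decode them explicitly as UTF-8.
    public static string ReadUtf8(string path)
    {
        byte[] inputUtf8ByteArray = File.ReadAllBytes(path);
        return Encoding.UTF8.GetString(inputUtf8ByteArray);
    }

    public static void Main()
    {
        string expected = "\u041F\u0440\u0438\u0432\u0435\u0442"; // "Привет"
        string path = Path.GetTempFileName();
        try
        {
            // Create a UTF-8 file (no byte order mark) to read back
            File.WriteAllBytes(path, Encoding.UTF8.GetBytes(expected));
            Console.WriteLine(ReadUtf8(path) == expected); // True
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

If the file might start with a byte order mark, `File.ReadAllText(path, Encoding.UTF8)` is an alternative worth considering, since it decodes while reading.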

devio
5

I tried the following code, and these were my results:

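    string test = "Привет";
    byte[] utf8Bytes = Encoding.UTF8.GetBytes(test);

    // Decodes the UTF-8 bytes as UTF-16 (wrong encoding)
    string str1 = Encoding.Unicode.GetString(utf8Bytes);
    // Decodes the UTF-8 bytes as UTF-8 (matching encoding)
    string str2 = Encoding.UTF8.GetString(utf8Bytes);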

Output of str1=鿐胑룐닐뗐苑

Output of str2=Привет
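The str1 output can be explained byte by byte: `Encoding.Unicode` is UTF-16 little-endian, so it pairs the UTF-8 bytes two at a time with the low byte first. The sketch below (the names `Utf16MisreadDemo` and `PairLittleEndian` are hypothetical) checks this for the first letter, whose UTF-8 bytes 0xD0 0x9F pair up into the code unit U+9FD0.

```csharp
using System;
using System.Text;

public static class Utf16MisreadDemo
{
    // Combine two bytes the way UTF-16 little-endian does: low byte first.
    public static char PairLittleEndian(byte low, byte high)
    {
        return (char)(low | (high << 8));
    }

    public static void Main()
    {
        byte[] utf8 = Encoding.UTF8.GetBytes("\u041F"); // bytes D0 9F
        char paired = PairLittleEndian(utf8[0], utf8[1]);

        // The same two bytes decoded as UTF-16
        string viaUnicode = Encoding.Unicode.GetString(utf8);

        Console.WriteLine(((int)paired).ToString("X4")); // 9FD0
        Console.WriteLine(viaUnicode[0] == paired);      // True
    }
}
```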

Peter H
  • Yes, it works. Actually, I found my own mistake: I was reading the string from a file, so I changed it to File.ReadAllBytes(Filename), and it worked! – manoj Jan 28 '13 at 11:25