C# Byte[] Byte array to Unicode string

Question

I need a very fast conversion from a byte array to a string. The byte array contains a Unicode string.

"fastest" would be in-place, but that is impossible under .net (or so I think)... So second-best is "with no conversion, a simple copy array to array (where the second array is the "internal" array-of-chars of the string)". — xanatos, Feb 20 '11 at 10:20

xanatos · Accepted Answer · 2011-10-26T15:46:39.920

81

From byte[] array to string

 var mystring = Encoding.Unicode.GetString(myarray);

From string to byte[]

 var myarray2 = Encoding.Unicode.GetBytes(mystring);
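As a minimal sketch of the two calls above, the round trip can be wrapped in a pair of helpers. The class name `Utf16RoundTrip` and its method names are made up for illustration; only `Encoding.Unicode.GetString` and `Encoding.Unicode.GetBytes` come from the answer. It assumes the bytes are UTF-16 little-endian, which is what `Encoding.Unicode` means in .NET.

```csharp
using System;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class Utf16RoundTrip
{
    // Decodes UTF-16 little-endian bytes into a string.
    // A leading byte order mark (FF FE), if present, is not stripped.
    public static string FromBytes(byte[] bytes) => Encoding.Unicode.GetString(bytes);

    // Encodes a string as UTF-16 little-endian bytes, without a byte order mark.
    public static byte[] ToBytes(string text) => Encoding.Unicode.GetBytes(text);
}
```

For example, `Utf16RoundTrip.FromBytes(new byte[] { 0x41, 0x00, 0x6E, 0x00 })` should give back the two-character string `An`, and `ToBytes` on that string should return the same four bytes.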

edited Oct 26 '11 at 15:46

answered Feb 20 '11 at 10:08

xanatos


@xanatos why do you use `var` type? – Alex Apr 09 '13 at 11:30
@Alex This is C#, not Javascript. `var` isn't a type. It's an abbreviation for "the type of the right-hand expression" (so `string`) (see for example http://stackoverflow.com/questions/41479/use-of-var-keyword-in-c-sharp ) – xanatos Apr 09 '13 at 13:21
@xanatos, @alex technically `var` in JavaScript is not a type either. `var` in JavaScript is a keyword meaning _I want to declare a variable here_, whose type is determined later. – sabotero Nov 22 '13 at 10:53
I had to use `Encoding.ASCII.GetString` instead of `Encoding.Unicode.GetString` in my case to get it to work. – SNag Jan 07 '16 at 17:43
@SNag, avoid ASCII; it is only a last resort, for when you know exactly what you are doing and why. 1) By default, the first encoding to try should be UTF8. 2) If for some reason you are dealing with old non-Unicode data, you can try Encoding.Default. 3) Only as a fallback, if you are explicitly going to stick to byte values in the 0-127 range and Latin characters only, you can try ASCII (7 bits that fit into 8 bits). In the author's case it is clear that the data is UTF16 encoded, and this is why Encoding.Unicode is correct here. This is mostly used by in-memory blobs. – Dmitry Gusarov May 16 '18 at 10:19
If I use this, it returns diamonds in the result, whereas `Convert.ToBase64String` works. Why is that? – variable Jan 05 '22 at 11:15
@variable It all depends on how the string was encoded in the byte array. There are many ways to encode it, and for each way to encode there is a single way to decode. – xanatos Jan 05 '22 at 12:55
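To illustrate the point about diamonds, here is a small sketch under assumed names (`DecodeCheck` and its methods are hypothetical). With the default encodings, bytes that are not valid for the chosen encoding are silently replaced by U+FFFD, which many fonts draw as a diamond with a question mark. `Convert.ToBase64String` never fails in this way because it does not decode text at all; it only re-writes the raw bytes as Base64 characters. A strict `UTF8Encoding` instance can be used to detect the mismatch instead of hiding it.

```csharp
using System;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class DecodeCheck
{
    // Lenient decoding: invalid UTF-8 sequences become U+FFFD (the "diamond").
    public static string LenientUtf8(byte[] bytes) => Encoding.UTF8.GetString(bytes);

    // Strict decoding: reports failure instead of inserting U+FFFD.
    public static bool TryStrictUtf8(byte[] bytes, out string text)
    {
        // Second argument (throwOnInvalidBytes) makes GetString throw on bad input.
        var strict = new UTF8Encoding(false, true);
        try
        {
            text = strict.GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            text = string.Empty;
            return false;
        }
    }

    // Base64 is a byte-to-text representation, not a character decoding,
    // so it accepts any byte sequence.
    public static string AsBase64(byte[] bytes) => Convert.ToBase64String(bytes);
}
```

If `TryStrictUtf8` returns false for your data, the bytes were probably not produced by a UTF-8 encoder, and the encoding that was actually used has to be found out from the source of the data.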

score 9 · Answer 2 · answered Feb 20 '11 at 10:07

Try this

System.Text.UnicodeEncoding.Unicode.GetString

answered Feb 20 '11 at 10:07

Anuraj


If I use this, it returns diamonds in the result, whereas `Convert.ToBase64String` works. Why is that? – variable Jan 05 '22 at 11:15

score 1 · Answer 3 · answered Feb 09 '17 at 19:09

Use UTF8 (I think you mean "UTF8" rather than "Unicode"); otherwise you'll just get Chinese symbols. ;)

Maybe it helps to change...

var mystring = Encoding.Unicode.GetString(myarray);

...to...

var mystring = Encoding.UTF8.GetString(myarray);

:)
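Which of the two lines is right depends on how the bytes were produced. The sketch below (the class `MismatchDemo` and its method names are hypothetical) shows what each mismatch looks like: UTF-8 bytes read as UTF-16 pair up into single code units that often land in the CJK range, while UTF-16 bytes of Latin text read as UTF-8 keep their letters but gain a `\0` after each one.

```csharp
using System.Text;

// Hypothetical helper class, for illustration only.
public static class MismatchDemo
{
    // Encodes text as UTF-8, then (incorrectly) decodes it as UTF-16 LE.
    // For "An" the bytes 41 6E are read as the single code unit U+6E41.
    public static string Utf8ReadAsUtf16(string text) =>
        Encoding.Unicode.GetString(Encoding.UTF8.GetBytes(text));

    // Encodes text as UTF-16 LE, then (incorrectly) decodes it as UTF-8.
    // For "An" the bytes 41 00 6E 00 are read as 'A', '\0', 'n', '\0'.
    public static string Utf16ReadAsUtf8(string text) =>
        Encoding.UTF8.GetString(Encoding.Unicode.GetBytes(text));
}
```

So "Chinese symbols" after `Encoding.Unicode.GetString` suggest the data is really UTF-8, and embedded `\0` characters after `Encoding.UTF8.GetString` suggest it is really UTF-16.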

answered Feb 09 '17 at 19:09

Froschkoenig84


No, his data is clearly UTF16 (Encoding.Unicode) encoded: A\0n\0o\0n\0i\0... Both UTF8 and Unicode (UTF16) work perfectly with Chinese and any other symbol in the Universal Coded Character Set (UCS). UTF8 is indeed the default in most cases for data serialization, persistence, and transfer. UTF16 is mostly used by operating systems and frameworks for in-memory data to reduce encoding costs. – Dmitry Gusarov May 16 '18 at 10:27
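The `A\0n\0o\0n\0i\0` pattern mentioned in this comment can be reproduced with a short sketch. `LayoutDemo`, `Hex`, and `LooksLikeUtf16LeAscii` are hypothetical names, and the check is only a rough heuristic for Latin text, not a reliable encoding detector.

```csharp
using System;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class LayoutDemo
{
    // Hex dump of the bytes an encoding produces for a string, e.g. "41-00-6E-00".
    public static string Hex(Encoding encoding, string text) =>
        BitConverter.ToString(encoding.GetBytes(text));

    // Rough heuristic (an assumption, not a real detector): ASCII-range text
    // stored as UTF-16 LE has an even length and a zero in every second byte.
    public static bool LooksLikeUtf16LeAscii(byte[] bytes)
    {
        if (bytes.Length == 0 || bytes.Length % 2 != 0) return false;
        for (int i = 1; i < bytes.Length; i += 2)
        {
            if (bytes[i] != 0) return false;
        }
        return true;
    }
}
```

`LayoutDemo.Hex(Encoding.Unicode, "Anoni")` gives `41-00-6E-00-6F-00-6E-00-69-00`, while `LayoutDemo.Hex(Encoding.UTF8, "Anoni")` gives `41-6E-6F-6E-69`, which is why the zero bytes in the question's data point to UTF-16.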
