
Possible Duplicate:
To which character encoding (Unicode version) set does a char object correspond?

I'm a little afraid to ask this, as I'm sure it's been asked before, but I can't find it. It's probably something obvious, but I've never studied encoding before.

int Convert(char c)
{
    return (int)c;
}

What encoding is produced by that method? I thought it might be ASCII (at least for values below 128), but running the code below produced... smiley faces as the first characters? What? Definitely not ASCII...

for (int i = 0; i < 128; i++)
    Console.WriteLine(i + ": " + (char)i);
  • This may or may not answer your question: http://stackoverflow.com/questions/6549054/to-which-character-encoding-unicode-version-set-does-a-char-object-correspond – BoltClock May 13 '12 at 15:25
  • It did, thanks. How do I close this question as answered by that, or do you mind doing that? And to those of you looking at this question later, it's UTF-16. I can't seem to find a table of the characters in UTF-16 to verify the first ones are "smiley faces" but I assume they are correct. –  May 13 '12 at 15:33

2 Answers


C# char uses the UTF-16 encoding. The language specification, 1.3 Types and variables, says:

Character and string processing in C# uses Unicode encoding. The char type represents a UTF-16 code unit, and the string type represents a sequence of UTF-16 code units.

UTF-16 overlaps with ASCII in that the character codes in the ASCII range 0-127 mean the same thing in UTF-16 as in ASCII. The smiley faces in your program's output are presumably how your console interprets the non-printable characters in the range 0-31.
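The overlap described above can be checked directly. The following is a minimal sketch, not part of the original answer: the class name `CharCodeDemo` and the helper `MatchesAscii` are made up for illustration, and `Convert` simply repeats the cast from the question. It compares each code unit value with the byte that `Encoding.ASCII` produces, and prints a placeholder for control characters rather than relying on whatever glyph the console happens to choose.

```csharp
using System;
using System.Text;

static class CharCodeDemo
{
    // Same cast as in the question: yields the numeric value of the UTF-16 code unit.
    public static int Convert(char c) => (int)c;

    // Hypothetical helper: true when the code unit value equals the single
    // byte that Encoding.ASCII produces for the same character.
    public static bool MatchesAscii(char c)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(new[] { c });
        return bytes.Length == 1 && bytes[0] == Convert(c);
    }

    static void Main()
    {
        for (int i = 0; i < 128; i++)
        {
            char c = (char)i;
            // Control characters (0-31 and 127) have no standard glyph,
            // so show a placeholder instead of writing them to the console.
            string shown = char.IsControl(c) ? "<control>" : c.ToString();
            Console.WriteLine($"{i}: {shown} (matches ASCII: {MatchesAscii(c)})");
        }
    }
}
```

Every value in the range 0-127 should report a match. For characters outside that range, `Encoding.ASCII` substitutes `?`, so the comparison fails, which is where UTF-16 and ASCII part ways.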

David Heffernan

Each char is a UTF-16 code unit. However, you should use the proper Encoding class to ensure that the Unicode is normalized. See C# and UTF-16 characters.
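As the specification quoted in the other answer says, a char is a UTF-16 code unit, which is not always a whole code point: characters outside the Basic Multilingual Plane take two char values (a surrogate pair). The sketch below illustrates the difference; the class name `CodeUnitDemo` and both helper methods are hypothetical names chosen for this example, and only `char.IsHighSurrogate`, `char.IsLowSurrogate`, and `char.ConvertToUtf32` come from the framework.

```csharp
using System;

static class CodeUnitDemo
{
    // Number of UTF-16 code units (char values) in the string.
    public static int CodeUnitCount(string s) => s.Length;

    // Number of Unicode code points, counting each well-formed surrogate pair once.
    public static int CodePointCount(string s)
    {
        int count = 0;
        for (int i = 0; i < s.Length; i++)
        {
            bool isPair = char.IsHighSurrogate(s[i])
                && i + 1 < s.Length
                && char.IsLowSurrogate(s[i + 1]);
            if (isPair)
                i++; // skip the low surrogate; the pair is one code point
            count++;
        }
        return count;
    }

    static void Main()
    {
        string smiley = "\U0001F600"; // U+1F600, outside the Basic Multilingual Plane
        Console.WriteLine(CodeUnitCount(smiley));  // prints 2
        Console.WriteLine(CodePointCount(smiley)); // prints 1
        Console.WriteLine(char.ConvertToUtf32(smiley, 0).ToString("X")); // prints 1F600
    }
}
```

Casting a single char to int therefore gives a complete character code only for code points up to U+FFFF; for anything higher, the cast returns one half of a surrogate pair.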

Jeow Li Huan