Is there a way to use supplementary Unicode characters (for example '🃜') as `char` literals in C#? I tried it in VS 2017 with the source file saved as UTF-8 with BOM, UTF-16 LE, and UTF-16 BE, and I always get the error "Too many characters in character literal".

Victor Grigoriu
-
The `char` type is effectively a single UTF-16 code unit. If the character cannot be encoded as a single UTF-16 code unit, then no. – Mike Zboray Feb 18 '17 at 19:20
-
FWIW it is possible to represent it as a string, `"\uD83C\uDCDC"`. – Mike Zboray Feb 18 '17 at 19:24
-
@mikez Note: It's not necessary to use the \u notation. – Tom Blodget Feb 18 '17 at 19:41
-
@mikez: Or just use `\U` instead: `"\U0001F0DC"` – Jon Skeet Feb 18 '17 at 19:41
-
@TomBlodget: It's not *necessary*, but it does mean you don't need to worry about encodings as much, if all your source code is ASCII. – Jon Skeet Feb 18 '17 at 19:42
-
@JonSkeet I wouldn't do that with source code files because that could lead to making the same assumption about other text files, and that just won't do. – Tom Blodget Feb 18 '17 at 19:51
-
Out of curiosity, is there any language that treats codepoints as first class concepts? – Victor Grigoriu Feb 18 '17 at 19:52
-
@TomBlodget: The difference is that I control how I treat other source files in my code, whereas it can (depending on platform etc) be slightly trickier - or at least annoying - to persuade all tools everywhere to handle source code as UTF-8. – Jon Skeet Feb 18 '17 at 22:14
-
@JonSkeet Yes, we do put up with cases of "the cobbler's children have no shoes" in our work. I've been lucky enough to be dogmatic with character encodings. – Tom Blodget Feb 19 '17 at 16:40
1 Answer
No, `char` is one UTF-16 code unit. `String` is a sequence of UTF-16 code units, so if you have a code point that UTF-16 encodes as two code units, use a `String` literal:
"🃜"

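To make the code unit versus code point distinction concrete, here is a minimal sketch. It assumes the character in question is U+1F0DC (the one the `\U0001F0DC` escape in the comments refers to) and that the strings are well-formed UTF-16. `SupplementaryDemo` and `CodePoints` are hypothetical names made up for illustration; the `char` methods are the standard .NET ones.

```csharp
using System;
using System.Collections.Generic;

static class SupplementaryDemo
{
    // Hypothetical helper: yields the code points of a well-formed UTF-16 string.
    // char.ConvertToUtf32 throws ArgumentException on an unpaired surrogate.
    public static IEnumerable<int> CodePoints(string s)
    {
        for (int i = 0; i < s.Length; i++)
        {
            // Combines a surrogate pair starting at index i into one code point.
            int cp = char.ConvertToUtf32(s, i);
            // Skip the low surrogate if this position started a pair.
            if (char.IsSurrogatePair(s, i)) i++;
            yield return cp;
        }
    }

    static void Main()
    {
        // ASCII-only source: the \U escape takes eight hex digits.
        string card = "\U0001F0DC";
        // The same one-code-point string built at run time.
        string same = char.ConvertFromUtf32(0x1F0DC);

        Console.WriteLine(card.Length);                    // 2 (two UTF-16 code units)
        Console.WriteLine(card == same);                   // True
        Console.WriteLine(char.IsHighSurrogate(card[0]));  // True
        Console.WriteLine(char.IsLowSurrogate(card[1]));   // True

        foreach (int cp in CodePoints(card + "A"))
            Console.WriteLine("U+{0:X4}", cp);             // U+1F0DC, then U+0041
    }
}
```

A `char` variable can only ever hold one of the two surrogates, which is why the compiler rejects the literal: after decoding the source file it sees two code units between the single quotes.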
Tom Blodget
-
Right, I was just reading about how to get the codepoints out from a string: https://stackoverflow.com/questions/687359/how-would-you-get-an-array-of-unicode-code-points-from-a-net-string – Victor Grigoriu Feb 18 '17 at 19:50