
I have a string that contains some wonky characters (for example, " "). I need to check whether a List contains the first item in the string, but when I index the string, the first element is always \ud835. I tried Char.ConvertFromUtf32(\ud835) and a few other approaches, but I can't work out how to get the first item as a "".

Adam Dernis
  • I'm not following. `'\Ud835'` is a "high surrogate" and not a valid character by itself. Is your string "Lead Backend" and rendered in a wonky font, or is that lead character really wonky and represented by a Unicode surrogate pair? – Flydog57 Aug 10 '18 at 23:14
  • @Flydog57 it's represented by a Unicode surrogate pair – Adam Dernis Aug 10 '18 at 23:15
  • This might help: https://stackoverflow.com/questions/14347799/how-do-i-create-a-string-with-a-surrogate-pair-inside-of-it. Otherwise, search around for material on "surrogate pairs". I've never had to work with them. – Flydog57 Aug 10 '18 at 23:20
  • What is the "first item from the string"? ? ? – Jacob Krall Aug 11 '18 at 00:04
  • @JacobKrall "" is the first item – Adam Dernis Aug 11 '18 at 00:07

1 Answer

The first character of your string is represented with a surrogate pair in UTF-16, the encoding used by .NET.

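A surrogate pair takes up two `char` values (a high surrogate followed by a low surrogate), so indexing the string at 0 returns only the first half of the character: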

        var s = " ";
        Console.WriteLine(new string(new[] { s[0], s[1] }) == "");

There are built-in helper methods like Char.ConvertToUtf32 and Char.IsSurrogate which you can use to figure out if you are in this situation.
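As a rough sketch of how those helpers could be combined to get the first item as a string and look it up in a list: the `FirstItem` method and the sample character below are assumptions made for illustration. The sample uses U+1D40B (MATHEMATICAL BOLD CAPITAL L), written as an escape, because its high surrogate happens to be \uD835; your actual character may be a different one.

```csharp
using System;
using System.Collections.Generic;

public static class FirstCodePoint
{
    // Hypothetical helper: returns the first Unicode code point of s as a string.
    // The result is two chars long if s starts with a valid surrogate pair,
    // otherwise one char long (or empty for a null/empty input).
    public static string FirstItem(string s)
    {
        if (string.IsNullOrEmpty(s)) return string.Empty;
        int length = s.Length >= 2 && char.IsSurrogatePair(s[0], s[1]) ? 2 : 1;
        return s.Substring(0, length);
    }

    public static void Main()
    {
        // Assumed sample input: U+1D40B followed by plain ASCII letters.
        string s = "\U0001D40Bead";
        string first = FirstItem(s);

        // A list of strings rather than chars, since one char cannot hold the pair.
        var allowed = new List<string> { "\U0001D40B", "A" };

        Console.WriteLine(first.Length);                             // 2
        Console.WriteLine(char.ConvertToUtf32(s, 0).ToString("X"));  // 1D40B
        Console.WriteLine(allowed.Contains(first));                  // True
    }
}
```

Note that this only works if the list holds strings (or `int` code points compared via `char.ConvertToUtf32`); a `List<char>` can never contain the whole character, only one of its surrogate halves.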

Jacob Krall