4

I'm trying to create a string with the emoji "👱" starting from the string "D83DDC71". To do that, I'm trying to convert it into the string "\uD83D\uDC71".

If I use this code, it works (the textbox shows the emoji as expected):

textbox.Text += "\uD83D\uDC71";

but if I use this, it doesn't work (the textbox shows the literal text "\uD83D\uDC71" instead of a single character):

textbox.Text += sender.Code.ToString("X").Insert(4, @"\u").Insert(0, @"\u");

What is the right way to convert the hex representation of an emoji to the corresponding C# string (UTF-16)?

Alexei Levenkov
frenk91
  • The problem is that if I put "\uD83D\uDC71" into the textbox it works, but if I add \u to the string "D83DDC71" I only get plain text and not an emoji – frenk91 Jul 07 '15 at 14:58
  • the standard one of windowsphone – frenk91 Jul 07 '15 at 15:00
  • Are you trying to output "person with blond hair" ? – Jon Hanna Jul 07 '15 at 15:02
  • You could split the string into 4-char chunks and then rejoin them with a "\u" separator. See here for splitting into chunks: http://stackoverflow.com/questions/1450774/splitting-a-string-into-chunks-of-a-certain-size and here for joining: https://msdn.microsoft.com/en-us/library/57a79xd0(v=vs.110).aspx – PaulF Jul 07 '15 at 15:04
  • emoji in general, but in this case yes – frenk91 Jul 07 '15 at 15:04

3 Answers

6

Okay. It seems you have a string that gives the hexadecimal value of each of the UTF-16 code units of the character U+1F471 (👱).

Since char represents a UTF-16 code unit, split the string into two 4-character chunks, parse each chunk as a hexadecimal int, cast each result to char, and then combine them into a string:

// Requires: using System.Globalization; (for NumberStyles)
var personWithBlondHair = ""
  + (char)int.Parse("D83DDC71".Substring(0, 4), NumberStyles.HexNumber)
  + (char)int.Parse("D83DDC71".Substring(4, 4), NumberStyles.HexNumber);

As per https://dotnetfiddle.net/oTgXfG
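To make the relationship between the two code units and U+1F471 concrete, here is a minimal sketch. The class and method names (SurrogateDemo, FromUtf16Hex, ToCodePoint) are made up for illustration and are not part of the answer above. It assumes the input is exactly eight hex digits forming one valid surrogate pair.

```csharp
using System;
using System.Globalization;

public static class SurrogateDemo
{
    // Parses 8 hex digits (two UTF-16 code units) into a 2-char string,
    // following the same split/parse/cast steps described above.
    public static string FromUtf16Hex(string hex)
    {
        char high = (char)int.Parse(hex.Substring(0, 4), NumberStyles.HexNumber);
        char low = (char)int.Parse(hex.Substring(4, 4), NumberStyles.HexNumber);
        return new string(new[] { high, low });
    }

    // Recombines a surrogate pair into its code point by hand:
    // 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00).
    // Assumes high is in D800..DBFF and low is in DC00..DFFF (not checked here).
    public static int ToCodePoint(char high, char low)
    {
        return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
    }
}
```

For "D83DDC71" the arithmetic is 0x10000 + (0x3D << 10) + 0x71 = 0x1F471. The framework offers the same conversions through char.ConvertToUtf32 and char.ConvertFromUtf32, which are probably preferable to the hand-written formula in real code.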

Jon Hanna
3

You have a string containing two 16-bit values in hexadecimal form, so you need to parse them first. My example uses the overload of Convert.ToInt16 that also accepts the base of the number in the string, which in our case is 16 (hexadecimal).

string ParseUnicodeHex(string hex)
{
    var sb = new StringBuilder();
    for (int i = 0; i < hex.Length; i+=4)
    {
        string temp = hex.Substring(i, 4);
        char character = (char)Convert.ToInt16(temp, 16);
        sb.Append(character);
    }
    return sb.ToString();
}

Please note that this method will fail if the string's length isn't divisible by 4.
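If malformed input is a possibility, one option is a variant that reports failure instead of throwing. The following is only a sketch under that assumption. HexUtf16.TryParse is a hypothetical helper name, and it swaps Convert.ToInt16 for ushort.TryParse so that non-hex characters are rejected as well. It does not check that surrogates are correctly paired.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class HexUtf16
{
    // Hypothetical helper: returns false instead of throwing when the input
    // is null, has a length that is not a multiple of 4, or contains
    // characters that are not hex digits.
    public static bool TryParse(string hex, out string result)
    {
        result = null;
        if (hex == null || hex.Length % 4 != 0) return false;

        var sb = new StringBuilder(hex.Length / 4);
        for (int i = 0; i < hex.Length; i += 4)
        {
            ushort unit;
            // AllowHexSpecifier accepts hex digits only (no sign, no whitespace).
            if (!ushort.TryParse(hex.Substring(i, 4), NumberStyles.AllowHexSpecifier,
                                 CultureInfo.InvariantCulture, out unit))
            {
                return false;
            }
            sb.Append((char)unit); // each 16-bit value is one UTF-16 code unit
        }

        result = sb.ToString();
        return true;
    }
}
```

A call such as HexUtf16.TryParse("D83DDC71", out var emoji) should then succeed with the two-char emoji string, while "D83DDC7" (seven digits) is rejected.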

The reason this works:

textbox.Text += "\uD83D\uDC71";

is because you've got a string literal containing Unicode character escape sequences. When you compile your program, the compiler replaces each escape sequence with the corresponding UTF-16 code unit. Escape sequences are only processed in source code at compile time, which is why you cannot just add \u in front of the hex digits during execution to make it work.
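The difference can be seen by comparing lengths. In this small sketch (the EscapeDemo class is invented for illustration), the verbatim literal stands in for the text that the Insert calls in the question build at run time:

```csharp
public static class EscapeDemo
{
    // Regular literal: the compiler turns each \uXXXX escape into one
    // UTF-16 code unit, so this string holds 2 chars (one surrogate pair).
    public const string Compiled = "\uD83D\uDC71";

    // Verbatim literal: no escape processing, so this is the 12 literal
    // characters \ u D 8 3 D \ u D C 7 1, like the string built at run time.
    public const string Runtime = @"\uD83D\uDC71";
}
```

EscapeDemo.Compiled.Length is 2, while EscapeDemo.Runtime.Length is 12, which is why the textbox shows the backslashes and digits as plain text.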

cbr
0

Try this one

// Requires: using System.Globalization; using System.Linq; using System.Text.RegularExpressions;
string str = "D83DDC71";
string emoji = string.Join("", (from Match m in Regex.Matches(str, @"\S{4}")
    select (char) int.Parse(m.Value, NumberStyles.HexNumber)).ToArray());

This separates your string into 4-character chunks, converts each chunk into a char, and finally joins all the chars into one string, the emoji, all in one statement.

M.kazem Akhgary
  • What is `@"\u" + m.Value` supposed to do? – Alexei Levenkov Jul 07 '15 at 15:12
  • ex : `m.Value` will be `"D83D"`. then it will be `"\uD83D"` @AlexeiLevenkov – M.kazem Akhgary Jul 07 '15 at 15:14
  • Have you seen the question? Which is basically "why does `"\uD83D\uDC71"` produce an emoji while `@"\u"+ "d83d" + @"\" + "DC71"` does not"... So I'm not sure how your post answers it. Additionally, `\d` is not going to match letters, but that is less interesting. – Alexei Levenkov Jul 07 '15 at 15:18
  • No, it will be `@"\uD83D"` and you need `"\uD83D"`. But it won't even be that because you are passing `""` instead of `str` to the regular expression matching. Also the regular expression is looking for decimal digits, not hex digits. – Jon Hanna Jul 07 '15 at 15:19
  • @JonHanna that was a typo. I tried this. Maybe I need to convert m.Value to a Unicode char, but I thought this might work – M.kazem Akhgary Jul 07 '15 at 15:20
  • There are still no matches for `\d{4}` in `str`, and still no point adding `@"\u"` to things. – Jon Hanna Jul 07 '15 at 15:23
  • Yes, it should be `\S`. I fixed it. However, I used your algorithm to convert to char. @JonHanna – M.kazem Akhgary Jul 07 '15 at 15:31
  • @AlexeiLevenkov see the edit. Using regex was just an idea; the point was to convert the Unicode string into chars. – M.kazem Akhgary Jul 07 '15 at 15:47
  • Looks correct to me now. I'd personally not use regex for such a thing as splitting a string into fixed-size chunks, but that's completely opinion based :). – Alexei Levenkov Jul 07 '15 at 16:19