I have a string that contains UTF-8 encoded emojis, with the bytes written out as escaped hex sequences. I need to convert these escapes back into the actual emojis and print them properly. For example:

input: \\xe2\\x80\\x9c@VineFights: He does not care Lamo!!! 
\\xf0\\x9f\\x98\\x82 https:\\/\\/t.co\\/TwmYFEhx9g\\xe2\\x80\\x9d\\xf0\\x9f\\x98\\x82\\xf0
\\x9f\\x98\\xad\\xf0\\x9f\\x98\\xad 

Expected output: He does not care Lamo!!! 😂 URL”😂😭😭

This is one single string (without line breaks). I have only broken it up so that it fits in one view in this question.
My idea is to extract the escaped bytes using the regex (\\\\x[a-fA-F0-9]{2})+ and replace each match by manually converting the bytes into emojis. This failed in several cases, such as the one in the example. It also feels like an unnecessarily hacky and ugly solution. What's the right way to handle it?

(I'm more interested in how this is actually done in the real world. Any examples are appreciated.)
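
For illustration, the regex idea above could be sketched roughly as follows. The question does not name a language, so Python is an assumption here, and `unescape_utf8` is a hypothetical helper name. The sketch treats each run of consecutive escapes as one byte string and decodes the whole run as UTF-8, so multi-byte characters are never split. It accepts one or more backslashes before each `x`, since it is unclear whether the real data holds single or doubled backslashes, and it also unescapes `\/` on the assumption that the text came from JSON.

```python
import re

# A run of escapes: one or more literal backslashes, 'x', two hex digits,
# repeated. Matching the whole run keeps multi-byte sequences together.
_ESCAPED_RUN = re.compile(r"(?:\\+x[0-9a-fA-F]{2})+")
# Inside a matched run, every 'x' is followed by exactly one hex pair.
_HEX_PAIR = re.compile(r"x([0-9a-fA-F]{2})")
# Escaped forward slashes such as \/ or \\/ (assumed to come from JSON).
_ESCAPED_SLASH = re.compile(r"\\+/")


def unescape_utf8(text: str) -> str:
    """Replace runs of \\xHH escapes with the characters they encode."""

    def decode_run(match: re.Match) -> str:
        raw = bytes(int(pair, 16) for pair in _HEX_PAIR.findall(match.group(0)))
        # Invalid or truncated sequences become U+FFFD instead of raising.
        return raw.decode("utf-8", errors="replace")

    decoded = _ESCAPED_RUN.sub(decode_run, text)
    return _ESCAPED_SLASH.sub("/", decoded)


# The example from the question, as one string with doubled backslashes.
sample = (
    r"\\xe2\\x80\\x9c@VineFights: He does not care Lamo!!! "
    r"\\xf0\\x9f\\x98\\x82 https:\\/\\/t.co\\/TwmYFEhx9g"
    r"\\xe2\\x80\\x9d\\xf0\\x9f\\x98\\x82\\xf0\\x9f\\x98\\xad\\xf0\\x9f\\x98\\xad"
)
result = unescape_utf8(sample)
```

Decoding per run rather than per escape is the key design choice: a single `\xf0` is not valid UTF-8 on its own, so converting escapes one at a time cannot work for emojis. If the escapes were produced by printing a byte string's representation, decoding the original bytes directly would avoid the regex altogether.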

Maxsteel
  • Check this out: http://stackoverflow.com/questions/24840667/what-is-the-regex-to-extract-all-the-emojis-from-a-string. It's the opposite of what you're doing, but you'll get the idea. – Gurwinder Singh Nov 13 '16 at 02:16
  • @Gurwinder Thanks, but I did not get how to do it the other way around. – Maxsteel Nov 13 '16 at 02:24
  • Are you sure there are two consecutive backslashes in each hex escape sequence, rather than one (for instance, `\xe2`)? Many IDEs will escape backslashes in a String variable’s value, for display purposes. – VGR Nov 13 '16 at 04:45

0 Answers