Here is a sample input: "\\u0434\\u0430\\u043C\\u043E". I want to convert it into readable text, and I would appreciate it if accented characters were preserved too. The real input can be longer than this, but this works as a sample.

Yes, I have read http://www.joelonsoftware.com/articles/Unicode.html and "How to print/store non-ASCII characters (unicode?)", but neither answers my question, so please don't mark this as a duplicate. I would appreciate sample code in C#. I also tried HttpUtility.HtmlDecode(), but it doesn't decode the string. Here is the code:

// This comes from a service call, and it arrives exactly like this.
var str = "\\u0434\\u0430\\u043C\\u043E";
var decoded = HttpUtility.HtmlDecode(str); // This doesn't work: it returns str as is.

As a side note, the following works, but my input isn't in that form:

// This works, but my input isn't in this form.
var str2 = "\u0434\u0430\u043C\u043E";
var decoded2 = HttpUtility.HtmlDecode(str2);

How can I correctly decode a string like "\u0434\u0430\u043C\u043E" into readable text?

user3818435
  • @AlexeiLevenkov, HttpUtility.HtmlDecode(str) returns str as is, without decoding it. Here is the code: var str="\\u0434\\u0430\\u043C\\u043E"; var decoded = HttpUtility.HtmlDecode(str); Remember, I'm getting str from a service call, and it has the escape characters as shown above. How will HttpUtility help here? – user3818435 Aug 12 '14 at 04:24
  • I had the wrong suggestion... reopened. Can you please clarify whether your input sample is `@"\u0434"` or `"\u0434"`? – Alexei Levenkov Aug 12 '14 at 04:34
  • Maybe this is what you are looking for - http://stackoverflow.com/questions/13764168/read-utf8-unicode-characters-from-an-escaped-ascii-sequence – Alexei Levenkov Aug 12 '14 at 04:47
  • Finally, my friend, I got it working. It turns out I have to use the Regex.Unescape() method, like this: var str = "\\u0434\\u0430\\u043C\\u043E"; var decoded = HttpUtility.HtmlDecode(Regex.Unescape(str)); – user3818435 Aug 12 '14 at 05:39
  • You should post the answer as an answer and accept it. Don't post the answer as part of the question, since then it is not clear whether the question has an answer. ... Or vote to close as a duplicate of something like http://stackoverflow.com/questions/8558671/how-to-unescape-unicode-string-in-c-sharp – Alexei Levenkov Aug 12 '14 at 06:33

1 Answer

I finally got it working by using the Regex.Unescape() method. In case someone else runs into the same problem, here is how I resolved the issue:

  var str = "\\u0434\\u0430\\u043C\\u043E";
  var decoded = HttpUtility.HtmlDecode(Regex.Unescape(str)); // Note the Regex.Unescape() call.
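
For anyone who wants to try this as a complete program, here is a minimal sketch. As far as I can tell, HttpUtility.HtmlDecode only matters if the text also contains HTML entities such as &amp;, so the sketch calls Regex.Unescape on its own. Keep in mind that Regex.Unescape also converts other backslash escapes (for example \n or \.), which may not be what you want if the input contains other backslashes. The DecodeUnicodeEscapes helper below is a hypothetical name of mine, not a framework method; it is an alternative that only touches \uXXXX sequences.

```csharp
using System;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

public static class UnicodeEscapeDemo
{
    // Hypothetical helper (not part of .NET): replaces each literal \uXXXX
    // sequence with the UTF-16 code unit it names and leaves everything
    // else in the input untouched.
    public static string DecodeUnicodeEscapes(string input)
    {
        return Regex.Replace(
            input,
            @"\\u([0-9A-Fa-f]{4})",
            m => ((char)int.Parse(
                m.Groups[1].Value,
                NumberStyles.HexNumber,
                CultureInfo.InvariantCulture)).ToString());
    }

    public static void Main()
    {
        // Same sample as above: the runtime string holds a literal
        // backslash followed by 'u' and four hex digits.
        var str = "\\u0434\\u0430\\u043C\\u043E";

        // Make sure the console can show non-ASCII characters.
        Console.OutputEncoding = Encoding.UTF8;

        // Option 1: built-in, also unescapes other regex-style escapes.
        Console.WriteLine(Regex.Unescape(str));

        // Option 2: targeted, only \uXXXX sequences are converted.
        Console.WriteLine(DecodeUnicodeEscapes(str));
    }
}
```

Both calls should give the same four Cyrillic characters for the sample input. I would pick the targeted version if the service can return other backslash sequences that must stay as they are.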
user3818435