5

I have a JSON response that contains many escape sequences such as \u003c or \u00252. I need a function that decodes these sequences into the actual characters.

Programmer Bruce
EBAG

3 Answers

4

There are various posts about how to deserialize JSON strings. One of them shows a nice generic method for deserializing; the code below is taken from there.

// Requires: using System.IO; using System.Text; using System.Runtime.Serialization.Json;
public static T Deserialise<T>(string json)
{
    // ReadObject reads from a Stream, so wrap the JSON text in a MemoryStream.
    using (MemoryStream ms = new MemoryStream(Encoding.Unicode.GetBytes(json)))
    {
        // typeof(T) gives the target type without creating a throwaway instance.
        DataContractJsonSerializer serializer = new DataContractJsonSerializer(typeof(T));
        return (T)serializer.ReadObject(ms);
    }
}
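As a rough usage sketch for the generic method above: the `Message` class, its `text` key, and the sample JSON are hypothetical names invented for illustration, and this version encodes the input as UTF-8 rather than UTF-16, which is an assumption of the sketch. The point is that a full JSON parser already decodes `\uXXXX` escapes while reading string values.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Json;
using System.Text;

// Hypothetical DTO; the class, property, and JSON key names are illustrative only.
[DataContract]
public class Message
{
    [DataMember(Name = "text")]
    public string Text { get; set; }
}

public static class JsonDemo
{
    public static T Deserialise<T>(string json)
    {
        // Assumption: UTF-8 bytes are used here instead of Encoding.Unicode.
        using (var ms = new MemoryStream(Encoding.UTF8.GetBytes(json)))
        {
            var serializer = new DataContractJsonSerializer(typeof(T));
            return (T)serializer.ReadObject(ms);
        }
    }

    public static void Main()
    {
        // The verbatim string keeps the backslash, so the parser sees the raw escape.
        string json = @"{""text"":""a \u003c b""}";
        Message m = Deserialise<Message>(json);
        Console.WriteLine(m.Text); // a < b
    }
}
```

This only helps when the whole response is valid JSON and you are willing to map it onto a type; for a bare string fragment, a direct decoder is lighter.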

Having re-read your post: if you are just looking for a way to decode the \uXXXX escapes into the characters they represent, check out this post. Original credit to @Adam Sills for this code.

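// Requires: using System.Globalization; using System.Text.RegularExpressions;
static string DecodeEncodedNonAsciiCharacters(string value)
{
    return Regex.Replace(
        value,
        // Match a literal \u followed by exactly four hex digits.
        @"\\u(?<Value>[a-fA-F0-9]{4})",
        // Parse the hex digits and emit the corresponding UTF-16 code unit.
        m => ((char)int.Parse(m.Groups["Value"].Value, NumberStyles.HexNumber)).ToString());
}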
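To see the regex-based decoder above in action, here is a small self-contained sketch. The `UnicodeEscapes.Decode` name is made up for this example, and restricting the character class to hex digits is an assumption of the sketch. It also shows that characters outside the Basic Multilingual Plane, which JSON writes as two consecutive escapes, come out as a valid surrogate pair.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class UnicodeEscapes
{
    // Hypothetical helper name; replaces each \uXXXX escape with its UTF-16 code unit.
    public static string Decode(string value)
    {
        return Regex.Replace(
            value,
            @"\\u(?<Value>[0-9a-fA-F]{4})",
            m => ((char)int.Parse(m.Groups["Value"].Value, NumberStyles.HexNumber)).ToString());
    }

    public static void Main()
    {
        // \u003c is '<' and \u0025 is '%'; the trailing 2 is left untouched.
        Console.WriteLine(Decode(@"blah \u003c blah \u00252 blah")); // blah < blah %2 blah

        // Two escapes forming a surrogate pair decode to one code point (U+1F600).
        string smiley = Decode(@"\ud83d\ude00");
        Console.WriteLine(smiley.Length);                                // 2
        Console.WriteLine(char.ConvertToUtf32(smiley, 0).ToString("X")); // 1F600
    }
}
```

One caveat: the pattern does not know about JSON's escaped backslash, so input containing `\\u003c` (a literal backslash followed by `u003c`) would also be rewritten.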
IndigoDelta
  • Your second solution is magic for my problem, because it avoids the overhead of deserializing the JSON response; I wanted to parse it myself. – EBAG May 27 '11 at 08:45
2

Note I'm assuming you just have the data part of the string, not an entire JSON fragment - i.e.

string s = @"blah \u003c blah \u00252 blah";

If the above assumption is wrong and you have a full JSON fragment, just use JavaScriptSerializer to get an object from the data.

Annoyingly, HttpUtility has a JavaScript string encode method (JavaScriptStringEncode) but no matching decode.

You could wrap the string in a full JSON object and deserialize that, though this seems like overkill:

class Dummy
{
    public string foo { get; set; }
}
static void Main(string[] args)
{
    string s = @"blah \u003c blah \u00252 blah";
    string json = @"{""foo"":""" + s + @"""}";
    string unencoded = new JavaScriptSerializer().Deserialize<Dummy>(json).foo;
}
Marc Gravell
  • You need to escape the `\` in your example, since otherwise C# already translates the sequence into the corresponding Unicode character. Or use a `@""` verbatim string. – CodesInChaos May 27 '11 at 08:26
0

I'm not sure, but I think you can construct a char directly from the Unicode character code:

char c = '\u003C'; // c == 60, i.e. '<'
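A short sketch of a few ways to get a character from its numeric code in C#; the `CharDemo` class and `FromCodePoint` helper are names invented for this example. Note that a char literal only helps when the code is known at compile time, so it does not by itself decode escapes that arrive inside a string at run time.

```csharp
using System;

public static class CharDemo
{
    // Hypothetical helper: returns a string because code points above U+FFFF need two chars.
    public static string FromCodePoint(int codePoint)
    {
        return char.ConvertFromUtf32(codePoint);
    }

    public static void Main()
    {
        char fromLiteral = '\u003C'; // Unicode escape inside a char literal
        char fromCast = (char)0x3C;  // cast from the numeric code

        Console.WriteLine(fromLiteral);             // <
        Console.WriteLine(fromCast == fromLiteral); // True
        Console.WriteLine(FromCodePoint(0x3C));     // <
    }
}
```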
InBetween