
I have a JSON string whose values contain HTML-encoded markup, shown below, which I get after applying the Shopify Liquid escape filter. I am trying to decode the inner HTML and deserialize the string into a JObject.

{
    "testId": 494254,
    "languageIdentifier": "en_us",
    "overview":"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Overview&#39;ff&#39; Test&lt;/span&gt;&lt;/p&gt;",
    "responsibilities":"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Responsibilities&lt;/span&gt;&lt;/p&gt;",
    "qualifications":"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Qualifications&lt;/span&gt;&lt;/p&gt;",
    "guidance":"z_used_Guidance Test",
    "additionalDetailsForInternalCandidates":"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Additional Details&lt;/span&gt;&lt;/p&gt;",
    "requisitionNotes":"Requisition Notes"
}

The actual HTML, before escaping, is:

{
    "testId": 494254,
    "languageIdentifier": "en_us",
    "overview": "<p style=\"margin: 0px;\"><span>Overview'ff' Test</span></p>",
    "responsibilities": "<p style=\"margin: 0px;\"><span>Responsibilities</span></p>",
    "qualifications": "<p style=\"margin: 0px;\"><span>Qualifications</span></p>",
    "additionalDetailsForInternalCandidates":"<p style=\"margin: 0px;\"><span>Additional Details</span></p>",
    "guidance":"z_used_Guidance Test",
    "requisitionNotes":"Requisition Notes"
}

However, when I decode the string and then deserialize it, it fails. My code is:

string test = "{\r\n\t\"testId\": 494254,\r\n\t\"languageIdentifier\": \"en_us\",\r\n\t\"overview\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Overview&#39;ff&#39; Test&lt;/span&gt;&lt;/p&gt;\",\t\r\n\t\"responsibilities\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Responsibilities&lt;/span&gt;&lt;/p&gt;\",\r\n\t\"qualifications\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Qualifications&lt;/span&gt;&lt;/p&gt;\",\r\n\t\"guidance\":\"z_used_Guidance Test\",\r\n\t\"additionalDetailsForInternalCandidates\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Additional Details&lt;/span&gt;&lt;/p&gt;\",\r\n\t\"requisitionNotes\":\"Requisition Notes\"}";
var jsonString = HttpUtility.HtmlDecode(test);
var objectjson = JsonConvert.DeserializeObject<JObject>(jsonString);

However, I get this error:

'After parsing a value an unexpected character was encountered: m. Path 'overview', line 4, position 23.'

Can anyone help me decode the HTML-encoded values and deserialize the result into a JObject? I want the values as decoded HTML strings, like the original input:

{
    "testId": 494254,
    "languageIdentifier": "en_us",
    "overview": "<p style=\"margin: 0px;\"><span>Overview'ff' Test</span></p>",
    "responsibilities": "<p style=\"margin: 0px;\"><span>Responsibilities</span></p>",
    "qualifications": "<p style=\"margin: 0px;\"><span>Qualifications</span></p>",
    "additionalDetailsForInternalCandidates":"<p style=\"margin: 0px;\"><span>Additional Details</span></p>",
    "guidance":"z_used_Guidance Test",
    "requisitionNotes":"Requisition Notes"
}

Thanks in advance.

jjo
Na.B
  • What is the actual "JSON" that you're receiving? Not the view you get when you hover over the string, but the view you get when you click the ? – ProgrammingLlama Jul 08 '20 at 09:42
  • I want a decoded html string and HtmlDecode was failing without doing the replace. – Na.B Jul 08 '20 at 09:42
  • At the moment the strings you're showing have a lot of escape characters that shouldn't be present in JSON outside of strings (i.e. within the JSON, not as part of the format). – ProgrammingLlama Jul 08 '20 at 09:44
  • { "testId": 494254, "languageIdentifier": "en_us", "overview":"<p style="margin: 0px;"><span>Overview 'Test'</span></p>", "responsibilities":"<p style="margin: 0px;"><span>Responsibilities</span></p>", "qualifications":"<p style="margin: 0px;"><span>Qualifications</span></p>", "guidance":"z_used_Guidance Test", "additionalDetailsForInternalCandidates":"<p style="margin: 0px;"><span>Additional Details</span></p>", "requisitionNotes":"Requisition Notes"} – Na.B Jul 08 '20 at 09:45
  • If that's genuinely the string you're receiving, then you've got some problems. Within `"overview"`, you have unescaped quotes for its content. The JSON is therefore badly formed. P.S. I recommend adding that as an _edit_ to your question. – ProgrammingLlama Jul 08 '20 at 09:46
  • Is there a way to remove such escaping? – Na.B Jul 08 '20 at 09:46
  • You've got it the wrong way round. There isn't any escaping. This is the issue. – ProgrammingLlama Jul 08 '20 at 09:47
  • For all of the HTML attributes, I would expect to see `\"`, so that it's clear that it's part of the text, and not part of the JSON format. As it is, JSON.NET interprets the value of "overview" as `<p style=`. – ProgrammingLlama Jul 08 '20 at 09:49
  • The actual HTML is `<p style="margin: 0px;"><span>Overview Test</span></p>`. We are escaping it using Shopify Liquid templates, and after the escape we get this string. There is no unescape available in Shopify Liquid, so we have to do it manually. – Na.B Jul 08 '20 at 09:50
  • It seems to be neither here nor there how the data looks in Shopify. The fact is, the string you've provided above doesn't have any escaping of quotes in the HTML, therefore it's far from a simple task for a computer to decide which part is JSON and which part is HTML. If you fix the problem where that string is generated, you fix your entire issue. I'm not really sure how I can make that clearer. – ProgrammingLlama Jul 08 '20 at 09:53
  • Sure let me test the source of the string. Will update in this thread. – Na.B Jul 08 '20 at 09:56
  • I have updated the description, can you help me with the ask? Please let me know if you need further information. – Na.B Jul 08 '20 at 10:31

2 Answers


You assume that you have an HTML-encoded JSON file. That is not the case. What you have is a JSON file containing HTML-encoded values. That's something different.

It means that you first need to parse the JSON and then HTML-decode the value.

string jsonString = @"{
    ""testId"": 494254,
    ""languageIdentifier"": ""en_us"",
    ""overview"":""&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Overview&#39;ff&#39; Test&lt;/span&gt;&lt;/p&gt;"",  
    ""responsibilities"":""&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Responsibilities&lt;/span&gt;&lt;/p&gt;"",
    ""qualifications"":""&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Qualifications&lt;/span&gt;&lt;/p&gt;"",
    ""guidance"":""z_used_Guidance Test"",
    ""additionalDetailsForInternalCandidates"":""&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Additional Details&lt;/span&gt;&lt;/p&gt;"",
    ""requisitionNotes"":""Requisition Notes""
}";

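// Parse the JSON first: the HTML entities are ordinary characters inside the string values.
var objectjson = JsonConvert.DeserializeObject<JObject>(jsonString);
// Then HTML-decode the individual value (HttpUtility is in the System.Web namespace).
var htmlEncodedValue = objectjson.Value<string>("overview");
var decodedValue = HttpUtility.HtmlDecode(htmlEncodedValue);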
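
If you need every field decoded rather than just `overview`, one option is to walk all string values after parsing and decode each in place. This is only a minimal sketch, assuming the Newtonsoft.Json package is referenced. `HtmlJsonDecoder` and `ParseAndDecode` are illustrative names, not part of any library, and `System.Net.WebUtility.HtmlDecode` is swapped in for `HttpUtility.HtmlDecode` so that no System.Web reference is needed.

```csharp
using System;
using System.Linq;
using System.Net;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class HtmlJsonDecoder
{
    // Hypothetical helper: parse the JSON first, then HTML-decode every string value.
    public static JObject ParseAndDecode(string json)
    {
        JObject obj = JObject.Parse(json);

        // Collect the string values up front so the tree is not enumerated while being modified.
        var stringValues = obj.Descendants()
            .OfType<JValue>()
            .Where(v => v.Type == JTokenType.String)
            .ToList();

        foreach (JValue value in stringValues)
        {
            value.Value = WebUtility.HtmlDecode((string)value.Value);
        }

        return obj;
    }

    public static void Main()
    {
        // Shortened sample input (an assumption); the real payload has more fields.
        string json = "{\"testId\": 494254, \"overview\": \"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Overview&#39;ff&#39; Test&lt;/span&gt;&lt;/p&gt;\"}";

        JObject decoded = ParseAndDecode(json);

        // The decoded HTML for a single field.
        Console.WriteLine((string)decoded["overview"]);

        // Serializing again lets Json.NET escape the inner quotes as \" for you.
        Console.WriteLine(decoded.ToString(Formatting.Indented));
    }
}
```

Because the decoding happens after parsing, the quotes produced by `&quot;` never reach the JSON reader, and serializing the JObject again should give JSON with properly escaped attribute quotes, as in the desired output in the question.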
Heinzi

You can do this in C# as shown below.

  1. Libraries

    using Newtonsoft.Json;
    using Newtonsoft.Json.Linq;
    
  2. Code

    string test = "{\r\n\t\"testId\": 494254,\r\n\t\"languageIdentifier\": \"en_us\",\r\n\t\"overview\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Overview Test&lt;/span&gt;&lt;/p&gt;\",\t\r\n\t\"responsibilities\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Responsibilities&lt;/span&gt;&lt;/p&gt;\",\r\n\t\"qualifications\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Qualifications&lt;/span&gt;&lt;/p&gt;\",\r\n\t\"guidance\":\"z_used_Guidance Test\",\r\n\t\"additionalDetailsForInternalCandidates\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Additional Details&lt;/span&gt;&lt;/p&gt;\",\r\n\t\"requisitionNotes\":\"Requisition Notes\"}";
    
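    // Parse the JSON first; the HTML entities are plain text inside the string values.
    JToken objectData = JToken.Parse(test);
    var testId = objectData["testId"];
    var languageIdentifier = objectData["languageIdentifier"];
    // Then HTML-decode any value that contains markup.
    var overview = System.Net.WebUtility.HtmlDecode((string)objectData["overview"]);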
    
  3. If you want to do it in an HTML page with JavaScript instead of C#, do the following.

    <!DOCTYPE html>
    <html>
    <body>
    <h2>Create Object from JSON String</h2>
    <p id="demo"></p>
    
    <script>
    var txt = "{\r\n\t\"testId\": 494254,\r\n\t\"languageIdentifier\": \"en_us\",\r\n\t\"overview\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Overview Test&lt;/span&gt;&lt;/p&gt;\",\t\r\n\t\"responsibilities\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Responsibilities&lt;/span&gt;&lt;/p&gt;\",\r\n\t\"qualifications\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Qualifications&lt;/span&gt;&lt;/p&gt;\",\r\n\t\"guidance\":\"z_used_Guidance Test\",\r\n\t\"additionalDetailsForInternalCandidates\":\"&lt;p style=&quot;margin: 0px;&quot;&gt;&lt;span&gt;Additional Details&lt;/span&gt;&lt;/p&gt;\",\r\n\t\"requisitionNotes\":\"Requisition Notes\"}"
    var obj = JSON.parse(txt);
    document.getElementById("demo").innerHTML = "TestID : " + obj.testId + " and languageIdentifier : " + obj.languageIdentifier;
    </script>
    </body>
    </html>