I'm working on a C# console app and have been asked to parse some JSON data from a webpage and pull certain values from it, such as product price and colour.

My JSON data, pulled from a webpage using HtmlAgilityPack, is below. I had to replace \" with " to make it valid JSON. As a side question: how can C# handle this properly?

{
    "currentAsinData": {
        "Asin": "B0013NCYX4",
        "buyingPPU": "",
        "variantImages": [
            {
                "tinyImage": {
                    "HEIGHT": "70",
                    "URL": "http: //ecx.images-amazon.com/images/I/419CBUN6h8L._SL110_.jpg",
                    "WIDTH": "110"
                },
                "swatchImage": {
                    "HEIGHT": "19",
                    "URL": "http: //ecx.images-amazon.com/images/I/419CBUN6h8L._SL30_.jpg",
                    "WIDTH": "30"
                },
                "mediumImage": {
                    "HEIGHT": "168",
                    "URL": "http: //ecx.images-amazon.com/images/I/419CBUN6h8L._SX168_.jpg",
                    "WIDTH": "168"
                },
                "largeImage": {
                    "HEIGHT": "270",
                    "URL": "http: //ecx.images-amazon.com/images/I/419CBUN6h8L._SX270_.jpg",
                    "WIDTH": "270"
                },
                "thumbnailImage": {
                    "HEIGHT": "120",
                    "URL": "http: //ecx.images-amazon.com/images/I/419CBUN6h8L._SX120_.jpg",
                    "WIDTH": "120"
                }
            }
        ]
    }
}

Now, the above JSON is correct as far as I know, but I'm unable to read the data: C# doesn't allow a bare " inside a string literal, and if I use \" my JArray fails to deserialize the object.

I'm new to JSON in C# and am using the JSON.NET library. My end goal is to parse the JSON so I can retrieve the data into C# strings for further use, but I'm stuck on how to do this.

Thanking you in advance!

More information as requested.

My code to scrape the JavaScript JSON data is here:

string theScript = xd.SelectSingleNode(".//div[contains(@class,'webstore-ProductJSONData')]/script[contains(.,'var detailData')]").GetInnerXML().HtmlDecode();
if (theScript != null)
{
    string[] varsln = Regex.Split(theScript, "var detailData =");
    string json = varsln[1].HtmlDecode().Replace("};\nvar extensibilityData = {};\n\r\n//]]>//", "").Trim();

    Console.WriteLine(json);
}

The webpage I am taking the JSON from

http://www.dangleberrymusic.co.uk/Childrens-Childs-Electric-Guitar-  quarter/dp/B00ESEOXWK?class=quickView&field_availability=-1&field_browse=1592919031&id=Childrens+Childs+Electric+Guitar+quarter&ie=UTF8&refinementHistory=color_map%2Cbrandtextbin%2Csubjectbin%2Cprice%2Csize_name&searchNodeID=1592919031&searchPage=1&searchRank=salesrank&searchSize=12
Victor Sigler
user3091209

2 Answers

You can model your JSON as an object and then use Json.NET to deserialize it.

AsinData ad = JsonConvert.DeserializeObject<AsinData>(json);
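
For the sample data in the question, the model might look like the following minimal sketch. The class, property and method names (ProductRoot, AsinData, ImageInfo, ProductParser.Parse) are assumptions made for illustration, not part of Json.NET, and only a few fields are declared. Because the sample nests everything under "currentAsinData", a wrapper type is assumed for the root object.

```csharp
using System.Collections.Generic;
using Newtonsoft.Json;

// Hypothetical wrapper for the root object, which holds "currentAsinData".
public class ProductRoot
{
    [JsonProperty("currentAsinData")]
    public AsinData CurrentAsinData { get; set; }
}

// Hypothetical model of "currentAsinData"; undeclared fields are ignored.
public class AsinData
{
    [JsonProperty("Asin")]
    public string Asin { get; set; }

    // Each array element maps an image kind ("tinyImage", "largeImage", ...)
    // to its details, so a dictionary keyed by that name is used here.
    [JsonProperty("variantImages")]
    public List<Dictionary<string, ImageInfo>> VariantImages { get; set; }
}

// Hypothetical model of one image entry; the sample stores sizes as strings.
public class ImageInfo
{
    [JsonProperty("URL")]
    public string Url { get; set; }

    [JsonProperty("WIDTH")]
    public string Width { get; set; }

    [JsonProperty("HEIGHT")]
    public string Height { get; set; }
}

public static class ProductParser
{
    // Deserializes the whole document into the wrapper type.
    public static ProductRoot Parse(string json)
    {
        return JsonConvert.DeserializeObject<ProductRoot>(json);
    }
}
```

With these types, ProductParser.Parse(json).CurrentAsinData.VariantImages[0]["largeImage"].Url would give the large image URL as a plain C# string.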
kdubau

I think the problem here is that your JavaScript-scraping code is removing the closing brace from the data, which then prevents Json.NET from parsing it as JSON. You have this:

 .Replace("};\nvar extensibilityData = {};\n\r\n//]]>//", "")

But it should be this:

 .Replace(";\nvar extensibilityData = {};\n\r\n//]]>//", "")
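
As an alternative to matching the exact trailing text, the JSON could be cut out by position. This is only a sketch: ScriptJson.ExtractDetailData is a hypothetical helper, and it assumes the script always has the form "var detailData = {...};" followed by a "var extensibilityData" statement, as in the question.

```csharp
using System;

public static class ScriptJson
{
    // Returns the text between "var detailData =" and the next
    // "var extensibilityData" statement, without the trailing semicolon.
    // Returns null when the start marker is not present.
    public static string ExtractDetailData(string script)
    {
        const string startMarker = "var detailData =";
        const string endMarker = "var extensibilityData";

        int start = script.IndexOf(startMarker, StringComparison.Ordinal);
        if (start < 0)
        {
            return null;
        }
        start += startMarker.Length;

        // If the end marker is missing, fall back to the end of the script.
        int end = script.IndexOf(endMarker, start, StringComparison.Ordinal);
        if (end < 0)
        {
            end = script.Length;
        }

        // Trim whitespace, then drop the statement-ending semicolon;
        // the closing brace of the object is kept.
        return script.Substring(start, end - start).Trim().TrimEnd(';').TrimEnd();
    }
}
```

This keeps the closing brace regardless of the exact line endings that follow it.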

Once you've got a correct JSON string you can deserialize it like this:

JToken token = JToken.Parse(json);  // works with either objects or arrays

From there you can use Json.NET's LINQ-to-JSON API to get the data you want from the JToken. The documentation has sample code that shows how to query for specific values.
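
For example, assuming the JSON has the shape shown in the question, individual values could be read with SelectToken as sketched below. The AsinQueries class and its method names are hypothetical helpers; SelectToken and the explicit cast to string are part of Json.NET's LINQ-to-JSON API.

```csharp
using Newtonsoft.Json.Linq;

public static class AsinQueries
{
    // Reads the "Asin" value; returns null if the path does not exist.
    public static string GetAsin(JToken token)
    {
        return (string)token.SelectToken("currentAsinData.Asin");
    }

    // Reads the URL of one image kind (e.g. "largeImage") from the first
    // entry of "variantImages"; returns null if the path does not exist.
    public static string GetImageUrl(JToken token, string imageKind)
    {
        string path = "currentAsinData.variantImages[0]." + imageKind + ".URL";
        return (string)token.SelectToken(path);
    }
}
```

Calling AsinQueries.GetImageUrl(token, "largeImage") on the parsed token would then return the large image URL as a C# string.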

Brian Rogers