
I am working on a small project to process a large dataset from a game so that I can view some advanced analytics. However, I have hit a wall. The URL in the code below is an open API that returns a JSON result, and I am trying to process that result into my own data.

I have written a class that should load this data into my model. However, every time the client.DownloadString call (line 5 of the method below) runs, I receive a 403 error. Is there any way around this? I do not know the owner of the API.

public IActionResult Index(object sender, EventArgs e)
{
    var model = new FresnoVm();
    WebClient client = new WebClient();
    string strPageCode = client.DownloadString("https://api.upx.world/bigdata/query?neighborhood=210&neighborhood=359&neighborhood=367&neighborhood=366&neighborhood=356&neighborhood=364&city=0&status=All&mintMin=0&mintMax=100000000&saleMin=0&saleMax=100000000&skip=0&fsa=All&sort=mint_price&ascOrDesc=1");
    dynamic dobj = JsonConvert.DeserializeObject<dynamic>(strPageCode);

    string price = dobj["data"]["properties"]["sale_price_upx"].ToString();

    model.test = price;

    return View("~/Features/Fresno/Index.cshtml", model);
}
dbc
M.E_
  • Does this answer your question? [WebClient 403 Forbidden](https://stackoverflow.com/questions/3272067/webclient-403-forbidden) – Klaus Gütter Apr 03 '21 at 13:59
  • This is unrelated to the 403 error, but if you are receiving JSON larger than 85,000 bytes you should deserialize it directly from the response stream rather than downloading it as a string and parsing that, as suggested in e.g. https://www.newtonsoft.com/json/help/html/Performance.htm#MemoryUsage – dbc Apr 03 '21 at 14:29

2 Answers

Adding "User-Agent: Other" to wb.Headers, as suggested in this answer by Borg8 to WebClient 403 Forbidden, is sufficient to download the response string. However, since the returned string is roughly 1.6MB, it would be better to deserialize directly from the response stream, as recommended in Newtonsoft's Performance Tips: Optimize Memory Usage.

First, define the following helper method:

using System;
using System.IO;
using System.Net;
using Newtonsoft.Json;

public static class JsonExtensions
{
    // Requests the URL and deserializes the JSON directly from the response stream,
    // without creating an intermediate string.
    public static T GetFromJson<T>(string url, JsonSerializerSettings settings = default)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        // User-Agent as suggested by this answer https://stackoverflow.com/a/6905471/3744182
        // by https://stackoverflow.com/users/873595/borg8
        // To https://stackoverflow.com/questions/3272067/webclient-403-forbidden
        request.UserAgent = "Other";

        using (var response = (HttpWebResponse)request.GetResponse())
        {
            if (response.StatusCode == HttpStatusCode.OK)
            {
                using (var stream = response.GetResponseStream())
                using (var textReader = new StreamReader(stream))
                {
                    settings = settings ?? new JsonSerializerSettings { CheckAdditionalContent = true };
                    return (T)JsonSerializer.CreateDefault(settings).Deserialize(textReader, typeof(T));
                }
            }
            else
            {
                // Replace with an exception type and message of your choice
                throw new ApplicationException($"Unexpected HTTP status code: {response.StatusCode}");
            }
        }
    }
}

And then you can do:

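// Requires: using System.Linq; using Newtonsoft.Json.Linq;
// url is the API URL from the question.
var dobj = JsonExtensions.GetFromJson<JToken>(url);
var prices = dobj["data"]["properties"].Select(t => (decimal?)t["last_paid_price_upx"]).ToList(); // Or cast to (string) if you prefer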

Notes:

  • There is no property named "sale_price_upx" in the returned JSON. Some, but not all, of the data.properties[*] objects contain a "last_paid_price_upx" property, so the code above shows an example of how to extract those as nullable decimals.

  • The LINQ-to-JSON document object model can use significant amounts of memory for property names. You may get better performance by deserializing large amounts of data directly to an explicit data model that contains only the properties you need.
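
The explicit data model mentioned in the last note could look roughly like the sketch below. The class names (UpxResponse, UpxData, UpxProperty, UpxPriceReader) are hypothetical and not part of the API; the sketch assumes, as the code above does, that data.properties is an array of objects and that "last_paid_price_upx" is numeric when present. Json.NET ignores JSON properties that have no matching member by default, so only the mapped property is materialized.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Newtonsoft.Json;

// Hypothetical model classes: only the properties this answer mentions are mapped.
public class UpxResponse
{
    [JsonProperty("data")]
    public UpxData Data { get; set; }
}

public class UpxData
{
    [JsonProperty("properties")]
    public List<UpxProperty> Properties { get; set; }
}

public class UpxProperty
{
    // Nullable because some objects in the response omit this property.
    [JsonProperty("last_paid_price_upx")]
    public decimal? LastPaidPriceUpx { get; set; }
}

public static class UpxPriceReader
{
    // Reads prices from any TextReader, e.g. a StreamReader over the response stream.
    // Returns an empty list if "data" or "properties" is missing.
    public static List<decimal?> ReadPrices(TextReader reader)
    {
        using (var jsonReader = new JsonTextReader(reader))
        {
            var response = JsonSerializer.CreateDefault().Deserialize<UpxResponse>(jsonReader);
            return response?.Data?.Properties?.Select(p => p.LastPaidPriceUpx).ToList()
                   ?? new List<decimal?>();
        }
    }
}
```

With the helper from this answer, the same model can be filled in one call, for example JsonExtensions.GetFromJson<UpxResponse>(url), and the prices read from its Data.Properties list.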

Demo fiddle here.

dbc
  • Perfect, thank you. Do you know why there is no property named sale_price_upx? This is the main one I need, and when you click on the URL it is displayed in the response. – M.E_ Apr 03 '21 at 16:57
  • @M.E_ - no idea. I don't know about `api.upx.world` specifically. – dbc Apr 03 '21 at 17:08

Add a User-Agent header to your WebClient (called wb here) before calling DownloadString, using a single line like the one below:

wb.Headers.Add("User-Agent: Other");   //that is the simple line!
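
Applied to the code in the question, the header has to be set on the WebClient before DownloadString is called. A minimal sketch, assuming the same WebClient approach as the question; the UpxClientFactory class and CreateClient method are hypothetical names used only for illustration:

```csharp
using System.Net;

public static class UpxClientFactory
{
    // Hypothetical helper: creates a WebClient with a User-Agent header already set.
    public static WebClient CreateClient()
    {
        var client = new WebClient();
        // Equivalent in intent to client.Headers.Add("User-Agent: Other")
        client.Headers[HttpRequestHeader.UserAgent] = "Other";
        return client;
    }
}
```

In the question's Index method, this would replace new WebClient(): create the client with UpxClientFactory.CreateClient() and then call client.DownloadString(url) as before.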
M.E_
  • While that is sufficient to download the string, it would be better to deserialize from a stream without creating the intermediate 1.7MB string as shown here: https://dotnetfiddle.net/MKkuUi. – dbc Apr 03 '21 at 16:35
  • Thank you very much let me give this a shot now! – M.E_ Apr 03 '21 at 16:38