
I have a personal C# project that keeps track of live sports data during an event. It does this by scraping a JSON file on the sport's website. The JSON file is continuously updated during the sports event.

However, the page does NOT refresh itself. The existing file is simply overwritten. To monitor the data in real time as desired, I have to send requests continuously for 2-4 hours, from the start of the event to the end.

My code is configured to loop endlessly until I hit the Esc key:

string url = "https://www.example.com/live/feeds/stats.json";
while (!(Console.KeyAvailable && Console.ReadKey(true).Key == ConsoleKey.Escape))
{
    try
    {
        string json = (new WebClient()).DownloadString(url);

        // parse JSON
        ...
    }
    catch (...)
    {
        ...
    }
}

My questions are:

  • If I do send such a high volume of requests for hours at a time, am I at risk of having my IP address blacklisted and access denied?
  • Is it possible to monitor this JSON file continuously without sending a million requests?
    • Can this be done using another language/framework? It doesn't need to be in C#.
jmwilkes
  • I'd recommend using HttpClient rather than WebClient, but for this question WebClient should still work. _"If I do send such a high volume of requests for hours at a time, am I at risk of having my IP address blacklisted and access denied?"_ - Maybe, yes. You'd probably want to throttle requests to a few per minute at most, if not one per minute. _"Is it possible to monitor this JSON file continuously without sending a million requests?"_ - You can limit the request count, but I doubt you can have it pushed to you. – Fildor May 14 '20 at 21:36
  • _"Can this be done using another language/framework?"_ Sure, but with the same restrictions on request frequency. – Fildor May 14 '20 at 21:37
  • 1) Yes, depending on the site policy. 2) If the JSON is present as a JS object on the page, you can try using [`Selenium`](https://stackoverflow.com/questions/6229769/execute-javascript-using-selenium-webdriver-in-c-sharp) – Guru Stron May 14 '20 at 21:53
  • What exactly do you want to do with the data? Also, do you know how often, or at what times, it is generally updated? – CodingKuma May 18 '20 at 05:20

1 Answer


You can just add a sleep to the loop if you want to limit the number of calls:

Thread.Sleep(TimeSpan.FromSeconds(30));

This pauses for 30 seconds between calls; set the interval to whatever frequency you like. It isn't apparent from your code snippet, but if you are in an async method, use the following instead so the thread isn't blocked while waiting:

await Task.Delay(TimeSpan.FromSeconds(30));
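Putting the pieces together, a throttled version of the loop from the question might look roughly like the sketch below. It is only an illustration under some assumptions: it swaps WebClient for a single shared HttpClient (as suggested in the comments), it needs C# 7.1 or later for an async Main, the 30-second interval is an arbitrary starting value, and the `FeedPoller` class and `HasChanged` helper are hypothetical names introduced here, not part of the original code.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class FeedPoller
{
    // Assumed interval; choose whatever the site tolerates and the feed's update rate justifies.
    public static readonly TimeSpan PollInterval = TimeSpan.FromSeconds(30);

    // One shared HttpClient for the whole run instead of a new client per request.
    private static readonly HttpClient Http = new HttpClient();

    // Hypothetical helper: true when the downloaded payload differs from the last one seen.
    public static bool HasChanged(string previous, string current) =>
        !string.Equals(previous, current, StringComparison.Ordinal);

    public static async Task Main()
    {
        string url = "https://www.example.com/live/feeds/stats.json";
        string lastJson = string.Empty;

        // Same exit condition as the question: stop once Esc has been pressed.
        while (!(Console.KeyAvailable && Console.ReadKey(true).Key == ConsoleKey.Escape))
        {
            try
            {
                string json = await Http.GetStringAsync(url);

                // Skip the parsing work when the file has not been overwritten since the last poll.
                if (HasChanged(lastJson, json))
                {
                    lastJson = json;
                    // parse JSON here
                }
            }
            catch (HttpRequestException ex)
            {
                Console.Error.WriteLine($"Request failed: {ex.Message}");
            }
            catch (TaskCanceledException)
            {
                // HttpClient reports a request timeout as a cancelled task.
                Console.Error.WriteLine("Request timed out.");
            }

            // Wait before the next request so the loop does not hammer the server.
            await Task.Delay(PollInterval);
        }
    }
}
```

One side effect of this layout is that the Esc key is only checked once per interval, so the program can take up to 30 seconds to exit after the key is pressed. Comparing the payload with the previous one does not reduce the number of requests; it only avoids re-parsing data that has not changed.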