
I'm not really a developer; I've only coded during university and for fun. I'm currently working on a small web-scraping project in C#, and I can't seem to download the HTML from the website I want. See the code below.

private async void GetHTML1()
    {
        //Crownbet
        string crownbet_url = "https://crownbet.com.au/sports-betting/australian-rules/afl/afl-matches/";
        HttpClient crownbet_httpClient = new HttpClient();
        string crownbet_html = await crownbet_httpClient.GetStringAsync(crownbet_url);
        HtmlAgilityPack.HtmlDocument crownbet_htmlDocument = new HtmlAgilityPack.HtmlDocument();

        crownbet_htmlDocument.LoadHtml(crownbet_html);
    }

When I run this code I get the following error.

Exception thrown:

'System.Net.WebException' in System.dll An unhandled exception of type 'System.Net.WebException' occurred in System.dll The remote server returned an error: (429) Calm down.

I'm not sure what's wrong here, because I've used this method to grab HTML from other websites and it works fine. Status 429 is usually returned when you send too many requests to a website, right?

  • Assuming you're saying you haven't made many requests (so the response is wrong), try adding the `User-Agent` of a common browser. The site might see that the request isn't from a browser (determined by missing or changed headers) and sends this to try and prevent scraping – ProgrammingLlama Jul 04 '18 at 05:28
  • I just googled the `UserAgent` property, and it seems like it only belongs to the `HttpRequest` and `HttpWebRequest` objects; I'm not sure how to use this property with `HttpClient`. – bobyang Jul 04 '18 at 05:39
  • See [here](https://stackoverflow.com/questions/44076962/how-do-i-set-a-default-user-agent-on-an-httpclient). I'm not saying this will solve your problem, but it's worth a try. – ProgrammingLlama Jul 04 '18 at 05:43
  • I've added the following code: `crownbet_httpClient.DefaultRequestHeaders.Add("User-Agent", "Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko");` But when I run this I get `WebException: The server committed a protocol violation. Section=ResponseHeader Detail=CR must be followed by LF` – bobyang Jul 04 '18 at 06:47
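
Putting the commenters' suggestion together, a minimal sketch might look like the following. The helper names (`CreateBrowserLikeClient`, `GetHtmlAsync`) and the User-Agent string are my own, and this may still not get past the site's bot detection; it only shows how to set a default `User-Agent` on `HttpClient` via `DefaultRequestHeaders`:

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

class Scraper
{
    // Hypothetical helper: give the client a browser-like User-Agent so the
    // server is less likely to treat the request as a bot. The UA string here
    // is only an example; any current mainstream browser string should do.
    public static HttpClient CreateBrowserLikeClient()
    {
        var client = new HttpClient();
        client.DefaultRequestHeaders.Add("User-Agent",
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 " +
            "(KHTML, like Gecko) Chrome/68.0 Safari/537.36");
        return client;
    }

    // Fetch the page HTML with the configured client.
    public static async Task<string> GetHtmlAsync(string url)
    {
        using (var client = CreateBrowserLikeClient())
        {
            return await client.GetStringAsync(url);
        }
    }
}
```

Note that `DefaultRequestHeaders.Add` applies the header to every request the client sends, which is why it is set once when the client is created rather than per request.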
