
I have about 5,000 URLs that I want to fetch via HttpClient, and I am looking for the fastest way to do this.

I found the following code here:

var client = new HttpClient();

//Start with a list of URLs
var urls = new string[]
    {
        "http://www.google.com",
        "http://www.bing.com"
    };

//Start requests for all of them
var requests = urls.Select
    (
        url => client.GetAsync(url)
    ).ToList();

//Wait for all the requests to finish
await Task.WhenAll(requests);

//Get the responses
var responses = requests.Select
    (
        task => task.Result
    );

foreach (var r in responses)
{
    // Extract the message body
    var s = await r.Content.ReadAsStringAsync();
    Console.WriteLine(s);
}

Is this method suitable? Is there a faster and better way?
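
For comparison, a throttled variant of the same idea would cap how many requests are in flight at once, for example with SemaphoreSlim. This is only a sketch; the limit of 20 is an arbitrary illustrative value, not a recommendation from any source.

using System;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

var client = new HttpClient();

var urls = new[]
{
    "http://www.google.com",
    "http://www.bing.com"
};

// Allow at most 20 requests in flight at any time (illustrative value)
using var gate = new SemaphoreSlim(20);

var tasks = urls.Select(async url =>
{
    await gate.WaitAsync();
    try
    {
        using var response = await client.GetAsync(url);
        return await response.Content.ReadAsStringAsync();
    }
    finally
    {
        gate.Release();
    }
}).ToList();

// All downloads still run concurrently, but never more than 20 at once
var bodies = await Task.WhenAll(tasks);

foreach (var body in bodies)
{
    Console.WriteLine(body);
}

On .NET 6 and later, Parallel.ForEachAsync with ParallelOptions.MaxDegreeOfParallelism expresses the same throttling more directly.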

    "is it suitable?" it seems to do what it's supposed to. "is there a faster way?" hard to say - the important question: is it _fast enough_ for you? "is there a better way?" - completely depends what, for you, defines "better". – Franz Gleichmann Jun 11 '21 at 07:17
  • Outside the suboptimal usage of async, if this worked in the first place, you do realize, you'd start 5000 requests (nearly) _at once_? So you are basically flooding the network. – Fildor Jun 11 '21 at 07:17
  • @FranzGleichmann In general, I mean should I use parallel? – git test Jun 11 '21 at 07:20
  • Mixing Parallel with async is not a good idea. Maybe have a look into [DataFlow](https://learn.microsoft.com/en-us/dotnet/standard/parallel-programming/dataflow-task-parallel-library). – Fildor Jun 11 '21 at 07:25
  • Maybe also have a look at [Do not block on async code](https://blog.stephencleary.com/2012/07/dont-block-on-async-code.html) – Fildor Jun 11 '21 at 07:30
  • It is not that simple. Some websites require HTTPS headers, parameters on the URL (like an id), and can require a different User-Agent header (browser type). – jdweng Jun 11 '21 at 08:53
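
The Dataflow approach that Fildor links in the comments above could look roughly like the sketch below. It assumes the System.Threading.Tasks.Dataflow NuGet package is referenced; the MaxDegreeOfParallelism of 20 is again only an illustrative value.

using System;
using System.Net.Http;
using System.Threading.Tasks.Dataflow;

var client = new HttpClient();

// Download and print each URL, processing at most 20 at a time
var downloader = new ActionBlock<string>(async url =>
{
    using var response = await client.GetAsync(url);
    var body = await response.Content.ReadAsStringAsync();
    Console.WriteLine($"{url}: {body.Length} characters");
},
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 20 });

var urls = new[]
{
    "http://www.google.com",
    "http://www.bing.com"
};

foreach (var url in urls)
{
    downloader.Post(url);
}

downloader.Complete();       // signal that no more URLs will be posted
await downloader.Completion; // wait for all posted URLs to finish

Unlike Parallel.ForEach, the block awaits the async delegate properly, which is the point of the "do not block on async code" advice in the comments.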

0 Answers