20

I wrote some code to check URLs, but it runs very slowly. I want it to process several URLs at the same time, for example 10, or at least run as fast as possible.

My code:

Parallel.ForEach(urls, new ParallelOptions {
  MaxDegreeOfParallelism = 10
}, s => {
  try {
    using(HttpRequest httpRequest = new HttpRequest()) {
      httpRequest.UserAgent = "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0";
      httpRequest.Cookies = new CookieDictionary(false);
      httpRequest.ConnectTimeout = 10000;
      httpRequest.ReadWriteTimeout = 10000;
      httpRequest.KeepAlive = true;
      httpRequest.IgnoreProtocolErrors = true;
      string check = httpRequest.Get(s + "'", null).ToString();
      if (errors.Any(new Func<string, bool>(check.Contains))) {
        Valid.Add(s);
        Console.WriteLine(s);
        File.WriteAllLines(Environment.CurrentDirectory + "/Good.txt", Valid);
      }
    }
  } catch {

  }
});
Pietro Nadalini
Ariel
  • Also, just as an aside, some web servers may not process your requests in parallel (because it might look like a DoS attack, or because the server limits the number of connections per IP). Just because you make 10 requests in parallel doesn't mean the web server will return data to you in parallel. It may still respond as if you had sent them one by one: send request -> receive response -> send -> receive, and so on. – KSib Sep 14 '18 at 19:32
  • MaxDegreeOfParallelism considers your machine's processing power and not the number of records in the collection. If you have a dual-core machine, it will process 2 records in parallel. On the other hand, browsers can send more requests in parallel, but unfortunately not the server. – Rohit Ramname Sep 14 '18 at 19:35
  • So how can I make it faster? How do other tools do it so quickly, even tools that do the same thing as mine? – Ariel Sep 14 '18 at 19:40
  • @RohitRamname How much processing power will be consumed by **waiting** for 10 HTTP GET? Let me guess: zero? – Sir Rufo Sep 14 '18 at 19:41
  • @SirRufo, I guess so too. I also could use the solution for this issue. – Rohit Ramname Sep 14 '18 at 20:01
  • So... what can I do to make it work faster? – Ariel Sep 14 '18 at 20:12

2 Answers

41

It is unlikely that your service calls are CPU-bound, so spinning up more threads to handle the load is probably not the best approach. You will get better throughput with async and await, if you can use them, together with the more modern HttpClient instead of HttpRequest or HttpWebRequest.

Here is an example of how to do it:

var client = new HttpClient();

//Start with a list of URLs
var urls = new string[]
    {
        "http://www.google.com",
        "http://www.bing.com"
    };

//Start requests for all of them
var requests  = urls.Select
    (
        url => client.GetAsync(url)
    ).ToList();

//Wait for all the requests to finish
await Task.WhenAll(requests);

//Get the responses
var responses = requests.Select
    (
        task => task.Result
    );

foreach (var r in responses)
{
    // Extract the message body
    var s = await r.Content.ReadAsStringAsync();
    Console.WriteLine(s);
}
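
The example above starts every request at once. If you want a fixed cap instead, such as the 10 concurrent URLs mentioned in the question, one possible approach is to gate the requests with a SemaphoreSlim. The following is only a minimal sketch under some assumptions: `ThrottledFetcher`, `FetchAllAsync` and `maxConcurrency` are hypothetical names that do not appear in the question, a failed request is reported as a null body rather than rethrown, and the tuple syntax needs C# 7 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of the original code.
public static class ThrottledFetcher
{
    // Downloads every URL with at most maxConcurrency requests in flight.
    // A request that fails or times out yields a null Body instead of throwing.
    public static async Task<(string Url, string Body)[]> FetchAllAsync(
        HttpClient client, IEnumerable<string> urls, int maxConcurrency = 10)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = urls.Select(async url =>
            {
                // Wait (without blocking a thread) until a slot is free
                await gate.WaitAsync();
                try
                {
                    using (var response = await client.GetAsync(url))
                    {
                        var body = await response.Content.ReadAsStringAsync();
                        return (Url: url, Body: body);
                    }
                }
                catch (Exception ex) when (ex is HttpRequestException || ex is TaskCanceledException)
                {
                    // Network error or timeout: record the failure and carry on
                    return (Url: url, Body: (string)null);
                }
                finally
                {
                    // Free the slot so the next queued request can start
                    gate.Release();
                }
            }).ToList();

            // Results come back in the same order as the input URLs
            return await Task.WhenAll(tasks);
        }
    }
}
```

It could be called as `var results = await ThrottledFetcher.FetchAllAsync(client, urls, 10);`, after which you would test each non-null `Body` against your error strings. Because the lambda awaits rather than blocks, the cap limits in-flight requests without tying up one thread per URL.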
John M
John Wu
  • Thanks, I will try it – Ariel Sep 14 '18 at 20:49
  • Will this work with POST requests too? If so, could you please let me know how? Thanks. – user5381191 Dec 16 '19 at 18:17
  • For some reason I kept running into a "request message was already sent, cannot re-send" error. Instead of `string[]` I created a `List<HttpRequestMessage>` and then used `var requests = httpRequestMessages.Select(h => Client.SendAsync(h))` – user5381191 Dec 16 '19 at 19:24
  • Thank you for the answer @john. I am getting a socket exception with this approach: `System.IO.IOException: Unable to read data from the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.` Any idea? – Hamza Khanzada Apr 16 '20 at 13:30
  • @HamzaKhanzada Sounds like a network problem, or perhaps an issue with the service; most likely it is not related to the client code. – John Wu Apr 16 '20 at 17:06
  • Yes, but how can I handle this in client code if the network or the target server doesn't respond? – Hamza Khanzada Apr 17 '20 at 05:04
  • @HamzaKhanzada I am not sure I understand the question. As with any exception, you `catch` it and handle it in whatever way is dictated by your requirements. – John Wu Apr 17 '20 at 06:12
  • Will this approach work with a POST request? I want to send multiple files to the API in a single request. I implemented this approach with a POST, but my API is not getting the request. – Michael Brown Dec 21 '20 at 20:36
  • @MichaelBrown This question is about sending multiple requests, not sending a single request with multiple files (which is considerably different). I suggest you ask a separate question. – John Wu Dec 21 '20 at 23:20
-1

Try the following:

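Parallel.ForEach(urls, new ParallelOptions { MaxDegreeOfParallelism = Math.Max(1, Environment.ProcessorCount - 1) },
    url => { /* check the URL here; Math.Max avoids an invalid value of 0 on a single-core machine */ });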

This uses every core but one, leaving one free so that the rest of your machine stays responsive.

Also, consider @KSib's comment.

Rohit Ramname