0

Using this post, I wrote code that checks, for example, 200 proxies, with a socket timeout of 2 sec. Everything works, but Code #1 takes more than 2 minutes to check 200 proxies even though each check is limited to a 2 sec timeout. Code #2 takes 2 sec to check 200 proxies, and it would also take 2 sec to check 1000 proxies.

Code #1 uses the ThreadPool. Code #2 opens proxyCount sockets, sleeps for 2 sec, and then checks which connections succeeded. It takes exactly 2 sec.

So where is the problem in Code #1? Why is a ThreadPool with a minimum of 20 threads so much slower than doing it without threads?

Code #1

int proxyCount = 200;  
CountdownEvent cde = new CountdownEvent(proxyCount);     
private void RefreshProxyIPs(object obj)
{     
    int workerThreads, ioThreads;
    ThreadPool.GetMinThreads(out workerThreads, out ioThreads);
    ThreadPool.SetMinThreads(20, ioThreads);

    var proxies = GetServersIPs(proxyCount);
    watch.Start();
    for (int i = 0; i < proxyCount; i++)
    {
        var proxy = proxies[i];
        ThreadPool.QueueUserWorkItem(CheckProxy, new IPEndPoint(IPAddress.Parse(proxy.IpAddress), proxy.Port));
    }
    cde.Wait();
    cde.Dispose();
    watch.Stop();
}

private List<IPEndPoint> list = new List<IPEndPoint>();
private void CheckProxy(object o)
{
     var proxy = o as IPEndPoint;
     using (var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp))
     {
         var asyncResult = socket.BeginConnect(proxy.Address, proxy.Port, null, null);
         if (asyncResult.AsyncWaitHandle.WaitOne(2000))
         {
             try
             {
                 socket.EndConnect(asyncResult);
             }
             catch (SocketException)
             {
             }
             catch (ObjectDisposedException)
             {
             }
         }
         if (socket.Connected)
         {
             // List<T> is not thread-safe and CheckProxy runs on several pool threads at once
             lock (list)
             {
                 list.Add(proxy);
             }
             socket.Close();
         }
     }
     cde.Signal();
}

Code #2

int proxyCount = 200;
var sockets = new Socket[proxyCount];
var socketsResults = new IAsyncResult[proxyCount];
var proxies = GetServersIPs(proxyCount);
for (int i = 0; i < proxyCount; i++)
{
      var proxy = proxies[i];
      sockets[i] = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
      socketsResults[i] = sockets[i].BeginConnect(IPAddress.Parse(proxy.IpAddress), proxy.Port, null, proxy);             
}
Thread.Sleep(2000);
for (int i = 0; i < proxyCount; i++)
{
     var success = false;
     try
     {
         if (socketsResults[i].IsCompleted)
         {
              sockets[i].EndConnect(socketsResults[i]);
              success = sockets[i].Connected;
              sockets[i].Close();
         }

         sockets[i].Dispose();
     }
     catch { }

     var proxy = socketsResults[i].AsyncState as Proxy;
     if (success) {  _validProxies.Add(proxy); }
}
theateist

2 Answers

1

The thread pool threads you start are poor candidates for the thread pool. They don't perform any real work; they just block on the WaitOne() call. So 20 of them start executing right away and don't complete for 2 seconds. The thread pool scheduler only allows another thread to start when one of them completes or when none of them completes within 0.5 seconds; in the latter case it allows one extra thread to run. So it takes a while before all the requests are completed.

You could fix it by calling SetMinThreads() and setting the minimum to 200, but that's incredibly wasteful of system resources. You might as well call Socket.BeginConnect() 200 times and find out what happened 2 seconds later, which is your fast version.
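
The "start every connect, then look at the results after the timeout" idea can also be sketched with the Task-based socket API instead of BeginConnect() plus Thread.Sleep(). This is only a minimal sketch under assumptions: it requires a runtime where Socket.ConnectAsync(IPAddress, int) is available (it is newer than the code in the question), and the names ProxyChecker, CanConnectAsync and CheckAllAsync are hypothetical, not taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

// Hypothetical helper: checks many endpoints concurrently without
// blocking a thread pool thread per connection attempt.
static class ProxyChecker
{
    static async Task<bool> CanConnectAsync(IPEndPoint endpoint, TimeSpan timeout)
    {
        using (var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp))
        {
            Task connect = socket.ConnectAsync(endpoint.Address, endpoint.Port);
            Task finished = await Task.WhenAny(connect, Task.Delay(timeout));
            if (finished != connect)
            {
                // Timed out. Disposing the socket aborts the pending connect,
                // so observe the resulting exception to keep it from going unobserved.
                Task observed = connect.ContinueWith(
                    t => GC.KeepAlive(t.Exception),
                    TaskContinuationOptions.OnlyOnFaulted);
                return false;
            }
            try
            {
                await connect; // rethrows a connection failure, if any
                return socket.Connected;
            }
            catch (SocketException)
            {
                return false;
            }
        }
    }

    // Starts all connection attempts at once and returns the endpoints that answered in time.
    public static async Task<List<IPEndPoint>> CheckAllAsync(IEnumerable<IPEndPoint> endpoints, TimeSpan timeout)
    {
        List<IPEndPoint> all = endpoints.ToList();
        bool[] ok = await Task.WhenAll(all.Select(e => CanConnectAsync(e, timeout)));
        return all.Where((e, i) => ok[i]).ToList();
    }
}
```

Under these assumptions a call such as `await ProxyChecker.CheckAllAsync(endpoints, TimeSpan.FromSeconds(2))` should finish in roughly the timeout regardless of how many endpoints are passed, because no thread waits on an individual connection.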

Hans Passant
  • You are totally right, but it is strange that **Code #1** takes more than 2 minutes to run. 20-30 seconds would be understandable, but 2 minutes? Do you have any idea why? – ie. Jun 12 '12 at 14:32
  • It is at least 1.5 minutes, 180 x 0.5 seconds. A watched kettle takes longer to boil. – Hans Passant Jun 12 '12 at 14:39
  • 20 threads at the same time, 2 seconds each, so my calculation gives 200 (proxies) / 20 (threads) * 2 (seconds each) = 20 seconds – ie. Jun 12 '12 at 14:40
  • You are missing what I tried to explain: the TP scheduler only allows an *extra* thread to start when the running ones don't complete within 1/2 a second. – Hans Passant Jun 12 '12 at 14:45
  • @Hans, how did you calculate it? What does the number 180 mean? – theateist Jun 12 '12 at 14:45
  • @HansPassant that is strange; where can I read about it? If I replace the `CheckProxy` method with `private void CheckProxy(object o) { Thread.Sleep(100); cde.Signal(); }`, the code runs in about 1 second; if I change the sleep time to 2000, it runs in 20 seconds. – ie. Jun 12 '12 at 14:52
  • @HansPassant, what does the number 180 mean? – theateist Jun 13 '12 at 10:07
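
The two estimates traded in the comments above can be written out as plain arithmetic. This is only a back-of-the-envelope sketch: it assumes that 180 is the 200 queued items minus the 20 threads that start immediately, and that one extra thread is added every 0.5 seconds as the answer describes. The class and method names are hypothetical, and the real scheduler may behave differently.

```csharp
using System;

// Hypothetical helper for comparing the two estimates from the comments.
static class PoolEstimate
{
    // Estimate from the comments by "ie.": items run in batches of `threads`, each batch taking secondsPerItem.
    public static double BatchEstimateSeconds(int items, int threads, double secondsPerItem)
    {
        return (double)items / threads * secondsPerItem;
    }

    // Assumed reading of "180 x 0.5": every item beyond the minimum
    // thread count waits for one more thread injection.
    public static double InjectionEstimateSeconds(int items, int minThreads, double injectionDelaySeconds)
    {
        return (items - minThreads) * injectionDelaySeconds;
    }

    static void Main()
    {
        Console.WriteLine(BatchEstimateSeconds(200, 20, 2.0));      // prints 20
        Console.WriteLine(InjectionEstimateSeconds(200, 20, 0.5));  // prints 90 (1.5 minutes)
    }
}
```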
0

It looks like in the first example you're waiting for each proxy connection to either complete or time out after 2 seconds, whichever comes first. Plus, you're queuing up 200 separate work requests. Your thread pool size is probably going to be far smaller than that; check it with GetMaxThreads. Only that many work requests will run concurrently, and each subsequent request has to wait for a previous item to finish or time out.
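
A minimal sketch of how the pool limits could be inspected, using ThreadPool.GetMinThreads() alongside the GetMaxThreads() call mentioned above. The class name PoolLimits and its Describe method are hypothetical. The values printed depend on the machine and runtime, so no expected output is shown.

```csharp
using System;
using System.Threading;

// Hypothetical helper that reports the current thread pool limits.
static class PoolLimits
{
    public static string Describe()
    {
        int minWorkers, minIo, maxWorkers, maxIo;
        ThreadPool.GetMinThreads(out minWorkers, out minIo);
        ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);
        return string.Format(
            "min worker: {0}, min I/O: {1}, max worker: {2}, max I/O: {3}",
            minWorkers, minIo, maxWorkers, maxIo);
    }

    static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

Printing the minimum next to the maximum makes it easy to see which of the two limits is closer to the 20 threads that Code #1 requests with SetMinThreads().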

Tim Coker