
On a local gigabit network, I have an application with a single TCP server and many clients. Each client pings the server every 30 seconds by opening a TCP connection, sending a status message, and closing the connection.

The server is set up using SocketAsyncEventArgs, very similar to the example shown HERE (omitted for brevity).

The clients initiate the connection using a TcpClient.

Relevant section of client code:

using (TcpClient client = new TcpClient())
{
    // Start the connection attempt asynchronously so it can be bounded by a timeout.
    IAsyncResult ar = client.BeginConnect(address, port, null, null);
    if (!ar.AsyncWaitHandle.WaitOne(timeout))
    {
        throw new ApplicationException("Timed out waiting for connection to " + address);
    }
    client.EndConnect(ar); // exception thrown 5%-10% of the time

    //...send message and receive response...
}

Everything works fine, except that on some machines EndConnect throws an exception on roughly 5%-10% of connection attempts.

The exception is a WSAEHOSTUNREACH (10065):

System.Net.Sockets.SocketException (0x80004005): A socket operation was attempted to an unreachable host 192.168.XXX.XXX:XXXX
at System.Net.Sockets.Socket.EndConnect(IAsyncResult asyncResult)
at System.Net.Sockets.TcpClient.EndConnect(IAsyncResult asyncResult)
  • The issue is definitely not congestion: it happens even when only one client is running, and at hours when network traffic is minimal.
  • I can see that EndConnect is called very shortly after BeginConnect, so almost no time is spent inside ar.AsyncWaitHandle.WaitOne.
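
The two observations above (which error is raised, and how soon after BeginConnect it arrives) could be recorded per attempt with something like the following. This is only a minimal sketch: ConnectDiagnostics, Describe, and TryConnect are hypothetical names introduced for illustration, and the log format is an assumption, not part of the original client.

```csharp
using System;
using System.Diagnostics;
using System.Net.Sockets;

static class ConnectDiagnostics
{
    // Hypothetical helper: formats the details worth logging for a failed connect.
    public static string Describe(SocketException ex, TimeSpan elapsed)
    {
        return string.Format(
            "connect failed after {0} ms: {1} (native error {2})",
            (long)elapsed.TotalMilliseconds, ex.SocketErrorCode, ex.NativeErrorCode);
    }

    // Hypothetical helper: attempts one connection and returns null on success,
    // or a log line describing the failure.
    public static string TryConnect(string address, int port, TimeSpan timeout)
    {
        Stopwatch watch = Stopwatch.StartNew();
        using (TcpClient client = new TcpClient())
        {
            IAsyncResult ar = client.BeginConnect(address, port, null, null);
            if (!ar.AsyncWaitHandle.WaitOne(timeout))
            {
                return "connect timed out after " + watch.ElapsedMilliseconds + " ms";
            }
            try
            {
                client.EndConnect(ar);
                return null; // connected successfully
            }
            catch (SocketException ex)
            {
                // SocketError.HostUnreachable corresponds to WSAEHOSTUNREACH (10065).
                return Describe(ex, watch.Elapsed);
            }
        }
    }
}
```

Logging a timestamp next to each returned line would make it possible to correlate failures with other events on the client machine, such as power state changes.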

My question is: how can I debug this type of error? The server is definitely up when the exception occurs.

Rotem
  • And the suggestions given by your favorite search engine did not turn out to be useful? There does not seem to be an unambiguous root cause for this, but there are lots of things to try. – usr Feb 18 '14 at 11:15
  • I am not one to ask an SO question before several hours of self searching. I did not find any useful information. If you did, I'd appreciate the links. – Rotem Feb 18 '14 at 11:25

1 Answer


The problem seems to have been related to Windows sleep mode. When the machine was asleep, it would occasionally generate these exceptions.

Disabling sleep mode using SetThreadExecutionState as outlined here seems to have taken care of the issue.
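
For reference, a minimal sketch of calling SetThreadExecutionState through P/Invoke is shown below. The SleepBlocker class and its method names are hypothetical wrappers introduced here; the flag values are the documented ES_SYSTEM_REQUIRED and ES_CONTINUOUS constants. This is not necessarily the exact code from the linked post.

```csharp
using System;
using System.Runtime.InteropServices;

static class SleepBlocker
{
    [Flags]
    private enum ExecutionState : uint
    {
        SystemRequired = 0x00000001, // ES_SYSTEM_REQUIRED
        Continuous = 0x80000000      // ES_CONTINUOUS
    }

    // Windows API; returns the previous execution state, or 0 on failure.
    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern ExecutionState SetThreadExecutionState(ExecutionState esFlags);

    // Asks Windows to keep the system awake until AllowSleep is called.
    public static bool PreventSleep()
    {
        return SetThreadExecutionState(
            ExecutionState.Continuous | ExecutionState.SystemRequired) != 0;
    }

    // Clears the request so the normal idle sleep timer applies again.
    public static bool AllowSleep()
    {
        return SetThreadExecutionState(ExecutionState.Continuous) != 0;
    }
}
```

The execution state is tracked per thread, so PreventSleep should be called from a thread that stays alive for as long as the client needs to keep pinging. It only suppresses idle-timer sleep; it does not stop a user from putting the machine to sleep manually.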

Still, I am not sure why I was getting SocketExceptions in this case. I could understand it if the timer didn't fire at all, but I am not sure why the connection would fail.

Rotem