How to reconnect to a socket gracefully

Question

I have the following method, which connects to an endpoint when my program starts:

ChannelSocket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
var remoteIpAddress = IPAddress.Parse(ChannelIp);
ChannelEndPoint = new IPEndPoint(remoteIpAddress, ChannelPort);
ChannelSocket.Connect(ChannelEndPoint);

I also have a timer set to trigger every 60 seconds and call CheckConnectivity, which attempts to send an arbitrary byte array to the endpoint to make sure the connection is still alive. If the send fails, it attempts to reconnect.

public bool CheckConnectivity(bool isReconnect)
{
    if (ChannelSocket != null)
    {
        var blockingState = ChannelSocket.Blocking;
        try
        {
            var tmp = new byte[] { 0 };
            ChannelSocket.Blocking = false;
            ChannelSocket.Send(tmp);
        }
        catch (SocketException e)
        {
            try
            {
                ReconnectChannel();
            }
            catch (Exception ex)
            {
                return false;
            }
        }
    }
    else
    {
        ConnectivityLog.Warn(string.Format("{0}:{1} is null!", ChannelIp, ChannelPort));
        return false;
    }

    return true;
} 

private void ReconnectChannel()
{
    try
    {
        ChannelSocket.Shutdown(SocketShutdown.Both);
        ChannelSocket.Disconnect(true);
        ChannelSocket.Close();
    }
    catch (Exception ex)
    {
        ConnectivityLog.Error(ex);
    }

    ChannelSocket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
    var remoteIpAddress = IPAddress.Parse(ChannelIp);
    ChannelEndPoint = new IPEndPoint(remoteIpAddress, ChannelPort);
    ChannelSocket.Connect(ChannelEndPoint);
    Thread.Sleep(1000);

    if (ChannelSocket.Connected)
    {
        ConnectivityLog.Info(string.Format("{0}:{1} is reconnected!", ChannelIp, ChannelPort));
    }
    else
    {
        ConnectivityLog.Warn(string.Format("{0}:{1} failed to reconnect!", ChannelIp, ChannelPort));
    }
}

To test the above, I physically unplug the LAN cable from my Ethernet device, let my code attempt to reconnect (which obviously fails), and then plug the LAN cable back in.

However, even after reconnecting the LAN cable (I am able to ping the device), ChannelSocket.Connect(ChannelEndPoint) in my ReconnectChannel method always throws this error:

No connection could be made because the target machine actively refused it 192.168.168.160:4001

If I were to restart my whole application, it connects successfully. How can I tweak my reconnect method such that I don't have to restart my application to reconnect back to my Ethernet device?

Finding it hard to read your code, what is ChannelSocket for example? Anyway, TCP/IP reconnection is a bit more complicated than what your programming is geared for. It would be best if you used some existing TCP/IP package that includes that service. Maybe WCF? I have done this in a home-made communications package for Windows server to Android client, and it involves about 1000 lines of C# and Java code, and I'm still not 100% satisfied with it. — RenniePet, Nov 28 '14 at 07:39
good question, that also happens in our application very rarely and we haven't implemented any solution yet. Hope we get an answer. — Adrian Nasui, Nov 28 '14 at 16:47
Once you get an exception on the `Socket`, I would not try any other operations on it (e.g. `Shutdown()`, `Disconnect()`). Just close it. Also, don't use any sort of "keep-alive" unless you absolutely have to. TCP is resilient to network interruptions, but only if they occur when not trying to send data (from either end). Mostly all that a "keep-alive" implementation does is to greatly increase the chances that an error will be detected; i.e. it has the net effect of significantly reducing reliability. — Peter Duniho, Nov 28 '14 at 23:27

Vikas Gupta · Accepted Answer · 2014-12-02T07:55:34.107

If an application closes a TCP/IP port, the protocol dictates that the port stays in the TIME_WAIT state for a certain duration (240 seconds by default on a Windows machine). See the following references:

http://en.wikipedia.org/wiki/Transmission_Control_Protocol

http://support.microsoft.com/kb/137984

http://www.pctools.com/guides/registry/detail/878/

What this means for your scenario is that you cannot expect to close (willingly or unwillingly) and re-open a port within a short period of time (even several seconds). Despite the registry tweaks you may find on the internet, the port will be unavailable to any app on Windows for a minimum of 30 seconds (again, the default is 240 seconds).

Your options here are limited:

1. From the documentation at http://msdn.microsoft.com/en-us/library/4xzx2d41(v=vs.110).aspx:

"If the socket has been previously disconnected, then you cannot use this (Connect) method to restore the connection. Use one of the asynchronous BeginConnect methods to reconnect. This is a limitation of the underlying provider."

The documentation suggests BeginConnect for the reason mentioned above: it does not expect the connection to be established right away. The only option is therefore to make the call asynchronously and, while you wait up to several minutes for the connection to be established, expect and plan for it to fail. This is likely not an ideal option.

2. If the long wait and uncertainty are not acceptable, your other option is to negotiate a different port between the client and server. For example, in theory you could use UDP, which is connectionless, to negotiate the new TCP port on which to re-establish the connection. UDP delivery is not guaranteed by design, but it should work most of the time, since networking in a typical organization today is not that flaky. Depending on the scenario, this is perhaps better than option 1, but it is more work and carries a small but nonzero chance of not working.

3. As suggested in one of the comments, this is where application-layer protocols like HTTP and HTTP services have an advantage. Use them instead of low-level sockets if you can. If acceptable, this is the best option to go with.
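For the BeginConnect route described above, a minimal sketch of a connect attempt with a bounded wait could look like the following. The class name ConnectSketch, the helper TryConnect, and its timeout parameter are made up for illustration; only the Socket calls themselves come from the framework. It assumes that a fresh socket is created for every attempt and that giving up after the timeout is acceptable.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

static class ConnectSketch
{
    // Hypothetical helper: tries to connect a brand-new socket and waits
    // at most `timeout`. Returns the connected socket, or null if the
    // attempt was refused, failed, or did not finish in time.
    public static Socket TryConnect(IPEndPoint endPoint, TimeSpan timeout)
    {
        var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        try
        {
            // Start the asynchronous connect without a callback.
            IAsyncResult pending = socket.BeginConnect(endPoint, null, null);

            // Block this thread only for the allowed time.
            if (!pending.AsyncWaitHandle.WaitOne(timeout))
            {
                socket.Close(); // abandons the pending connect
                return null;
            }

            // Completes the connect; throws SocketException if it failed.
            socket.EndConnect(pending);
            return socket;
        }
        catch (SocketException)
        {
            socket.Close();
            return null;
        }
    }
}
```

A caller would invoke something like ConnectSketch.TryConnect(new IPEndPoint(IPAddress.Parse(ChannelIp), ChannelPort), TimeSpan.FromSeconds(5)) from a timer and simply try again on the next tick when null comes back.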

(PS, FYI: for HTTP there is a lot of special handling built into operating systems, including Windows. For example, there is a dedicated driver, Http.sys, for dealing with multiple apps trying to listen on the same port 80, etc. The details are a topic for another time. The point is that a lot of hard work has already been done for you when it comes to HTTP.)

How would I use a http service to connect to my device that listens only on say.. port 4000 for example? Are there any examples? — Null Reference, Dec 03 '14 at 01:29
if device listens only on port 4000, then http is not an option.. In fact, if you do not control the server / service, then I am not sure what you can do about it. because if port 4000 is already taken, or gets closed for some reason.. you can only hope that it becomes available again at a later point in time as server keeps trying to open that specific port for listening. — Vikas Gupta, Dec 03 '14 at 01:51
Yes, I have no control over my device. Looks like option 3 is out — Null Reference, Dec 03 '14 at 02:04
Just to ask, if there's a wait out period of 240 seconds, why does restarting my application immediately reconnects back to my device? — Null Reference, Dec 03 '14 at 14:16
I would recommend using a network monitoring tool like (If on Windows) `ProcMon` to confirm what's going on at the lower level.. It will be even more helpful if you could share the traces from these tools, in your question. FYI, this is a Microsoft tool, and be sure to filter the capture for Networking activity.. Take traces for both scenarios.. i.e. trying to reconnect, as well as connection on fresh start. — Vikas Gupta, Dec 03 '14 at 16:40
@VikasGupta I think you need to clarify which port cannot be used within 240 seconds. I believe this is the client port. If, as the OP says, he needs to connect to port 4000, that is port 4000 on the server, not on his client connection, so the 240-second delay would not apply in this case. Also, if you close the application and relaunch it, it connects, which suggests that Windows is possibly only waiting 240 seconds on the application process and port combination. — Charles Okwuagwu, Sep 28 '15 at 09:14

score 5 · Answer 2 · answered Dec 02 '14 at 08:28

Maybe you should switch to a higher abstraction class, which better deals with all these nifty little details?

For these network connections I use the TcpListener and TcpClient classes. Using them is quite easy:

The client side:

public void GetInformationAsync(IPAddress ipAddress)
{
    _Log.Info("Start retrieving information from address " + ipAddress + ".");
    var tcpClient = new TcpClient();
    tcpClient.BeginConnect(ipAddress, _PortNumber, OnTcpClientConnected, tcpClient);
}

private void OnTcpClientConnected(IAsyncResult asyncResult)
{
    try
    {
        using (var tcpClient = (TcpClient)asyncResult.AsyncState)
        {
            tcpClient.EndConnect(asyncResult);
            var ipAddress = ((IPEndPoint)tcpClient.Client.RemoteEndPoint).Address;
            var stream = tcpClient.GetStream();
            stream.ReadTimeout = 5000;
            _Log.Debug("Connection established to " + ipAddress + ".");

            var formatter = new BinaryFormatter();
            var information = (MyInformation)formatter.Deserialize(stream);

            _Log.Info("Successfully retrieved information from address " + ipAddress + ".");
            InformationAvailable.FireEvent(this, new InformationEventArgs(information));
        }
    }
    catch (Exception ex)
    {
        _Log.Error("Error in retrieving information.", ex);
        return;
    }
}

The server side:

public void Start()
{
    ThrowIfDisposed();

    if (_TcpServer != null)
        _TcpServer.Stop();

    _TcpServer = new TcpListener(IPAddress.Any, _PortNumber);
    _TcpServer.Start();

    _TcpServer.BeginAcceptTcpClient(OnClientConnected, _TcpServer);
    _Log.Info("Start listening for incoming connections on " + _TcpServer.LocalEndpoint + ".");
}

private void OnClientConnected(IAsyncResult asyncResult)
{
    var tcpServer = (TcpListener)asyncResult.AsyncState;
    IPAddress address = IPAddress.None;

    try
    {
        if (tcpServer.Server != null
            && tcpServer.Server.IsBound)
            tcpServer.BeginAcceptTcpClient(OnClientConnected, tcpServer);

        using (var client = tcpServer.EndAcceptTcpClient(asyncResult))
        {
            address = ((IPEndPoint)client.Client.RemoteEndPoint).Address;
            _Log.Debug("Client connected from address " + address + ".");

            var formatter = new BinaryFormatter();
            var informations = new MyInformation()
            {
                // Initialize properties with desired values.
            };

            var stream = client.GetStream();
            // Serialize the object created above.
            formatter.Serialize(stream, informations);

            _Log.Debug("Successfully serialized information into network stream.");
        }
    }
    catch (ObjectDisposedException)
    {
        // This normally happens when the server is stopped,
        // and there is no other reliable way to check this state
        // before calling EndAcceptTcpClient().
    }
    catch (Exception ex)
    {
        _Log.Error(String.Format("Cannot send instance information to {0}.", address), ex);
    }
}

This code works and has no problems with a lost connection on the client side. If the connection is lost on the server side, you have to re-establish the listener, but that's another story.
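If the client should keep retrying after a lost connection, one possible shape is a loop that creates a fresh TcpClient for every attempt, since a closed TcpClient cannot be connected again. This is only a sketch: TcpClientRetrySketch, ConnectWithRetryAsync, maxAttempts, and delay are names invented for illustration, and it assumes .NET 4.5 or later for ConnectAsync.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

static class TcpClientRetrySketch
{
    // Hypothetical helper: makes up to `maxAttempts` connection attempts,
    // pausing `delay` between them. Returns a connected TcpClient,
    // or null if every attempt failed.
    public static async Task<TcpClient> ConnectWithRetryAsync(
        IPAddress address, int port, int maxAttempts, TimeSpan delay)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            // A new client per attempt; a failed one is never reused.
            var client = new TcpClient();
            try
            {
                await client.ConnectAsync(address, port);
                return client;
            }
            catch (SocketException)
            {
                client.Close();
            }

            // Wait before the next attempt, but not after the last one.
            if (attempt < maxAttempts)
                await Task.Delay(delay);
        }
        return null;
    }
}
```

The caller owns the returned client and should dispose it when done, for example with a using block as in the client code above.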

I could try this out. However, the "server" side is an Ethernet device that I have no control over. My program is a client that connects to the device to send/receive data. — Null Reference, Dec 03 '14 at 01:31
@NullReferenceException: I just added the server side to provide a *complete* example. You can also use it to test whether the problem is really on your side or whether the device is behaving strangely. — Oliver, Dec 03 '14 at 07:18

score 3 · Answer 3 · edited Nov 29 '17 at 06:39

In ReconnectChannel just dispose the ChannelSocket object.

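try
{
    //ChannelSocket.Shutdown(SocketShutdown.Both);
    //ChannelSocket.Disconnect(true);
    //ChannelSocket.Close();
    ChannelSocket.Dispose();
}
catch (Exception ex)
{
    // Same handler as in the question's ReconnectChannel.
    ConnectivityLog.Error(ex);
}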

This is working for me. Let me know if it doesn't work for you.
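Put together, a reconnect that only disposes the old socket and then builds a new one might look like the sketch below. The ChannelSketch class and its member names are hypothetical stand-ins for the fields in the question, and each call makes a single attempt, so a timer like the one in the question would drive the retries.

```csharp
using System.Net;
using System.Net.Sockets;

class ChannelSketch
{
    private readonly IPEndPoint _endPoint;
    private Socket _socket;

    public ChannelSketch(string ip, int port)
    {
        _endPoint = new IPEndPoint(IPAddress.Parse(ip), port);
    }

    // Disposes whatever socket is currently held and tries once to
    // connect a new one. Returns true if the new socket connected.
    public bool Reconnect()
    {
        if (_socket != null)
        {
            // No Shutdown/Disconnect on a socket that already failed.
            _socket.Dispose();
            _socket = null;
        }

        var fresh = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        try
        {
            fresh.Connect(_endPoint);
            _socket = fresh;
            return true;
        }
        catch (SocketException)
        {
            // Do not keep a socket whose connect failed.
            fresh.Dispose();
            return false;
        }
    }
}
```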

answered Nov 29 '17 at 04:41 by Venkata Chary Bhairoju · edited Nov 29 '17 at 06:39 by AbhayBohra
