
My program has an established TCP connection when the Linux box loses its DHCP IP address lease. After that it tries to close the connection, so that when the DHCP server recovers it can re-establish the TCP connection. It uses SO_REUSEADDR. I did read this http://hea-www.harvard.edu/~fine/Tech/addrinuse.html but in this application address reuse is a requirement. I reproduce the problem by issuing ifconfig eth0 0.0.0.0

However, the result of close(sockfd) is unpredictable. Sometimes it closes the socket properly. Sometimes netstat -ant continuously shows tcp 0 0 192.168.1.119:54322 192.168.1.41:54321 (STATE), where (STATE) can be one of ESTABLISHED, FIN_WAIT1, or CLOSE_WAIT.

Originally my code had just close(). After reading multiple sources online, I tried some suggestions.

First I tried this (based on http://deepix.github.io/2016/10/21/tcprst.html)

    if (sockFd != -1) {
        linger lin;
        lin.l_onoff = 1;
        lin.l_linger = 0;
        if (setsockopt(sockFd, SOL_SOCKET, SO_LINGER, (const char *)&lin, sizeof(linger)) == -1) {
            std::cout << "Error setting socket opt SO_LINGER while trying to close " << std::endl;
        }
        close(sockFd);
    }

It did not help, so I tried this (based on the question "close() is not closing socket properly"):

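bool haveInput(int fd, double timeout) {
   int status;
   fd_set fds;
   struct timeval tv;

   while (1) {
      // select() may modify both the fd set and the timeout, so reset them on every retry
      FD_ZERO(&fds);
      FD_SET(fd, &fds);
      tv.tv_sec  = (long)timeout; // cast needed for C++
      tv.tv_usec = (long)((timeout - tv.tv_sec) * 1000000); // 'suseconds_t'

      if (!(status = select(fd + 1, &fds, 0, 0, &tv)))
         return false; // timed out, nothing to read
      else if (status > 0 && FD_ISSET(fd, &fds))
         return true;  // data (or end of stream) is available
      else if (status > 0)
         break;
      else if (errno != EINTR)
         break;        // real error; EINTR just retries
   }
   return false; // error or unexpected result: treat as no input
}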

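void myClose(int sockFd)
{
   if (sockFd != -1) {
      int err = 1;
      socklen_t len = sizeof err;
      // reading SO_ERROR clears any pending error on the socket
      getsockopt(sockFd, SOL_SOCKET, SO_ERROR, (char *)&err, &len);
      shutdown(sockFd, SHUT_WR);
      usleep(20000);
      char discard[99];
      // drain whatever the peer already sent, until end of stream or timeout
      while (haveInput(sockFd, 0.01))
         if (!read(sockFd, discard, sizeof discard))
            break;
      shutdown(sockFd, SHUT_RD);
      usleep(20000);
      close(sockFd);
      sockFd = -1; // only resets the local copy; the caller's descriptor variable is unchanged
   }
}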

As before, sometimes it closes the connection, and sometimes it does not. I understand that in this case the other side's FIN and ACK can never arrive, so a graceful close is simply not possible.

Is there a reliable way to completely close a TCP connection in such conditions?

Thank you

dilo
  • SO_LINGER/`close()` should certainly have helped, as it resets the connection, so you can't possibly have seen ESTABLISHED, FIN_WAIT_1, CLOSE_WAIT, etc. `shutdown()` adds no value to `close()` here; nor does reading until end of stream: if there was any data there a simple `close()` would already have reset the connection, which again you don't want to do. The problem is that if the interface has lost its IP address, it can't send or receive FIN or ACK, so the connection cannot be properly closed, so FIN_WAIT_1 is inevitable, but it should time out eventually. CLOSE_WAIT is an application bug. – user207421 Apr 02 '20 at 06:36
  • NB Sleeps are just literally a waste of time in networking code, and you don't need `select()` to implement a read timeout: `SO_RCVTIMEO` already does that far more simply. – user207421 Apr 02 '20 at 06:42
  • @user207421: I agree with everything you said. However, I verified that with SO_LINGER/close the connection disappears most of the time, but sometimes it still stays in the ESTABLISHED state for a very long time (I did not have enough patience to see for how long). – dilo Apr 04 '20 at 02:16
  • So `close()` must have returned an error. I hope your real code has real error-checking. – user207421 Apr 04 '20 at 05:48
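
The two suggestions in the comments above (a read timeout through `SO_RCVTIMEO` instead of `select()`, and checking what `setsockopt()` and `close()` actually return) could be sketched roughly as below. This is only an illustration under assumptions: the helper names `setReadTimeout` and `abortiveClose` are hypothetical and not from the original code, and the sketch does not claim to fix the lingering ESTABLISHED state, only to make any failure visible.

```cpp
#include <cerrno>
#include <cstring>
#include <iostream>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: make blocking reads on fd give up after `seconds`.
// A read that times out returns -1 with errno set to EAGAIN/EWOULDBLOCK.
bool setReadTimeout(int fd, double seconds) {
    timeval tv;
    tv.tv_sec  = static_cast<time_t>(seconds);
    tv.tv_usec = static_cast<suseconds_t>((seconds - tv.tv_sec) * 1000000);
    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) == -1) {
        std::cerr << "setsockopt(SO_RCVTIMEO): " << std::strerror(errno) << std::endl;
        return false;
    }
    return true;
}

// Hypothetical helper: abortive close (SO_LINGER on, zero timeout) that
// reports errno instead of discarding it. Returns true only if both the
// setsockopt() and the close() succeeded.
bool abortiveClose(int &fd) {
    if (fd == -1)
        return true; // nothing to close

    bool ok = true;
    linger lin;
    lin.l_onoff  = 1; // linger enabled ...
    lin.l_linger = 0; // ... with zero timeout: close() resets the connection
    if (setsockopt(fd, SOL_SOCKET, SO_LINGER, &lin, sizeof lin) == -1) {
        std::cerr << "setsockopt(SO_LINGER): " << std::strerror(errno) << std::endl;
        ok = false;
    }
    if (close(fd) == -1) {
        std::cerr << "close(): " << std::strerror(errno) << std::endl;
        ok = false;
    }
    fd = -1; // passed by reference, so the caller's descriptor is invalidated too
    return ok;
}
```

Taking the descriptor by reference is a design choice of this sketch: it lets the helper reset the caller's variable, which a by-value parameter cannot do. If `abortiveClose` returns false, the logged errno should show whether `close()` itself failed.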

0 Answers