On a Debian bullseye system, which probably carries somewhat more network load than my previous configurations, I send a constant status signal to a web relay to keep it powered. The connect call below (in pseudo-code) frequently fails with EINPROGRESS:

while (1) {
    webrelay->socket = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    fcntl(webrelay->socket, F_SETFL, flags | O_NONBLOCK);
    setsockopt(webrelay->socket, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
    if (connect(webrelay->socket, (const struct sockaddr*)&relay_socket, sizeof(relay_socket)) < 0) {
        switch (errno) {
        case EINPROGRESS:
            while (1) {
                printf("connection slow \n");
                fds[0].fd = webrelay->socket;
                fds[0].events = POLLIN | POLLOUT;
                fds[0].revents = 0;
                if (poll(fds, sizeof(fds) / sizeof(fds[0]), 1) < 0) {
                    log_error("poll connect fail for webrelay %i\n", webrelay->webrelay_id);
                    goto close_socket;
                }
                if (fds[0].revents & (POLLIN | POLLOUT))
                    break;
            }
        default:
            goto close_socket;
        }
    }
    send(webrelay->socket, state_xml, sizeof(state_xml), MSG_DONTWAIT);
    recv(webrelay->socket, state_xml_reply, sizeof(state_xml_reply), MSG_DONTWAIT);
    shutdown(webrelay->socket, SHUT_RDWR);
    close(webrelay->socket);
    webrelay->socket = 0;
    usleep(1000);
    continue;
close_socket:
    close(webrelay->socket);
    webrelay->socket = 0;
}

What could be the issue? I increased the UDP and TCP buffers (net.core.wmem_max, net.core.wmem_default) to 26214400, set net.ipv4.tcp_low_latency to 1, and set net.ipv4.tcp_congestion_control to cubic, but that made no difference, and my web relay keeps powering off at random because of this EINPROGRESS error.

There is general background on EINPROGRESS available, but it does not explain how to minimize this error, which I did not previously have to worry about.

EDIT: setting net.ipv4.tcp_low_latency to 1, as noted above, seems to have improved things, but the error still persists.

Cheetaiean
    Yes I added the close_socket label. There is no need to retry the connect as the EINPROGRESS error means it is in progress, hence only the while loop which polls the connection to see if it has succeeded yet. – Cheetaiean May 02 '23 at 18:42
    See https://stackoverflow.com/a/17770524/3081018 on how to deal with connect on non-blocking sockets. – Steffen Ullrich May 02 '23 at 18:46

1 Answer

This "error" is the socket's way of saying "hey, I'm still trying to connect, don't forget to ask me later". This is how a non-blocking connect works.

The "solution" is either to implement proper polling on connect (which I think you don't actually need, but you did it anyway) or simply not to set O_NONBLOCK on the socket.

For more information you can read man 2 connect.
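
As a rough illustration of the polling option, here is a minimal sketch in C. The helper name connect_with_timeout and its timeout_ms parameter are hypothetical, not taken from the question. It assumes an IPv4 TCP target, waits for POLLOUT after EINPROGRESS, and then reads SO_ERROR to learn whether the connect actually succeeded, as man 2 connect describes.

```c
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Close fd while preserving errno, then report failure. */
static int fail_close(int fd)
{
    int saved = errno;
    close(fd);
    errno = saved;
    return -1;
}

/*
 * Hypothetical helper (name and signature are assumptions).
 * Returns a connected non-blocking socket, or -1 with errno set.
 */
int connect_with_timeout(const struct sockaddr_in *addr, int timeout_ms)
{
    int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (fd < 0)
        return -1;

    /* Read the current flags first, then add O_NONBLOCK. */
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return fail_close(fd);

    if (connect(fd, (const struct sockaddr *)addr, sizeof(*addr)) == 0)
        return fd;                      /* connected immediately */
    if (errno != EINPROGRESS)
        return fail_close(fd);          /* a real error, e.g. ECONNREFUSED */

    /* EINPROGRESS: the socket becomes writable once the attempt finishes. */
    struct pollfd pfd = { .fd = fd, .events = POLLOUT };
    int n;
    do {
        /* Note: an EINTR retry restarts the full timeout in this sketch. */
        n = poll(&pfd, 1, timeout_ms);
    } while (n < 0 && errno == EINTR);

    if (n == 0)
        errno = ETIMEDOUT;              /* nothing happened within timeout_ms */
    if (n <= 0)
        return fail_close(fd);

    /* Writable does not mean connected: SO_ERROR holds the real outcome. */
    int err = 0;
    socklen_t len = sizeof(err);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return fail_close(fd);
    if (err != 0) {
        errno = err;
        return fail_close(fd);
    }
    return fd;
}
```

On success the caller can go straight to send and recv on the returned descriptor. Note also that in the question's code the EINPROGRESS case has no break after its inner loop, so it falls through into default and jumps to close_socket even when the poll reports the socket as ready.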

uis