I am troubleshooting a socket connection issue where a peer intermittently gets WSAETIMEDOUT (10060) from socket send(), and I would like to understand what happens at the TCP level.
The implementation uses a blocking Winsock socket and has the following call pattern:
auto result = ::send(...);
if (result == SOCKET_ERROR)
{
    auto err = ::WSAGetLastError();
    // err can be WSAETIMEDOUT
}
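To make the call pattern concrete, here is a minimal sketch of a send loop that reports how many bytes were handed to the stack and which error code stopped it. The names sendAll, SendOutcome, kSocketError, kWsaETimedOut and kMaxChunk are hypothetical and not part of Winsock. The two callables stand in for ::send and ::WSAGetLastError, so the logic can be exercised without a real socket. This is not my production code, only an illustration.

```cpp
#include <cstddef>

// Values as defined by Winsock, repeated here so the sketch compiles anywhere.
constexpr int kSocketError = -1;      // SOCKET_ERROR
constexpr int kWsaETimedOut = 10060;  // WSAETIMEDOUT
constexpr int kMaxChunk = 1 << 20;    // hypothetical per-call cap, keeps the length within int

// Hypothetical result type: bytes accepted so far plus the error code (0 if none).
struct SendOutcome {
    std::size_t bytesSent;
    int error;
};

// sendFn(ptr, len) stands in for ::send(sock, ptr, len, 0).
// lastErrorFn() stands in for ::WSAGetLastError().
template <typename SendFn, typename LastErrorFn>
SendOutcome sendAll(const char* data, std::size_t len,
                    SendFn sendFn, LastErrorFn lastErrorFn) {
    std::size_t sent = 0;
    while (sent < len) {
        const std::size_t remaining = len - sent;
        const int chunk = remaining > static_cast<std::size_t>(kMaxChunk)
                              ? kMaxChunk
                              : static_cast<int>(remaining);
        const int result = sendFn(data + sent, chunk);
        if (result == kSocketError) {
            // Fetch the error code right away, before any other call can overwrite it.
            return {sent, lastErrorFn()};
        }
        if (result == 0) {
            break;  // nothing was accepted; stop instead of spinning
        }
        sent += static_cast<std::size_t>(result);
    }
    return {sent, 0};
}
```

With a real socket, the first callable would be a lambda that forwards to ::send(sock, ptr, len, 0) and the second one a lambda that returns ::WSAGetLastError(). A returned error equal to kWsaETimedOut corresponds to the case I am asking about.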
As far as I understand, send returns immediately once the outgoing data has been copied to the kernel buffer [as asked in another SO question].
On the other hand, I assume that the WSAETIMEDOUT error is caused by a missing TCP ACK from the receiving side (see Steffen Ullrich's answer). Right?
What I am not sure about is whether WSAETIMEDOUT only happens when the SO_SNDTIMEO option is set.
The default value of SO_SNDTIMEO is 0, meaning it never times out. Does that mean an unsuccessful send would block forever? Or is there a built-in/hard-coded timeout on Windows for this case?
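For reference, this is roughly how SO_SNDTIMEO would be set explicitly; my code currently does not do this. It is a minimal sketch: the helper name setSendTimeoutMs and the alias socket_t are hypothetical. On Winsock the option value is a DWORD in milliseconds. The non-Windows branch, which uses a timeval, is only there so the snippet compiles outside Windows.

```cpp
#ifdef _WIN32
  #include <winsock2.h>
  using socket_t = SOCKET;
#else
  #include <sys/socket.h>
  #include <sys/time.h>
  using socket_t = int;
#endif

// Hypothetical helper: sets the timeout for blocking send calls on socket s.
// Returns true if setsockopt succeeded. A value of 0 means "no timeout".
inline bool setSendTimeoutMs(socket_t s, unsigned long ms) {
#ifdef _WIN32
    // Winsock expects the timeout as a DWORD holding milliseconds.
    DWORD value = static_cast<DWORD>(ms);
    return ::setsockopt(s, SOL_SOCKET, SO_SNDTIMEO,
                        reinterpret_cast<const char*>(&value),
                        sizeof(value)) == 0;
#else
    // POSIX systems expect a timeval (seconds plus microseconds).
    timeval tv{};
    tv.tv_sec = static_cast<long>(ms / 1000);
    tv.tv_usec = static_cast<long>((ms % 1000) * 1000);
    return ::setsockopt(s, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv)) == 0;
#endif
}
```

On Windows, WSAStartup must have been called and s must be a valid socket before this helper can succeed.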
And how does TCP retransmission come into play?
I assume that an unacknowledged packet triggers retransmission. But what happens if all retransmission attempts fail? Does the socket connection just stall, or is WSAETIMEDOUT raised (independently of SO_SNDTIMEO)?
My assumption about my connection issue is this:
The current send operation returns SOCKET_ERROR with error code WSAETIMEDOUT because the outgoing data cannot be copied to the kernel buffer, which is still occupied by old outgoing data that was either lost or could not be ACKed by the peer in time. Is my understanding right?
Possible causes may be: an intermediate router drops packets, an intermediate network gets disconnected, or the peer has a problem receiving. What else?
What can go wrong on the receiving side? Maybe the peer application hangs and stops reading data from its socket buffer, so the receive buffer (on the receiver side) fills up and blocks the sender from sending more data.
Thank you for clarifying all my questions.