
I remade this post because my title choice was horrible; sorry about that. My new post can be found here: After sending a lot, my send() call causes my program to stall completely. How is this possible?

Thank you very much, everyone. The problem was that the clients are actually bots and they never read from the connections. (I feel foolish.)

returneax

3 Answers


TCP_NODELAY might help the latency of small packets from sender to receiver, but the description you gave points in a different direction. I can imagine the following:

  • Sending more data than the receivers actually consume - this eventually fills the sender's socket buffer (SO_SNDBUF) and causes the server process to appear "stuck" in the send(2) system call. At that point the kernel is waiting for the other end to acknowledge some of the outstanding data, but the receiver isn't expecting any, so it never calls recv(2).

There are probably other explanations, but it's hard to tell without seeing the code.
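For the buffer-fill scenario above, here is a rough sketch of how it shows up in code (not from the original answer; `fd` is assumed to be a connected TCP socket and the helper names are made up). You can query the SO_SNDBUF limit, and if the socket is put in non-blocking mode, a full buffer surfaces as EAGAIN/EWOULDBLOCK instead of a silent stall:

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/socket.h>

/* Query the kernel send buffer size for a connected TCP socket. */
static void report_sndbuf(int fd)
{
    int sndbuf = 0;
    socklen_t len = sizeof(sndbuf);
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &sndbuf, &len) == 0)
        printf("SO_SNDBUF is %d bytes\n", sndbuf);
}

/* With O_NONBLOCK set, send() reports a full buffer instead of blocking. */
static ssize_t try_send(int fd, const void *buf, size_t n)
{
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
    ssize_t sent = send(fd, buf, n, 0);
    if (sent < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
        /* The peer has not read/acknowledged enough data: buffer is full. */
        fprintf(stderr, "send buffer full; receiver is not reading\n");
    }
    return sent;
}
```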

Nikolai Fetissov
  • Hi Nikolai, I know for a fact that the clients are never reading the data. The clients are being simulated by another program. If they never receive data will that cause my own buffer to overflow? – returneax Dec 31 '10 at 05:56
  • On the second thought, it looks like UDP might fit your needs better. – Nikolai Fetissov Jan 01 '11 at 10:18

If send() is blocking on a TCP socket, it indicates that the send buffer is full, which in turn indicates that the peer on the other end of the connection isn't reading data fast enough. Maybe that client is completely stuck and not calling recv() often enough.
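One way to make that condition visible (a sketch only, not part of this answer; it assumes a connected socket `fd`) is to wait for writability with poll() and a timeout before calling send(), so a client that has stopped reading produces a timeout instead of an indefinite block:

```c
#include <poll.h>
#include <stdio.h>
#include <sys/socket.h>

/* Returns >0 if fd is writable within timeout_ms, 0 on timeout, -1 on error. */
static int wait_writable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLOUT };
    return poll(&pfd, 1, timeout_ms);
}

static ssize_t send_with_timeout(int fd, const void *buf, size_t n)
{
    int rc = wait_writable(fd, 5000);   /* 5s is illustrative; tune as needed */
    if (rc == 0) {
        fprintf(stderr, "peer is not draining its socket; send buffer full\n");
        return -1;
    }
    if (rc < 0)
        return -1;
    return send(fd, buf, n, 0);
}
```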

caf

Nagle's algorithm wouldn't cause the call to "disappear into the kernel", which is why disabling it doesn't help you. Nagle's just buffers data for a little while, but it eventually sends it without any prompting from the user.

There is some other culprit.
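For reference, disabling Nagle's algorithm is just a setsockopt() call (a sketch, assuming a connected TCP socket `fd`); it only changes when small segments are transmitted, so it cannot unblock a send() that is stalled on a full buffer:

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Disable Nagle's algorithm: small writes go out immediately instead of
 * being coalesced. This affects latency only, not a blocked send(). */
static int disable_nagle(int fd)
{
    int one = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
}
```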


Edit for the updated question.

You must make sure that the client is receiving all of the sent data, and that it is receiving it quickly. Have each client write to a log or something to verify.

For example, if a client is waiting for the server to accept its 23-byte update, then it might not be receiving the data. That can cause the server's send buffer to fill up, which would cause degradation and eventual deadlock.

If this is indeed the culprit, the solution would be some form of asynchronous communication, such as Boost's Asio library.
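As a rough illustration of the idea (this is not the poster's code and not Asio itself; the struct and helpers below are hypothetical), the server could put each client socket in non-blocking mode and queue whatever send() couldn't accept, draining the queue later when the socket becomes writable:

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical per-client output queue; a real server would size and
 * manage this more carefully. */
struct client {
    int fd;
    char pending[65536];
    size_t pending_len;
};

static void set_nonblocking(int fd)
{
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
}

/* Send what we can now; queue the rest instead of blocking the thread. */
static int queue_send(struct client *c, const char *buf, size_t n)
{
    ssize_t sent = send(c->fd, buf, n, 0);
    if (sent < 0) {
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;              /* real error */
        sent = 0;                   /* buffer full: queue everything */
    }
    size_t left = n - (size_t)sent;
    if (left > sizeof(c->pending) - c->pending_len)
        return -1;                  /* client too far behind: drop or disconnect */
    memcpy(c->pending + c->pending_len, buf + sent, left);
    c->pending_len += left;
    return 0;
}
```

The design point is that a slow or dead client then costs memory in its own queue rather than stalling the thread that serves everyone else.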

chrisaycock
  • Yeah, that's what I thought. I'm really clueless though. Something I've been looking at is the TCP buffer size, maybe... I really don't know -_- – returneax Dec 31 '10 at 05:28
  • @returneax Can you modify your question to include the code you're using to send/recv data on both the client and server? I wonder if you are doing something to cause a deadlock, etc. – chrisaycock Dec 31 '10 at 05:31
  • Sure, the client is written in C# and my friend is writing it, but for now I can give you the server code. Also, the server is multithreaded... – returneax Dec 31 '10 at 05:34
  • The multithreading is rather simple though. It's two threads which communicate through a message queue. The first thread just listens on port 3500 for new connections. It then verifies the username and password, creates a Client object and fills out its data, then passes a pointer to that object over the message queue as an array of bytes. So while no one is connecting, the login thread is blocking on accept(). – returneax Dec 31 '10 at 05:42
  • @return I updated my answer, but now I see Nikolai beat me to it. – chrisaycock Dec 31 '10 at 06:09