
I'm currently dealing with a problem that I don't know the right/best solution to.

Consider the following example:

Imagine you have a single socket, created like this:

SOCKET s = socket(AF_INET,SOCK_DGRAM,IPPROTO_UDP);

On this socket, which I'll refer to as the "server socket", many UDP packets arrive from many different IP+port combinations (clients).

Since it doesn't seem like a good idea to have multiple threads blocking in recvfrom() on this socket, my idea is (maybe) to have one dedicated thread that just blocks in recvfrom() and puts those ip+port+msg combinations into some kind of "global queue" (a std::queue guarded by a mutex).
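Something like this sketch is what I have in mind for the queue part (the `Datagram`/`DatagramQueue` names are just mine, and the recvfrom() thread that fills it is omitted):

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <queue>
#include <vector>

// One received datagram together with its origin.
struct Datagram {
    uint32_t ip;                // sender address
    uint16_t port;              // sender port
    std::vector<char> payload;  // the message bytes
};

// Minimal thread-safe queue: the dedicated recvfrom() thread pushes,
// worker threads pop.
class DatagramQueue {
public:
    void push(Datagram d) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(d));
        }
        cv_.notify_one();
    }
    // Blocks until a datagram is available.
    Datagram pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return !queue_.empty(); });
        Datagram d = std::move(queue_.front());
        queue_.pop();
        return d;
    }
private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<Datagram> queue_;
};
```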

So far, so good.

I know about IOCP, and my first question is: does it make sense to use IOCP for this kind of problem, i.e. on a single socket? I ran into the issue that, even if the UDP packets arrive on the socket in the right order (which, as we all know, the protocol itself doesn't guarantee), there is still the problem of thread ordering. For example, if I used IOCP with four threads and four outstanding overlapped WSARecvFrom() calls, packets 1 2 3 4 might be reordered by the thread scheduler to e.g. 3 4 1 2. If one uses only one outstanding WSARecvFrom(), everything works as expected, because only one thread at a time handles the WSARecvFrom(), puts that message into the client's queue, and posts the next overlapped WSARecvFrom().

Furthermore, I'd like to emulate functions like recvmsg() and sendmsg() in blocking mode, but the problem is that with, say, thousands of clients, you can't open thousands of threads, each with its own recvmsg() blocking on e.g. a condition variable of a client's message queue. This is an issue as well, since clients might get deleted upon receiving a packet containing something like "CLOSE_CONNECTION", to emulate closesocket() the way TCP uses it.

I need to use UDP because the data the user sends is time-critical, but it doesn't have to be reliable; only the status messages should be as reliable as possible, e.g. "CONNECT_REQUEST" when a client "connects" (like TCP does, which, as we all know, UDP doesn't, so we have to implement it ourselves if necessary). In-order delivery of client messages would be needed as well.

To sum this all up, the following criteria should be met:

- In-order delivery for the clients' message part is needed.
- Reliability for clients' messages is NOT necessary (only for the status packets, like "ACK_PACKAGE" etc.; the newest message matters more than reliably receiving every message).
- Many clients have to be managed, and things like disconnections (soft/hard, e.g. if a client pulls the network cable) have to be detected (a thread pool of timers?).
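For what it's worth, the kind of wire header I could imagine prepending to every datagram for such a protocol looks roughly like this (field names and sizes are purely illustrative, nothing final):

```cpp
#include <cstdint>

// Hypothetical wire header for a small protocol on top of UDP.
// All fields are assumptions for illustration, not from any real API.
#pragma pack(push, 1)
struct MsgHeader {
    uint8_t  type;      // e.g. DATA, CONNECT_REQUEST, ACK_PACKAGE, CLOSE_CONNECTION
    uint8_t  flags;     // e.g. "needs ACK" for the reliable status messages
    uint16_t reserved;  // padding / future use
    uint32_t clientId;  // which "connection" this datagram belongs to
    uint64_t seq;       // per-client sequence number for in-order processing
};
#pragma pack(pop)
```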

So my final question is: what is the best approach to reach a goal like that? With TCP it would be easier, because one IOCP thread could listen on one accept()ed TCP socket, so there wouldn't be that thread-reordering problem. With one UDP socket you can't do it that way, so maybe there has to be something like an overlapped request, but for just one ... well, "self-defined" event.

ROMANIA_engineer
fludd
  • There's nothing wrong with the multiple-thread solution that you discarded right at the beginning. It's not the only way to do it, but it's a pretty easy way. – user207421 Apr 09 '14 at 17:51
  • `void thread_one() { recvfrom(serverSocket....); } void thread_two() { recvfrom(serverSocket....); }` like that?? – fludd Apr 09 '14 at 21:13
  • Certainly. Only one thread will receive each message, and you can send datagrams via the same socket from multiple threads too. The send() and recv() calls are system calls and therefore atomic. – user207421 Apr 10 '14 at 01:10

1 Answer


You're correct in that an IOCP-based server using multiple threads to service the IOCP can and will require explicit sequencing to ensure that the results from multiple concurrent reads are processed in the correct sequence. This is equally true of TCP connections (see here).

The way that I usually deal with this problem with TCP is to have a per-connection counter whose value is added as meta-data to each buffer used for a recv on that connection. You then simply ensure that the buffers are processed in sequence, as the sequence of issued reads is the sequence of read completions out of the IOCP (it's just the scheduling of the multiple threads reading from the IOCP that causes the problem).
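A sketch of that per-connection resequencing (the names are mine, and locking is omitted, so in a real multi-threaded IOCP server you'd guard `complete()` with a per-connection lock):

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <vector>

// Re-sequencer for read completions: each read buffer carries the counter
// value it was tagged with when the read was issued. Completions may arrive
// in any order, but buffers are handed to process() strictly in tag order.
class Resequencer {
public:
    explicit Resequencer(std::function<void(const std::vector<char>&)> process)
        : process_(std::move(process)) {}

    // Called when the read tagged 'seq' completes (not thread-safe as shown).
    void complete(uint64_t seq, std::vector<char> buffer) {
        pending_.emplace(seq, std::move(buffer));
        // Drain everything now contiguous with the last processed buffer.
        for (auto it = pending_.find(next_); it != pending_.end();
             it = pending_.find(++next_)) {
            process_(it->second);
            pending_.erase(it);
        }
    }

private:
    uint64_t next_ = 0;  // sequence number of the next buffer to process
    std::map<uint64_t, std::vector<char>> pending_;
    std::function<void(const std::vector<char>&)> process_;
};
```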

You can't take this approach with UDP if you have a single 'well known port' that all peers send to as your sequence numbers have no 'connection' to be associated with.

In addition, an added complication with UDP is that the routers between you and your peer may contrive to resequence or duplicate any datagrams before they get to you anyway. It's unlikely but if you don't take it into account then that's bound to be the first thing that happens when you're demoing it to someone important...

This leads to the fact that to sequence UDP you need a sequence number inside the data portion of the datagram. You then hit the problem that UDP datagrams can be lost, so that sequence number is less useful for ensuring all inbound data is processed in sequence, and only useful for ensuring that you never process any datagrams out of sequence. That is, if you have a sequence number in your datagram, all you can do with it is make sure you never process a datagram from that peer with a sequence number less than or equal to the one you last processed (in effect you may need to discard potentially valid data).
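As a sketch, that per-peer "never go backwards" check is about all you can do with such a sequence number (this ignores wrap-around, which a real protocol would have to handle):

```cpp
#include <cstdint>

// Per-peer filter: given the sequence number embedded in a datagram, decide
// whether to process or discard it. Anything <= the last processed sequence
// is stale (duplicate or late) and dropped; gaps are accepted, since lost
// datagrams simply never arrive.
struct PeerSequenceFilter {
    uint64_t last_processed = 0;  // 0 = nothing processed yet

    bool accept(uint64_t seq) {
        if (seq <= last_processed) return false;  // duplicate or out of order
        last_processed = seq;
        return true;
    }
};
```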

This is actually the same problem you'd have with a single-threaded system and a single peer, though you'd likely get away without being this strict right up until the important demo, when you get a network configuration that happens to result in duplicate or out-of-sequence datagrams (both quite legal).

To get more reliability out of the system you need to build more of a protocol on top of UDP. Perhaps take a look at this question and the answers to it. And then be careful not to build something slower and less good and less polite to other network users than TCP...

Len Holgate
  • ok, so the trick seems to be something like ... "you can use IOCP, but you should put a critical section around `sequenceNumber++; WSARecvFrom(...);`". using it this way guarantees that we have multiple WSARecvFrom()s outstanding, which is good for performance. it further guarantees that the next buffer will be processed in order. is that correct? – fludd Apr 09 '14 at 21:31
  • Yes, that might help, it would ensure that the UDP stream is processed in order for all connected peers. It would allow you to have multiple IOCP threads and you'd then need some way to synchronise and sequence datagrams from a specific peer. It's probably also worth using GetQueuedCompletionStatusEx() to dequeue multiple completions at a time with a single thread... It's not ideal for overall performance because of the cross-peer synchronisation. Far better to have a sequence number embedded in the datagram itself and simply discard 'stale' data as I explained above. – Len Holgate Apr 10 '14 at 07:03
  • so the best solution might be a sequence number within a datagram? what about a map or an unordered_map to insert the packages by sequence number per client? how do you deal with a list of clients shared between multiple threads? what would you use in that case? – fludd Apr 10 '14 at 07:33
  • Fine for TCP where you know you'll get all the datagrams. But with UDP how long do you wait to see if datagram N-1 arrives when you already have N? – Len Holgate Apr 10 '14 at 09:16
  • i see your point. because of the unreliable nature of udp, in many cases something like this might happen: DGRAM 1234 arrives, DGRAM 1234 arrives again (since duplication of datagrams might occur), then DGRAM 1236 (since packets can get lost). if i'm not totally wrong here, if i didn't care about reliability of packets, only about the right order of the buffers as the packets arrived, there has to be some kind of synchronization where the first packet that arrives from an ip+port combination has to be put into the first buffer of that client ... etc.? – fludd Apr 10 '14 at 09:48
  • Well, the simplest is just a lock on the 'per peer' data and a counter that represents last processed seq. Your thread then locks the peer, checks the last seq to the current datagram and either processes or discards. The downside of this simplistic approach is that lots of datagrams from one peer arriving in sequence will cause your I/O threads to block each other. – Len Holgate Apr 10 '14 at 11:34
  • A better approach is to have a locks, a list of queued dgrams, a list of processing dgrams and a processing flag per peer. Then you lock the peer, add your dgram to the queue list, see if anyone is processing, if not, swap the queue list with the 'processing list' (currently empty), set the processing flag and unlock the peer. Now you process the 'processing list' All other I/O threads will simply add to the 'queue list'. – Len Holgate Apr 10 '14 at 11:37
  • When you are done with the 'processing list', lock the peer, swap the queue list with the now-empty processing list, and see if the processing list is empty: if it is, unset the processing flag and unlock; if not, unlock, process, and loop back to here once you're done. This keeps one thread processing a single peer (which is good for data locality and stops the I/O threads blocking for as long). Ideally the 'list' of datagrams is just a singly linked chain of the buffers that you issued your reads into (each buffer has a 'next' pointer in it). – Len Holgate Apr 10 '14 at 11:41
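Pulled together, the queue-swap pattern from the last three comments might look roughly like this (a sketch with my own names, using one lock and `std::vector` lists instead of intrusively chained read buffers, purely to keep it short):

```cpp
#include <mutex>
#include <vector>

// Per-peer state: I/O threads append datagrams under the lock; whichever
// thread finds the peer idle becomes the processor, swaps the queued list
// out, and works on it without holding the lock.
struct Peer {
    std::mutex lock;
    std::vector<std::vector<char>> queued;  // filled by I/O threads
    bool processing = false;                // true while one thread drains
};

template <typename Fn>
void on_datagram(Peer& peer, std::vector<char> dgram, Fn process) {
    {
        std::lock_guard<std::mutex> g(peer.lock);
        peer.queued.push_back(std::move(dgram));
        if (peer.processing) return;  // the current processor will drain it
        peer.processing = true;       // we become the processor
    }
    std::vector<std::vector<char>> batch;
    for (;;) {
        {
            std::lock_guard<std::mutex> g(peer.lock);
            if (peer.queued.empty()) {  // nothing left: step down
                peer.processing = false;
                return;
            }
            batch.swap(peer.queued);    // take the whole queued batch
        }
        for (auto& d : batch) process(d);  // process outside the lock
        batch.clear();
    }
}
```

This keeps datagrams from one peer on one thread at a time while other I/O threads only take the lock briefly to append.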