
I have a multithreaded network application using a UdpClient, TcpClient, and TcpListener, processing received connections and received data using the BeginReceive()/EndReceive() callback pattern.

Using the UdpClient as an example, in this pattern the general flow of work I am using is:

  1. Call UdpClient.BeginReceive()
  2. The receive callback is executed when a datagram is received.
  3. Call UdpClient.EndReceive() to collect the datagram.
  4. Call UdpClient.BeginReceive() again to prepare to receive another datagram.
  5. Process the datagram received at (3).
  6. Repeat 2 - 5 as more datagrams are received

Q: Since there is only a single UdpClient object, and due to the pattern of always calling EndReceive() before the next BeginReceive(), is it necessary to lock/synchronise access to the UdpClient object for those calls?

It seems to me that it wouldn't be possible for another thread to interfere in this workflow or make those calls non-atomic. The patterns for TcpClient.BeginReceive() and TcpListener.BeginAcceptTcpClient() are very similar.

Bonus Q: Does the single UdpClient object need to be declared static (and static lock object if one is required)?

Note: I am not asking whether it is necessary to perform any locking during e.g. datagram processing; only about this pattern and the UdpClient, TcpClient, and TcpListener objects.


EDIT

By way of clarification, (ignoring exception handling) is this code:

private void InitUDP()
{
    udpclient = new UdpClient(new IPEndPoint(IPAddress.Any, Settings.Port));

    udpclient.BeginReceive(new AsyncCallback(receiveCallback), udpclient);
}

private void receiveCallback(IAsyncResult ar)
{
    UdpClient client = (UdpClient)ar.AsyncState;

    IPEndPoint ep = new IPEndPoint(IPAddress.Any, 0);

    byte[] datagram = client.EndReceive(ar, ref ep);

    client.BeginReceive(new AsyncCallback(receiveCallback), client);

    processDatagram(datagram);
}

Practically different from, or less protective than, this code:

private void InitUDP()
{
    udpclient = new UdpClient(new IPEndPoint(IPAddress.Any, Settings.Port));

    udpclient.BeginReceive(new AsyncCallback(receiveCallback), udpclient);
}

private void receiveCallback(IAsyncResult ar)
{
    UdpClient client = (UdpClient)ar.AsyncState;

    IPEndPoint ep = new IPEndPoint(IPAddress.Any, 0);

    byte[] datagram;

    lock(_lock)
    {
        datagram = client.EndReceive(ar, ref ep);

        client.BeginReceive(new AsyncCallback(receiveCallback), client);
    }

    processDatagram(datagram);
}
khargoosh
  • `UdpClient` member functions are not thread safe, so if multiple threads access the same instance then you have a problem. I'm not sure if that is/was your question though. – MicroVirus Oct 14 '15 at 00:07
  • That's the question though - is it possible for another thread to access the same instance through these callbacks, if `EndReceive()` is always called before the next `BeginReceive()` ? Do I need to lock those calls? – khargoosh Oct 14 '15 at 00:10
  • You're the one in control of the threads, so you decide what thread accesses what. So is it possible? Yes, if the callbacks can legally access the instance, such as when it is public or the callbacks are members of the same instance -- in a way, if you can write the code to access the instance in the callback and it compiles then, yes, you can access it. But unless I misunderstood, that's not the question you actually want answered, because, like I said, you decide what each callback/thread does, right? – MicroVirus Oct 14 '15 at 00:16
  • Here's another SO answer showing how you can [use `BeginReceive` and `EndReceive`](http://stackoverflow.com/a/1388691/2718186). It also talks about using Tasks as a way to simplify the code, which also makes it clear that there is no threading going on in your code/callbacks. – MicroVirus Oct 14 '15 at 09:25

1 Answer


is it necessary to lock/synchronise access to the UdpClient object for those calls?

No, not exactly, but perhaps not for the reason you might think.

If you call BeginReceiveFrom() (or just BeginReceive() for that matter) before you are done processing the current datagram, it is in fact possible for the same callback to be called concurrently. Whether this actually will happen depends on a lot of things, including thread scheduling, how many IOCP threads are currently available in the thread pool, and of course whether there is even a datagram to receive.

So you definitely have the risk that before you are done processing the current datagram, the receipt of a new datagram will occur and processing for that will commence before the processing of the first one is done.

Now, if the processing of datagrams involves the access of some other shared data, then you definitely do need synchronization around that other shared data, to ensure safe access of that other data.

But as far as the datagram itself goes, the networking objects are thread-safe in the sense that you won't corrupt the object by using it concurrently; it's still up to you to ensure you use them in a coherent way. With the UDP protocol in particular, this is easier than for TCP.

UDP is unreliable. It lacks three very important guarantees:

  1. There is no guarantee a datagram will be delivered.
  2. There is no guarantee a datagram is not delivered more than once.
  3. There is no guarantee datagrams will be delivered in the order in which they were sent.

That last point is particularly relevant here. Your code already needs to be able to handle the processing of datagrams out of order. So whether this disordering of datagrams occurs because of the network itself or because you started a new I/O operation before you were done processing the current one, your code, if written correctly, will successfully deal with it.
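One common way to make datagram ordering detectable is to prefix each outgoing datagram with a sequence number. The following is a hypothetical sketch only (the names `SendSequenced`, `IsNewer`, `sendSeq`, and `lastSeen` are illustrative, not from the question, and the naive `<=` comparison ignores sequence-number wrap-around):

```csharp
using System;
using System.Net;
using System.Net.Sockets;

class SequencedUdp
{
    private readonly UdpClient client = new UdpClient();
    private uint sendSeq;   // next sequence number to send
    private uint lastSeen;  // highest sequence number processed so far

    // Prefix the payload with a 4-byte sequence number in network byte order.
    public void SendSequenced(byte[] payload, IPEndPoint remote)
    {
        byte[] packet = new byte[payload.Length + 4];
        byte[] seq = BitConverter.GetBytes(
            IPAddress.HostToNetworkOrder((int)sendSeq++));
        Buffer.BlockCopy(seq, 0, packet, 0, 4);
        Buffer.BlockCopy(payload, 0, packet, 4, payload.Length);
        client.Send(packet, packet.Length, remote);
    }

    // On receipt, reject anything older than (or equal to) what we've seen,
    // which filters both reordered and duplicated datagrams.
    public bool IsNewer(byte[] packet)
    {
        uint seq = (uint)IPAddress.NetworkToHostOrder(
            BitConverter.ToInt32(packet, 0));
        if (seq <= lastSeen) return false;  // duplicate or stale
        lastSeen = seq;
        return true;
    }
}
```

Whether dropping stale datagrams (as here) or buffering and reordering them is appropriate depends entirely on the application's protocol.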


With TCP, things are different. You again have the same issue that if you've started an I/O operation, it most certainly could complete before you are done processing the current I/O operation. But unlike UDP, you do have some guarantees with TCP, including that data received on the socket will be received in the same order in which it was sent.

So as long as you don't call BeginReceive() until you are completely done processing the current completed I/O operation, everything's fine. Your code sees the data in the right order. But if you call BeginReceive() earlier, then your current thread could get pre-empted before it's done processing the current I/O operation, and a different thread could wind up processing a newly-completed I/O operation.

Unless you have done some kind of synchronization or ordering of the received data to account for the possibility of handling I/O completions out-of-order, this will corrupt your data. Not good.

There are good reasons to issue multiple receive operations concurrently. But they generally have to do with a need for a highly scalable server. There are also negatives associated with issuing multiple concurrent receive operations, including the added complexity of ensuring the data is processed in the right order, as well as the overhead of having multiple fixed/pinned buffers in your heap (though that can be mitigated in a variety of ways, such as allocating buffers large enough to ensure they are in the large-object heap).

I would avoid implementing the code in this way unless you have a specific performance issue you have to solve, even when dealing with UDP, but especially when dealing with TCP. And if you do implement the code this way, do it with great care.

Does the single UdpClient object need to be declared static (and static lock object if one is required)?

Where you store the reference to your UdpClient object does not matter one bit. If you have code that needs to maintain more than one UdpClient at a time, storing a reference in a single UdpClient-type field would not even be very convenient.

All that making something static does is change how that member is accessed. If not static, you need to specify the instance reference in which the member is found; if it is static, you just need to specify the type. That's all. It doesn't have anything at all to do with thread-safety per se.
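For illustration, these two declarations differ only in how the field is reached, not in any thread-safety property (a minimal sketch; the class and field names are made up):

```csharp
class NetworkService
{
    // Instance field: reached through an object reference,
    // e.g. this.udpclient or someService.udpclient.
    private UdpClient udpclient;

    // Static field: reached through the type name,
    // e.g. NetworkService.sharedClient. Exactly as vulnerable
    // (or safe) under concurrent access as the instance field above.
    private static UdpClient sharedClient;
}
```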


Finally, with respect to your two code examples, they are functionally equivalent. There is no need to protect the calls to EndReceive() and BeginReceive(), and your lock doesn't encompass any other part of those methods (e.g. the actual processing of the datagram), so it's not really accomplishing anything (other than possibly increasing the overhead of context switches).

In the concurrent scenario, it's possible that the first thread would be pre-empted before leaving the lock but after calling BeginReceive(). This could lead to a second thread being awoken to process the callback for a second I/O completion. That second thread would then hit the lock and stall, allowing the first thread to resume execution and leave the lock. But all that that synchronization does is to slow things down. It doesn't prevent any concurrent access of the datagram data itself, which is the (possibly) important part.

Peter Duniho
  • Thank you @Peter, very comprehensive answer and discussion. That has done me a lot of good. If I understand your point correctly, unless my performance requirements are high, it seems that you would recommend to move `processDatagram()` after the `EndReceive()` call and before the subsequent `BeginReceive()` call, for all protocols. And for my particular application, this will be fine. As for the datagram processing itself, I'm currently queueing and locking the queue, and this is handled elsewhere. – khargoosh Oct 14 '15 at 05:18
  • To Clarify: As for the datagram processing itself, I'm currently queueing this data, and locking the queue for read and write access, and this is handled elsewhere. The data itself is shared and thread-safe. – khargoosh Oct 14 '15 at 05:35
  • Yes, if you do not have a specific, observed, quantifiable performance issue to solve, I would ensure that your call to `processDatagram()` is executed before your next call to `BeginReceive()`. This will keep things simple. – Peter Duniho Oct 14 '15 at 05:46
  • I'm not 100% sure about this, so if I am wrong please correct me: The Asynchronous Pattern doesn't call the handlers on a different thread. All this locking business is relevant when there are multiple threads, but the sockets invoke the completion handlers on the same thread as the `BeginReceive`-call. Therefore, the only issue you have to handle is that you don't know the order in which the callbacks are called, but all these calls are synchronous! – MicroVirus Oct 14 '15 at 09:13
  • They might 'overlap' (multiple callbacks being active at the same time) in certain situations, but then they are still handled in a single-threaded manner, just like when handling one Event of a form can cause subsequent Events to be fired that are then handled before the first Event completes. So, either I missed something, or this talk about threads is not relevant here. The gist of your answer, the order at which messages are received, that's the only part that @khargoosh needs to worry about. – MicroVirus Oct 14 '15 at 09:13
  • 1
    @MicroVirus: _"The Asynchronous Pattern doesn't call the handlers on a different thread"_ -- you are _definitely_ incorrect about that. Throughout .NET, the async patterns almost always use different threads for completions. Obvious exceptions being `Control.BeginInvoke()` and `Dispatcher.BeginInvoke()` (they are specifically designed to work on a specific thread). Network I/O in particular is handled by the IOCP thread pool, and arbitrary threads are used to handle I/O completions. – Peter Duniho Oct 14 '15 at 15:55
  • 1
    @MicroVirus: because a thread pool is used, if load is low you may find the pool has just one thread and that can make it look like you always get the completion on the same thread. But in reality, all that's going on there is that there isn't enough activity to get the thread pool to create new threads. Under load, it will, and those extra threads are used to handle I/O completions as they happen, without regard to where the I/O was initiated. – Peter Duniho Oct 14 '15 at 15:57
  • Thanks for your response, this cleared up some things for me. I hope my earlier comments don't confuse readers. – MicroVirus Oct 14 '15 at 16:38
  • @MicroVirus: happy to help...if you are concerned about confusion, you do have the option of deleting your comments. Doing so should be weighed against the possibility that any misconceptions you might have had, someone else might have them as well and could benefit from your comments and my answers. But I don't think much harm would be done if you did want to delete the comments. – Peter Duniho Oct 14 '15 at 16:41