I'm writing C# code that will publish messages from multiple threads to an Azure Event Hub using the EventHubClient. The documentation for EventHubClient contains the fairly standard boilerplate:

"Any public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe."

There is no additional documentation on thread safety for any of the four send methods, which are the ones I would most expect to be thread safe. If I took the documentation at face value and assumed the send methods are not thread safe, I would end up creating a new EventHubClient instance each time I wished to send a message. Since the underlying TCP connection is apparently reused unless steps are taken, this may not have too much overhead. Similar issues arise with partitioned senders, though given that there is an async method to create one, they might well have their own AMQP connection.

Are some, if not all, instance methods of EventHubClient thread safe despite the documentation?

And for any Azure folks: would it be possible to have this clarified in the documentation? This sort of documentation issue (assuming it is wrong, as seems likely) appears to affect Azure Table as well and is generally common within the MSDN docs. With regard to EventHub, this is in contrast to the clear thread-safety statement of Kafka, and AWS Kinesis at least does not explicitly label everything as unsafe. I did not find EventHubs in the open source portion of the SDK, so I could not check myself.

cacsar

1 Answer

TLDR:

  1. All critical runtime operations (aka data-plane) in the .NET SDK are thread-safe.
  2. Create the EventHubClient object once and re-use it.
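
As a rough illustration of the two points above, here is a minimal sketch of several worker tasks sharing one EventHubClient. It assumes the legacy Microsoft.ServiceBus.Messaging SDK (WindowsAzure.ServiceBus NuGet package) and a compiler that supports async Main. The connection string, hub name, worker count, and message bodies are placeholders chosen for illustration, not values from the question.

```csharp
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Microsoft.ServiceBus.Messaging; // legacy WindowsAzure.ServiceBus NuGet package

static class SharedClientSketch
{
    static async Task Main()
    {
        // Placeholders (assumptions): substitute real values before running.
        const string connectionString = "<event-hubs-connection-string>";
        const string eventHubName = "<event-hub-name>";

        // Created once and shared by every sending task.
        var client = EventHubClient.CreateFromConnectionString(connectionString, eventHubName);

        // Eight concurrent workers all call SendAsync on the same instance.
        var workers = Enumerable.Range(0, 8).Select(worker => Task.Run(async () =>
        {
            for (var i = 0; i < 100; i++)
            {
                var body = Encoding.UTF8.GetBytes($"worker {worker}, message {i}");
                using (var data = new EventData(body))
                {
                    await client.SendAsync(data);
                }
            }
        })).ToArray();

        await Task.WhenAll(workers);
        await client.CloseAsync();
    }
}
```

No lock is taken around SendAsync here; that relies on the thread-safety claim in point 1 rather than on anything the MSDN boilerplate promises.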

The Story

ServiceBus SDK exposes two patterns to create senders:

  1. Basic
  2. Advanced

For the Basic pattern, the developer calls EventHubClient.CreateFromConnectionString() directly and does not need to manage MessagingFactory objects (the connection objects). The SDK reuses one MessagingFactory across all EventHubClient instances as long as the connection string is the same; it does a literal match of all keys and values to decide on reuse.

For the Advanced pattern, where the developer needs a bit more control at the connection level, the SB SDK provides MessagingFactory.CreateFromConnectionString(), and from the resulting factory the developer can create the EventHubClient instance.

All instance methods of EventHubClient that send to EventHubs are strictly thread-safe. In general, all data-plane operations are. However, for reading from EventHubs, the API is optimized for this pattern:

    while (true)
    {
        var events = eventHubPartitionReceiver.receive(100);
        processMyEvents(events);
    }

So, for example, properties like EventHubReceiver.RuntimeInformation are populated after every receive call without any synchronization. Even though the receive API itself is thread-safe, a subsequent read of RuntimeInformation isn't, because it is rare for anyone to park multiple receive calls on one instance of PartitionReceiver.

Creating a new instance of EventHubClient in each component that sends messages is the default pattern; the ServiceBus SDK takes care of reusing the underlying MessagingFactory, which reuses the same physical socket (if the connection string is the same).

If you are after really high throughput, you should design a strategy that creates multiple MessagingFactory objects and then creates an EventHubClient from each. However, make sure you have already increased the throughput units for your EventHub in the Portal before trying this, as the default is just 1 MBPS, cumulative across all 16 partitions.
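
One possible shape for such a strategy is a small round-robin pool, sketched below. EventHubClientPool and its members are hypothetical names invented for this example; only MessagingFactory.CreateFromConnectionString() and CreateEventHubClient() come from the SDK. The sketch assumes the connection string is a namespace-level one and that each explicitly created factory gets its own connection, as described above.

```csharp
using System.Linq;
using System.Threading;
using Microsoft.ServiceBus.Messaging; // legacy WindowsAzure.ServiceBus NuGet package

// Hypothetical helper: one EventHubClient per explicitly created MessagingFactory.
sealed class EventHubClientPool
{
    private readonly EventHubClient[] clients;
    private int counter = -1;

    public EventHubClientPool(string connectionString, string eventHubName, int size)
    {
        clients = Enumerable.Range(0, size)
            .Select(_ => MessagingFactory
                .CreateFromConnectionString(connectionString)
                .CreateEventHubClient(eventHubName))
            .ToArray();
    }

    // Hands out clients in round-robin order; safe to call from many threads.
    public EventHubClient Next()
    {
        // Clearing the sign bit keeps the index non-negative after the counter wraps.
        var ticket = Interlocked.Increment(ref counter) & int.MaxValue;
        return clients[ticket % clients.Length];
    }
}
```

A sender would then call something like `pool.Next().SendAsync(data)`. Whether this helps at all depends on the throughput units provisioned for the hub, so it is worth measuring before adopting it.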

Also, if the send pattern you are using is partitioned senders, they will all use the same underlying MessagingFactory too, provided you create all the senders from the same EventHubClient instance (via .CreatePartitionedSender()).
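
A minimal sketch of that arrangement follows: two partitioned senders created from one EventHubClient. The partition IDs "0" and "1", the method name, and the message bodies are assumptions for illustration; real partition IDs should come from the hub's runtime information.

```csharp
using System.Text;
using System.Threading.Tasks;
using Microsoft.ServiceBus.Messaging; // legacy WindowsAzure.ServiceBus NuGet package

static class PartitionedSenderSketch
{
    public static async Task SendToTwoPartitionsAsync(string connectionString, string eventHubName)
    {
        var client = EventHubClient.CreateFromConnectionString(connectionString, eventHubName);

        // Both senders come from the same client, so they share its MessagingFactory.
        EventHubSender sender0 = await client.CreatePartitionedSenderAsync("0");
        EventHubSender sender1 = await client.CreatePartitionedSenderAsync("1");

        using (var first = new EventData(Encoding.UTF8.GetBytes("for partition 0")))
        using (var second = new EventData(Encoding.UTF8.GetBytes("for partition 1")))
        {
            // Each event goes to the partition its sender was created for.
            await Task.WhenAll(sender0.SendAsync(first), sender1.SendAsync(second));
        }

        await client.CloseAsync();
    }
}
```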

Sreeram Garlapati
  • Could you provide more detail on the Partitioned Sender case? Can it create an AMQP connection directly to the partition or is its sole purpose to override the hash based partitioner? If the latter is there a reason for it to be an async call? I wouldn't expect the library to be totally threadsafe, but if Send is currently threadsafe and is built on top of the threadsafe Messaging factory, is it just marked as non threadsafe to keep future options open somehow? – cacsar Nov 18 '14 at 01:13
  • Great question. PartitionedSender doesn't create a direct AMQP connection to the partition. Every ServiceBus sender first talks to a gateway service and resolves the backend role it talks to. The only advantage/optimization of a partitioned sender that I can imagine is that the gateway wouldn't need to break open the AMQP message (message annotations --> deserialization cost, if it was done right), extract the partition key, and compute the hash to find the right partition. Async is there to keep it extensible (for future optimizations that find the partition at the client and forward), which the server already has for the Topics case. – Sreeram Garlapati Nov 22 '14 at 00:18
  • I'll have to say that is a decent enough answer. When it comes time to revise the producer API or provide a friendlier one, I'd ask that you look at the Kafka >0.8.2 producer to see if their paradigm is suitable. Needing to (vs having the option to) handle batching myself has led to a few frustrated comments in my code. – cacsar Nov 26 '14 at 03:25