1

I have a WebApi2 controller that receives XmlHttpRequests from JavaScript.

I receive 500+ calls to the API per second. Each request performs some quick calculations, then I initialize an Azure Storage Queue (not the Service Bus one) and add a serialized object to it for later processing. Up to here everything works. The problem is that 10-15% of the time, just initializing the Storage Queue and adding a 20 KB JSON message takes between 500 ms and 2 seconds. I sharded the requests across 10 different queues, but the problem remains and does not seem to be related to the amount of traffic: sometimes the queues simply get stuck during creation and slow down.

I disabled Nagle and Expect100Continue already.

I thought about converting this architecture to use Event Hubs, since my situation probably calls for an event ingestor rather than a simple queue, and I need maximum speed.

But the initialization of the Event Hub has exactly the same problem! It sometimes takes 2 or 3 seconds to start and receive a single message, with an average of 400 ms.

I measured the speed with a stopwatch.

This is my code in the API Controller:

    var eventHubClient = StorageHelpers.InitializeEventHub("name", "Send");
    await eventHubClient.SendAsync(new EventData(Encoding.UTF8.GetBytes(QueueSerialized)));

Where InitializeEventHub is:

    public static EventHubClient InitializeEventHub(string eventHubName, string type)
    {
        string connectionString = RoleEnvironment.GetConfigurationSettingValue("Hub" + type + eventHubName);
        return EventHubClient.CreateFromConnectionString(connectionString, eventHubName);
    }

The service is hosted on Azure as a cloud service, in the same region (West US) as the Service Bus namespace and the storage accounts.

My questions are:

  1. Is this amount of time normal for initializing the connection?
  2. Is there a way for Web API to share the same EventHubClient instance across all calls? Something like what is done with Redis, using ConnectionMultiplexer in a Lazy class.
  3. May I cache the EventHubClient object?

Any help on this matter would be really appreciated. I can even go back to the Storage Queue if there is some way to speed up the initialization and the AddMessageAsync operation.

Thanks

Francesco Cristallo

3 Answers

1

Great question! Here's my take:

  1. On one of Azure's busiest scale units (like West US), a latency on the order of 400 ms does sound plausible for an Event Hubs send. What average latency are you looking for? The first call taking 2-3 seconds is accounted for by creating the connection and especially by SSL negotiation. This doesn't vary significantly among the various Azure services in this region. Only the first few calls will take this long; all subsequent calls should be on the order of milliseconds. The EventHubClient.Send API (there are three types of send, and you are using this one) is designed for high availability: it first sends the message to a highly available Service Bus gateway, which then forwards it to one of the available Event Hub partitions. This adds a minor initialization cost, because the gateway has to discover the partition on the first send. For example, if you have 4 partitions, your first 4 Send calls to that Event Hub might see somewhat higher latency, and from then on it is highly performant.
  2. As long as the Event Hub you are talking to is the same, you can share the EventHubClient in the Web API. Every EventHubClient is associated with a connection. However, in the Event Hubs .NET SDK, as long as the connection strings of two EventHubClients are the same, the connection will be reused. One optimization applies if you have less traffic and a fan-out architecture with several Event Hubs: if all of your Event Hubs are in a single namespace and you want them to share one connection (which means just one socket per Web API process) when sending to the Event Hubs service, you can use MessagingFactory (with a namespace-level SAS key) to create the EventHubClient objects.

    var msgFactory = MessagingFactory.CreateFromConnectionString(@"Endpoint=amqps://---namespaceName----.servicebus.windows.net;SharedAccessKeyName=---SasKeyName----;SharedAccessKey=----SasKey----");
    var ehClient = msgFactory.CreateEventHubClient("----eventHubName----");

  3. You could consider caching the EventHubClient object. That saves executing the few lines of client code that fetch a MessagingFactory (which holds the reference to the connection) from the cache.
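A minimal sketch of sharing one client across all requests, in the spirit of the ConnectionMultiplexer-in-a-Lazy pattern the question mentions. `FakeHubClient` and `HubClientHolder` are hypothetical names used so the snippet compiles without the Service Bus SDK; in the real controller the factory lambda would presumably call `EventHubClient.CreateFromConnectionString` as in the question's code.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the SDK's EventHubClient, so the sketch is self-contained.
public sealed class FakeHubClient
{
    public static int Created;  // counts how many clients were constructed

    public FakeHubClient() { Interlocked.Increment(ref Created); }
}

public static class HubClientHolder
{
    // ExecutionAndPublication makes the factory run at most once,
    // even when many requests read the property concurrently.
    private static readonly Lazy<FakeHubClient> LazyClient =
        new Lazy<FakeHubClient>(
            // Assumption: real code would create the EventHubClient here instead.
            () => new FakeHubClient(),
            LazyThreadSafetyMode.ExecutionAndPublication);

    public static FakeHubClient Client => LazyClient.Value;
}
```

Every request would then read `HubClientHolder.Client` instead of creating a new client, so the connection setup cost is paid once per process rather than once per call.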

HTH! Sree

Sreeram Garlapati
0
  1. Not sure; I never bothered to time it, since if you reuse the client it's not as important as it would otherwise be. It seems excessively long given that the network connection gets reused.
  2. Yes.
  3. It depends on what you mean by cache. If you mean serialize it and save it in memory somewhere, then no. If you mean put it in a ConcurrentBag (using it like a pool), then definitely.

If you're making >500 requests per server per second at 20 KB each, then you should confirm that you've provisioned enough throughput units, since that's >10 MB/second of inflow, which takes at least 10 throughput units. Throttling could explain latency problems. Another thing to examine is which components of initialization are taking the time. For example, I've never benchmarked GetConfigurationSettingValue, and it might not be cached.
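The arithmetic behind that estimate can be written out as a small sketch. It assumes roughly 1 MB/s of ingress per throughput unit, which is what the figures above imply, and treats 1 MB as 1000 KB; `ThroughputEstimate` and `RequiredUnits` are hypothetical names, not part of any SDK.

```csharp
using System;

public static class ThroughputEstimate
{
    // Assumption: one throughput unit admits about mbPerUnit MB/s of ingress.
    public static int RequiredUnits(int requestsPerSecond, int payloadKb, double mbPerUnit = 1.0)
    {
        // Total inflow in MB/s, using 1 MB = 1000 KB.
        double mbPerSecond = requestsPerSecond * payloadKb / 1000.0;

        // Round up: a fractional unit cannot be provisioned.
        return (int)Math.Ceiling(mbPerSecond / mbPerUnit);
    }
}
```

With the question's numbers, `ThroughputEstimate.RequiredUnits(500, 20)` evaluates to 10, and anything above 500 requests per second pushes it to 11 or more.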

But assuming none of that is the problem, the question is: what do you need to do to make it fast? You can certainly reuse either the EventHubClient or an object of your own creation to avoid paying the creation time on each request. I'm not too familiar with Web API, but the simple way is to have a static variable that holds an instance (perhaps with the construction wrapped in a Lazy). When reusing it, you should know that EventHubClient is not officially thread-safe (though Send appears to be in practice), which means you'll need to manage that yourself. But a single EventHubClient, or multiple ones sharing the same network connection, may not work out for you at 10 MB/s per server. In that case I direct your attention to this portion of the documentation:

Finally, it is also possible to create an EventHubClient object from a MessagingFactory instance, as shown in the following example.

    var factory = MessagingFactory.CreateFromConnectionString("your_connection_string");
    var client = factory.CreateEventHubClient("MyEventHub");

It is important to note that additional EventHubClient objects created from a messaging factory instance will reuse the same underlying TCP connection. Therefore, these objects have a client-side limit on throughput. The Create method reuses a single messaging factory. If you need very high throughput from a single sender, then you can create multiple message factories and one EventHubClient object from each messaging factory.

And if you're doing that, then I highly recommend pooling them/writing your own multiplexer.
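One way such a pool could look is a fixed set of clients handed out round-robin. This is only a sketch: `RoundRobinPool<T>` is a hypothetical helper, kept generic so it compiles without the SDK. In real code the factory would presumably be something like `() => MessagingFactory.CreateFromConnectionString(connectionString).CreateEventHubClient(hubName)`, giving each pooled client its own factory and therefore its own connection.

```csharp
using System;
using System.Threading;

public sealed class RoundRobinPool<T>
{
    private readonly T[] _items;
    private int _next = -1;

    public RoundRobinPool(int size, Func<T> factory)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));

        // Create every item up front so no request pays the creation cost.
        _items = new T[size];
        for (int i = 0; i < size; i++) _items[i] = factory();
    }

    public T Next()
    {
        // Interlocked.Increment is atomic and wraps on overflow;
        // clearing the sign bit keeps the index non-negative.
        int ticket = Interlocked.Increment(ref _next) & int.MaxValue;
        return _items[ticket % _items.Length];
    }
}
```

Each request calls `Next()` and sends through whichever client it receives, which spreads the load over several connections without any locking.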

cacsar
0

I ended up with a crazy easy solution. Both Event Hubs and Storage Queues need time to initialize, and Event Hubs in particular is very often slow at adding messages to the stream. Now, 300 ms is not slow in 99.99% of cases, but in my case it is.

Storage Queues are super cheap, fast and easy, but slow as hell at adding messages. After hours of benchmarks and checking other solutions like Redis Pub/Sub, I ended up using Storage Queues and simply not awaiting the async call.

So the standard call is

    await queue.AddMessageAsync(message);

and the await is the problem: Web API cannot return until the task completes. It should be fire-and-forget, but it is not.

I resolved the matter by not awaiting the call, assigning the task to a variable to hide the compiler warning:

    var nowait = queue.AddMessageAsync(message);

The insert into the queue is immediate in any situation, and no messages are lost.
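A hedged variation on the same idea: instead of parking the task in an unused variable, a small extension method can attach a continuation that observes any failure, so a faulted send is at least logged. `FireAndForget` and `Forget` are hypothetical names, and this sketch covers only observing exceptions, nothing else.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class FireAndForget
{
    // Does not wait for the task. The continuation runs only if the task
    // faults, and reading t.Exception marks the exception as observed.
    public static void Forget(this Task task, Action<Exception> onError)
    {
        task.ContinueWith(
            t => onError(t.Exception.GetBaseException()),
            CancellationToken.None,
            TaskContinuationOptions.OnlyOnFaulted | TaskContinuationOptions.ExecuteSynchronously,
            TaskScheduler.Default);
    }
}
```

The call site would then read `queue.AddMessageAsync(message).Forget(ex => Trace.TraceError(ex.ToString()));`, which still returns immediately but no longer hides a failed insert.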

Francesco Cristallo
  • 1
    Firing and forgetting would certainly cause things to return quicker. Don't forget to set your process not to crash if unobserved exceptions are garbage collected. – cacsar Feb 09 '16 at 11:04