32

So the scenario is that I'm using an SB queue to throttle outgoing callbacks to other services. One of the standard problems with calling back to other services is that they may be down for uncontrollable amounts of time. Assuming I detect that the target is down/not responding, what is the best pattern for abandoning that message so that it doesn't reappear on the queue immediately?

Here are some approaches I'm either aware of, have tried, or am considering:

  • Obviously if I just use BrokeredMessage::Abandon() the message will be unlocked and put back on the queue. This is obviously undesirable for this scenario and what I'm trying to avoid.

  • If I just ignore the error and never call Abandon, the message won't show up again immediately, but I don't have fine-grained control over how long it takes to reappear, and I would like to implement a decaying retry strategy.

  • I thought maybe I could call BrokeredMessage::Abandon(IDictionary<string, object>) and somehow update the ScheduledEnqueueTimeUtc property, but I have tried this and there doesn't seem to be a way to affect that property beyond the initial sending of the message. Makes sense, but I thought it was worth a try.

  • I have considered just using BrokeredMessage::Complete() in this situation and enqueueing a new copy of the message with the ScheduledEnqueueTimeUtc property set.

The final bullet almost seems too heavy handed, but I'm coming to the conclusion it's probably the right answer given the inherent nature of queues. I just figured there might be a nicer way to do this within Azure SB queues that I'm missing.

Drew Marsh

6 Answers

6

If you want to put a message away for a while and you have a place to write down the SequenceNumber (which might be in Session state in a sessionful queue), you can Defer() it. Deferred messages can be retrieved using a special Receive overload that takes the SequenceNumber; that's also the only way to get at them again other than them expiring, so be careful. This feature was built for workflows and state machines so that they can deal with out-of-order message arrival.

Clemens Vasters
  • 7
    Ok, so, tell me if this sounds right: we can Defer() the original message on the primary queue and put a "retry" message on secondary queue (probably using ScheduledEnqueueTimeUTC on that) which then has the SequenceNumber of the original message which I can then go back and grab via the special Receive() overload. – Drew Marsh Apr 30 '13 at 16:16
  • 3
    I'm confused what the secondary queue adds that isn't accomplished by your "heavy handed" solution of copying the message with a scheduled delivery time. The only problem I saw implementing that is that the delivery count is always reset to 1 so it doesn't make a good source to determine the input to the exponential backoff calculation. – Christopher Elliott Dec 30 '16 at 17:04
  • You cannot undefer a message, so it's pretty useless. If you defer it, then receive it by sequence number, abandon doesn't work. You're forced to complete the message. So you can't use deferral to set messages aside so you can selectively reprocess other messages with a normal, multi-threaded receiver. Once deferred, they're either permanently deferred or you have to 'complete' the message and lose it permanently. Terrible design. – Triynko May 25 '22 at 15:39
  • what if we don't want to go through the overhead of persisting the SequenceNumber. Exponential backoff or delayed retry is a pretty standard industry practice, have we implemented anything recently that saves us from having to copy the message, set retry headers manually. – Sameer Oct 26 '22 at 14:24
1

In bullet #2 above: Can't you just set the TTL timespan on the message when calling Receive(TimeSpan) instead of the default Receive()? Then, you can simply abandon the message (without calling Abandon()), and the message should reappear in the queue when the TTL expires. While this doesn't give you fine-grain control to say "Reappear after x seconds," it does give you predictability for when the message does reappear.

Note: With Storage-based queues, you can update the invisibility timeout, so that would give you fine-grain control for message re-appearance.

David Makogon
  • 1
    David, thanks for the suggestion. The problem with that is that we want to use an intelligent, decaying retry. So, for a simple example, if it's the first error maybe I just want to throw it back on the queue for 30s from now. Second 1m, third 5m, fourth 30m, fifth 1hr, etc. ad nauseam. To do that, we'd need to know how many times it had previously been received which we could track through DeliveryCount (or other custom prop), but if we haven't done a Receive() yet, we wouldn't have the property data to tell how long to set the TTL for. – Drew Marsh Apr 30 '13 at 15:50
1

Feels a bit hacky, but the solution I came up with is

try 
{
   ...
}
catch (Exception ex)
{
   await Task.Delay(30000);
   throw;
}

This way it will wait for 30 seconds before allowing it to abandon. It will eventually dead letter after the configured amount of times.

I am using Azure Webjobs for receiving. Although I am using Task.Delay instead of Thread.Sleep it doesn't seem to be freeing up the thread to process another item from the queue while it awaits (by default, Webjobs processes 16 in parallel).

0

If I were you, I would consult one of the many Enterprise Integration Patterns pages around the internet for a solution. Basically you want a retry which, if it fails repeatedly, sends the message to a dead letter queue. These messages can then be requeued at a later date. This could be manual or automated depending on requirements.

Please note that while the page I have sent you is related to Camel and therefore Java, everything described on that page is applicable to .NET and Azure. Here is a more .NET one if you're interested: http://www.eaipatterns.com/

Alistair
  • 1
    Alistair, thanks for the suggestion. I'll check the links out for sure. We were hoping to avoid separate queues/separate processing for this situation. We ultimately do dead-letter the messages after a certain number of processing attempts, we were just looking for a simpler way to spread out the retries until we reached the max. In the end I would think that requeueing the message to the same queue with a new ScheduledEnqueueTimeUTC value would probably buy us what we want in the SB case, I was just kinda hoping there was a cleaner way to do this in the API other than resending the message. – Drew Marsh Apr 30 '13 at 16:04
0

I would prefer the last approach because it seems to be the simplest solution using the built-in features of Azure Service Bus.

The flow is this:

var newMessage = new BrokeredMessage();
// Copy message body and properties from original message...

var scheduleTimeUtc = DateTimeOffset.UtcNow.Add(...);
await queueClient.ScheduleMessageAsync(newMessage, scheduleTimeUtc);

await originalMessage.CompleteAsync();

pberggreen
0

There is one issue with the message re-queuing solution (though that seems to be the best solution so far). It won't work well in a Topic/Multiple-Subscriber model, because the new message will also be delivered to the other subscribers even if they have already processed the original successfully.

For tracking the original Message Id and the proper delivery count, the Message.UserProperties can be used in the new message.
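
As a rough sketch of how that could look: since the delivery count resets when the message is copied, a custom counter can ride along in the properties and drive the decaying delay. Everything named here is an assumption for illustration: the "RetryCount" key and the RetryBackoff class are hypothetical and not part of the Service Bus API, and the schedule is just the 30s/1m/5m/30m/1hr example from the comments above. It takes plain IDictionary<string, object> arguments so it does not depend on any particular SDK version.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; none of these names come from the Service Bus SDK.
public static class RetryBackoff
{
    // Assumed custom property key used to carry the retry count between copies.
    public const string RetryCountKey = "RetryCount";

    // Example schedule from the thread: 30s, 1m, 5m, 30m, 1h.
    private static readonly TimeSpan[] Schedule =
    {
        TimeSpan.FromSeconds(30),
        TimeSpan.FromMinutes(1),
        TimeSpan.FromMinutes(5),
        TimeSpan.FromMinutes(30),
        TimeSpan.FromHours(1),
    };

    // Number of retries already recorded on the message (0 if the key is absent).
    public static int GetRetryCount(IDictionary<string, object> properties)
    {
        return properties.TryGetValue(RetryCountKey, out var value) && value is int count
            ? count
            : 0;
    }

    // Delay before the next attempt; stays at the last entry once the schedule runs out.
    public static TimeSpan DelayFor(int retryCount)
    {
        int index = Math.Min(Math.Max(retryCount, 0), Schedule.Length - 1);
        return Schedule[index];
    }

    // Writes the incremented count onto the new copy's properties and
    // returns the time at which the copy should be enqueued.
    public static DateTimeOffset PrepareRetry(
        IDictionary<string, object> originalProperties,
        IDictionary<string, object> newProperties,
        DateTimeOffset now)
    {
        int retryCount = GetRetryCount(originalProperties);
        newProperties[RetryCountKey] = retryCount + 1;
        return now + DelayFor(retryCount);
    }
}
```

The value returned by PrepareRetry would be what gets passed as the scheduled time when sending the copy, and the same counter can be checked against a maximum to decide when to dead-letter instead of retrying.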

CCRider