
I understand that Azure Service Bus has a duplicate message detection feature which will remove messages it believes are duplicates of other messages. I'd like to use this feature to help protect against some duplicate delivery.

What I'm curious about is how the service determines two messages are actually duplicates:

  • What properties of the message are considered?
  • Is the content of the message considered?
  • If I send two messages with the same content, but different message properties, are they considered duplicates?
Paul Turner

3 Answers


Duplicate detection looks at the MessageId property of the brokered message. If you set MessageId to a value that is unique per logical message, duplicate detection can catch a repeat of that message. As far as I know, only the MessageId is used for detection. The contents of the message are NOT looked at, so two messages with the same content but different MessageIds will not be detected as duplicates.

References:

MSDN Documentation: https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-queues-topics-subscriptions

If the scenario cannot tolerate duplicate processing, then additional logic is required in the application to detect duplicates which can be achieved based upon the MessageId property of the message which will remain constant across delivery attempts. This is known as Exactly Once processing.

There is also a Brokered Message Duplication Detection code sample on WindowsAzure.com that should be exactly what you are looking for as far as proving it out.

I also quickly tested this: I sent five messages to a queue with RequiresDuplicateDetection set to true, all with exactly the same content but different MessageIds, and retrieved all five. I then did the reverse, sending messages with matching MessageIds but different payloads, and only one message was retrieved.
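A rough sketch of how that experiment could be reproduced, assuming the older `Microsoft.ServiceBus.Messaging` client (the one that provides `BrokeredMessage`). The queue name `dedup-demo`, the `SERVICEBUS_CONNECTION_STRING` environment variable, and the `Drain` helper are placeholders of mine, not part of the original test, and the queue is assumed to already exist, to be empty, and to have RequiresDuplicateDetection enabled.

```csharp
using System;
using Microsoft.ServiceBus.Messaging;

class DuplicateDetectionExperiment
{
    static void Main()
    {
        // Assumption: the connection string is supplied via an environment variable.
        string connectionString = Environment.GetEnvironmentVariable("SERVICEBUS_CONNECTION_STRING");
        var client = QueueClient.CreateFromConnectionString(connectionString, "dedup-demo");

        // Case 1: identical body, a fresh MessageId each time.
        for (int i = 0; i < 5; i++)
        {
            client.Send(new BrokeredMessage("same body") { MessageId = Guid.NewGuid().ToString() });
        }
        Console.WriteLine("Unique ids, messages received: " + Drain(client));

        // Case 2: different bodies, one shared MessageId.
        // "fixed-id" must not have been used on this queue within the detection window.
        for (int i = 0; i < 5; i++)
        {
            client.Send(new BrokeredMessage("body " + i) { MessageId = "fixed-id" });
        }
        Console.WriteLine("Shared id, messages received: " + Drain(client));

        client.Close();
    }

    // Hypothetical helper: receives and completes messages until the queue stays empty for 5 seconds.
    static int Drain(QueueClient client)
    {
        int count = 0;
        BrokeredMessage message;
        while ((message = client.Receive(TimeSpan.FromSeconds(5))) != null)
        {
            message.Complete();
            count++;
        }
        return count;
    }
}
```

If detection really is keyed on MessageId alone, the first count should match the number sent and the second should be one, which is what I saw in my test.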

A.Rowan
MikeWo
  • Do you have any references that would support your answer? I struggled to find anything concrete - a code example would suffice. – Paul Turner Nov 15 '13 at 17:59
  • Sure: I'll add detail to the answer. – MikeWo Nov 15 '13 at 18:11
  • When detecting duplicates, does the queue keep the oldest message or the newest message? – CodeGrue Jul 21 '14 at 20:44
  • @Codegrue It simply ignores the second or subsequent messages. Only the first is kept. Depending on the length of the duplicate message history window and the length of the queue, the first message may indeed have already been processed. – MikeWo Aug 03 '14 at 15:50
  • @MikeWo What if I want to check for duplicates using the `message text` rather than the `message Id`? Can anyone provide a link to a sample for that? – Neo Feb 25 '15 at 06:26
  • The Duplicate Message Detection feature on Service Bus does not provide a mechanism to check based on message text. You'd have to deal with that on your own. – MikeWo Feb 25 '15 at 14:04
  • Make a hash of the message text and assign it as the message id. – Zygimantas Apr 08 '15 at 14:11
  • If readers follow the advice to use hash codes, it could lead to serious bugs. Different messages can and do produce the same hash code, and therefore non-duplicate messages will be lost. – Daniel Dyson Jan 02 '16 at 20:21
  • This was a sneaky one for me. All of a sudden our retry logic wasn't working and it seemed like there weren't any errors - it was as if Azure wasn't sending them to us. Turns out, they were ignoring them! – Cody Sep 07 '16 at 15:12
  • So does anyone know if this works when a message is scheduled in the future? It's not working in our implementation and I'm trying to work out why. – Tom Mar 08 '17 at 10:43
  • Does anyone know how to do this in Java using the JMS API? I've run into the same issue as the unanswered question [here](https://stackoverflow.com/questions/63314419/azureservicebus-how-to-set-messageid-using-jms-in-azure-service-bus). I also tried doing `message.setStringProperty("MessageId", dupAwareMessageId);` without any success. – HeatZync May 18 '22 at 14:40

In my case I had to set ScheduledEnqueueTimeUtc in addition to MessageId, because most of the time the first message had already been picked up by a worker before the subsequent duplicate messages arrived in the queue. Setting ScheduledEnqueueTimeUtc tells Service Bus to hold on to the message for some time before letting a worker pick it up.

            var message = new BrokeredMessage(json)
            {
                MessageId = GetMessageId(input, extra)
            };

            // Delay delivery by 30 seconds
            // so that the duplicate detection engine has enough time to reject duplicate messages
            message.ScheduledEnqueueTimeUtc = DateTime.UtcNow.AddSeconds(30);

Another important property to consider alongside the 'RequiresDuplicateDetection' property of an Azure Service Bus entity is 'DuplicateDetectionHistoryTimeWindow': the time frame within which a message with a duplicate MessageId will be rejected.

The default duplicate detection history window is now 30 seconds; the value can range from 20 seconds to 7 days.
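A minimal sketch of setting both properties when creating a queue, assuming the older `Microsoft.ServiceBus` client library (the one `BrokeredMessage` belongs to). The queue name, the environment variable, and the 10-minute window are illustrative choices of mine; the window is set explicitly here rather than relying on whatever the default happens to be.

```csharp
using System;
using Microsoft.ServiceBus;
using Microsoft.ServiceBus.Messaging;

class CreateDedupQueue
{
    static void Main()
    {
        // Assumption: the connection string is supplied via an environment variable.
        string connectionString = Environment.GetEnvironmentVariable("SERVICEBUS_CONNECTION_STRING");
        const string queueName = "dedup-demo"; // placeholder name

        var namespaceManager = NamespaceManager.CreateFromConnectionString(connectionString);

        var description = new QueueDescription(queueName)
        {
            // Turn on MessageId-based duplicate detection for this queue.
            RequiresDuplicateDetection = true,
            // Remember MessageIds for 10 minutes (must be between 20 seconds and 7 days).
            DuplicateDetectionHistoryTimeWindow = TimeSpan.FromMinutes(10)
        };

        // Only create the queue if it is not already there.
        if (!namespaceManager.QueueExists(queueName))
        {
            namespaceManager.CreateQueue(description);
        }
    }
}
```

As far as I know, RequiresDuplicateDetection has to be chosen when the entity is created, so it is worth deciding on it up front.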

Enabling duplicate detection helps keep track of the application-controlled MessageId of all messages sent into a queue or topic during a specified time window. If any new message is sent carrying a MessageId that has already been logged during the time window, the message is reported as accepted (the send operation succeeds), but the newly sent message is instantly ignored and dropped. No other parts of the message other than the MessageId are considered.
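To make that behaviour concrete, here is a toy in-memory model of it. This is only an illustration of the rule described above, not how Service Bus is implemented; the class `DedupQueueModel` and its members are made up for this sketch, and it assumes a dropped duplicate does not extend the window of the original MessageId.

```csharp
using System;
using System.Collections.Generic;

// Toy model: remembers each MessageId for a fixed window and drops repeats.
class DedupQueueModel
{
    private readonly TimeSpan window;
    private readonly Dictionary<string, DateTime> seen = new Dictionary<string, DateTime>();

    // Bodies of the messages that were actually kept.
    public List<string> Accepted { get; } = new List<string>();

    public DedupQueueModel(TimeSpan window)
    {
        this.window = window;
    }

    // Returns true if the message was kept, false if it was dropped as a duplicate.
    // (The real service reports success to the sender in both cases.)
    public bool Send(string messageId, string body, DateTime nowUtc)
    {
        DateTime firstSeen;
        if (seen.TryGetValue(messageId, out firstSeen) && nowUtc - firstSeen < window)
        {
            return false; // same MessageId inside the window: body is never inspected
        }

        seen[messageId] = nowUtc;
        Accepted.Add(body);
        return true;
    }
}

class Program
{
    static void Main()
    {
        var queue = new DedupQueueModel(TimeSpan.FromSeconds(30));
        var t0 = new DateTime(2024, 1, 1, 0, 0, 0, DateTimeKind.Utc);

        Console.WriteLine(queue.Send("order-1", "payload A", t0));                // True: first time seen
        Console.WriteLine(queue.Send("order-1", "payload B", t0.AddSeconds(5)));  // False: same id, different body
        Console.WriteLine(queue.Send("order-2", "payload A", t0.AddSeconds(6)));  // True: same body, different id
        Console.WriteLine(queue.Send("order-1", "payload C", t0.AddSeconds(45))); // True: window has expired
    }
}
```

The second and third calls show the two cases from the question: the body plays no part in the decision, only the MessageId and the time window do.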

Ezhilarasi