2

I code all my micro-services in Java. I want to use multiple consumers with Amazon SQS, but each consumer has multiple instances on AWS behind a load balancer.

I use SNS for the input stream.

I use SQS Standard Queue after SNS.


I found the same question on Stack Overflow (Using Amazon SQS with multiple consumers).

This example comes from:

https://aws.amazon.com/fr/blogs/aws/queues-and-notifications-now-best-friends/

When I read the SQS Standard Queue documentation, I see that occasionally more than one copy of a message is delivered.

Each message has a message_id. How can I make sure that multiple instances of the same micro-service do not process the same message, given that it may have been delivered multiple times? My idea is to register the message_id in a DynamoDB table, but if several instances of the same micro-service do this concurrently, how do I take a lock on the read (a bit like a SELECT FOR UPDATE)?

For example, multiple instances of the same micro-service "Scan Metadata".

Stéphane GRILLON
  • 11,140
  • 10
  • 85
  • 154

1 Answer

2

As you have mentioned, standard SQS queues can sometimes deliver the same message more than once. This is due to the distributed nature of the SQS service. Each message is stored on multiple servers for redundancy, and there is a chance that one of those servers is down when you call sqs:DeleteMessage. In that case the message will not be deleted from all of the servers, and once the failed server comes back online, it doesn't know that you have deleted the message, so it will be processed again.

The easiest way to solve the issue with duplicate messages is to switch to a FIFO queue, which provides you with exactly-once processing. You can choose deduplication based on either the message content or a unique ID generated by the sender. If you choose content-based deduplication and the queue receives two messages with the same content within the 5-minute deduplication interval, the second message will be discarded.

If two messages can have the same content yet need to be treated as different messages, you can use deduplication based on an ID that you pass to the sqs:SendMessage or sqs:SendMessageBatch calls via the MessageDeduplicationId argument.
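For example, sending to a FIFO queue with an explicit deduplication ID could look like this with the AWS SDK for Java v2 (the queue URL, group ID, and deduplication ID here are placeholders):

```java
import software.amazon.awssdk.services.sqs.SqsClient;
import software.amazon.awssdk.services.sqs.model.SendMessageRequest;

public class FifoSender {
    public static void main(String[] args) {
        // Placeholder URL -- FIFO queue names must end in ".fifo".
        String queueUrl = "https://sqs.eu-west-1.amazonaws.com/123456789012/scan-metadata.fifo";

        SendMessageRequest request = SendMessageRequest.builder()
                .queueUrl(queueUrl)
                .messageBody("{\"imageId\":\"42\"}")
                // Messages sharing a group id are delivered in order within the group.
                .messageGroupId("scan-metadata")
                // A second send with the same deduplication id within the
                // 5-minute deduplication interval is discarded by SQS.
                .messageDeduplicationId("image-42-v1")
                .build();

        try (SqsClient sqs = SqsClient.create()) {
            sqs.sendMessage(request);
        }
    }
}
```

Note that if the queue has ContentBasedDeduplication enabled, you can omit MessageDeduplicationId and SQS computes it as a SHA-256 hash of the message body.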

I would definitely check FIFO queues before thinking about using DynamoDB to store the state of message processing. It will be cheaper, and this deduplication functionality is provided for you by default without you having to implement any complex logic.

Matus Dubrava
  • 13,637
  • 2
  • 38
  • 54
  • Also, depending on the use case, the occasional message duplication can actually be fine. If, for example, an image is resized twice, it wastes a little bit of processing power, but this may be cheaper than deduplicating. – Norwæ Jul 24 '19 at 09:51
  • 1
    **Important: Amazon SNS isn't currently compatible with FIFO queues.** https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-subscribe-queue-sns-topic.html – Stéphane GRILLON Jul 24 '19 at 11:02
  • I wasn't aware of this limitation. In such case, I would still use FIFO queue but instead of using SNS to fan out those messages, I would probably do it in application - calling multiple `sqs:SendMessages`, one for each SQS endpoint. – Matus Dubrava Jul 24 '19 at 11:07
  • 1
    @MatusDubrava, If I still use the FIFO queue, but instead of using SNS to deploy these messages, I would probably do it in the application by calling multiple sqs, there would be a strong link between my micro services. **The purpose of using queues was precisely to remove the strong coupling between micro-service.** – Stéphane GRILLON Jul 24 '19 at 11:14
  • Look, I don't know of any direct solution, I am just discussing options. You could technically still use FIFO queue and achieve the same decoupling by sending those messages to just one FIFO queue, then have EC2 running a script that will read the messages from it and fan them out to multiple SQS FIFO queues. So basically you would replace SNS with custom fan out solution. – Matus Dubrava Jul 24 '19 at 11:23
  • *The purpose of using queues was precisely to remove the strong coupling between micro-service.* -- btw, those queues are still in use, you are just sending messages to multiple queues instead of just one so the architecture is still decoupled. Just not in the same way you can achieve with SNS fan out. – Matus Dubrava Jul 24 '19 at 11:29
  • **"SQS: Note that FIFO queues are not currently supported."** https://aws.amazon.com/sns/faqs/?nc1=h_ls – Stéphane GRILLON Jul 24 '19 at 11:29
  • I got that. Neither of those two previously mentioned approaches uses SNS. – Matus Dubrava Jul 24 '19 at 11:31
  • 1
    I write *"I use SQS Standard Queue after SNS"* and on the two previous schemas it is clear that it is SNS as entry point – Stéphane GRILLON Jul 24 '19 at 11:35
  • I am not talking about those schemas, but about proposed approaches. First is: Application -> multiple FIFO queues. Second is: Application -> SQS queue -> custom EC2 fan out solution -> multiple FIFO queues. Neither of them uses SNS. Both of them are using queues for decoupling. – Matus Dubrava Jul 24 '19 at 11:38
  • As you have posted 2 times already, you cannot use SNS with FIFO queues, so you can either design your own deduplication logic or you can take the route that I have proposed, which bypasses SNS. – Matus Dubrava Jul 24 '19 at 11:42
  • You could also write your own lambda function to transfer messages from SNS to FIFO queues. – Matthew Pope Jul 24 '19 at 17:00
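The application-level fan-out discussed in the comments above could be sketched as follows: the producer loops over a configured list of FIFO queue URLs instead of publishing to SNS. The URLs and group ID are placeholders; decoupling is preserved because producers only know queue URLs from configuration, not the consumers themselves:

```java
import java.util.List;
import software.amazon.awssdk.services.sqs.SqsClient;
import software.amazon.awssdk.services.sqs.model.SendMessageRequest;

public class FifoFanOut {
    // Hypothetical FIFO queue URLs, one per consuming micro-service.
    private static final List<String> QUEUE_URLS = List.of(
            "https://sqs.eu-west-1.amazonaws.com/123456789012/scan-metadata.fifo",
            "https://sqs.eu-west-1.amazonaws.com/123456789012/thumbnailer.fifo");

    /** Sends the same message to every consumer queue, replacing SNS fan-out. */
    public static void fanOut(SqsClient sqs, String body, String dedupId) {
        for (String url : QUEUE_URLS) {
            sqs.sendMessage(SendMessageRequest.builder()
                    .queueUrl(url)
                    .messageBody(body)
                    .messageGroupId("default")
                    .messageDeduplicationId(dedupId)
                    .build());
        }
    }
}
```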