I'm trying to design a robust architecture, but I'm having trouble solving message delivery. Let me try to explain.

The API would be clustered on ECS receiving a bunch of requests.

The workers would be clustered too, all subscribing to the same channels. (That's the problem: with only one worker there wouldn't be any issue.)

How do I deal with multiple workers while avoiding duplicated messages? What would be a good, simple approach that keeps many workers occupied?

Thank you.

  • Usually the message broker addresses this; in RabbitMQ, for example, if you have multiple consumers listening to a queue, they'll all receive messages but they'll usually process each message only once (unless there's a failure of some sort). – David Maze Jan 19 '21 at 11:42
  • Thank you David, I found this article interesting http://www.steves-internet-guide.com/mqttv5-shared-subscriptions/ .. I didn't know about shared subscriptions. I'm not sure yet if shared subscription is what I was looking for. – Jorge Solano Jan 19 '21 at 19:59

2 Answers

This sounds like a very fundamental problem for a message broker: one channel with multiple workers subscribed to it, all of them receiving the same message. Processing the same message multiple times wouldn't really be useful.

This problem has been addressed in most message brokers (I believe). For example, when you consume a message from an Amazon SQS queue, that message is hidden from other consumers for a set period (the visibility timeout).

Once the worker has processed the message, it has to delete it from the queue. Otherwise, when the timeout expires, other workers will see the message and process it again.
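
To make the receive / process / delete cycle concrete, here is a minimal in-memory sketch. It is a toy model of a visibility timeout, not the SQS API: the `ToyQueue` class, its method names, and the explicit `now` argument are all made up for illustration.

```python
import itertools


class ToyQueue:
    """In-memory model of a queue with a visibility timeout (hypothetical, not SQS)."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # message id -> [body, invisible_until]
        self._ids = itertools.count(1)

    def send(self, body):
        # New messages are visible immediately.
        self._messages[next(self._ids)] = [body, 0.0]

    def receive(self, now):
        """Return (id, body) of one visible message and hide it, or None."""
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self.visibility_timeout
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        """Acknowledge a message so it is never delivered again."""
        self._messages.pop(msg_id, None)


queue = ToyQueue(visibility_timeout=30.0)
queue.send("resize-image-17")

first = queue.receive(now=0.0)     # worker A takes the message
second = queue.receive(now=10.0)   # worker B gets nothing: the message is hidden
third = queue.receive(now=31.0)    # A never deleted it, so it becomes visible again
queue.delete(third[0])             # worker B finishes and deletes it
fourth = queue.receive(now=100.0)  # nothing left to deliver
```

The point of the sketch is the failure case: if worker A crashes or is too slow, the message is not lost, it simply reappears for another worker.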

SQS in particular has a distributed architecture, so a message is occasionally delivered more than once and may then be processed by different workers. That's the effect of the at-least-once delivery guarantee that standard SQS queues provide.

If your system has to be strict about duplicate messages, then you need to build a de-duplication mechanism around it.
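
One common shape for such a mechanism is an idempotent consumer: record the ID of every processed message and skip IDs you have already seen. The sketch below is only an illustration under assumptions: `DedupingWorker` and the message IDs are hypothetical, and the plain `set` stands in for a store shared by all workers (in a real cluster the check-and-record step would need to be atomic, for example via a unique constraint or a conditional write).

```python
class DedupingWorker:
    """Skips messages whose ID was already recorded in a shared store (hypothetical)."""

    def __init__(self, handler, seen_ids):
        self.handler = handler
        self.seen_ids = seen_ids  # stand-in for a store shared by every worker

    def handle(self, message_id, body):
        """Process the message once; return False if it was a duplicate."""
        if message_id in self.seen_ids:
            return False
        self.handler(body)
        self.seen_ids.add(message_id)
        return True


processed = []
seen = set()
worker_a = DedupingWorker(processed.append, seen)
worker_b = DedupingWorker(processed.append, seen)

r1 = worker_a.handle("msg-1", "charge order 42")  # first delivery: processed
r2 = worker_b.handle("msg-1", "charge order 42")  # duplicate delivery: skipped
```

Note that this records the ID after the handler runs, so a crash between the two steps can still cause a repeat; making the handler itself idempotent is the safer complement.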

Cosmin Ioniță

The keywords you are looking for are "exactly-once guarantee in a distributed system". With that you can do some research on your own, but here are some pointers.

You could use an event queue system that supports "exactly once" guarantees, for example Apache Pulsar (see this link) or Kafka, or you can use their approach as inspiration for your own implementation (which may be somewhat hard to do).

For your own implementation, you could write a special consumer that is the only consumer of the channel and acts as a distributor of worker tasks, with the job of guaranteeing "exactly once". It would be a tradeoff and could prove a bottleneck, depending on your scalability requirements. This article explains why it is a difficult problem in distributed systems.
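
A rough single-process sketch of the distributor idea, using threads in place of separate worker machines. Everything here (`run_distributor`, the worker names, the `None` sentinel) is made up for illustration, and it ignores crashes and retries, which is exactly where the real difficulty lies.

```python
import queue
import threading


def run_distributor(tasks, worker_count):
    """Hand every task to exactly one of `worker_count` worker threads."""
    work = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker(name):
        while True:
            task = work.get()
            if task is None:  # sentinel: no more work for this worker
                break
            with lock:
                results.append((name, task))

    threads = [
        threading.Thread(target=worker, args=(f"worker-{i}",))
        for i in range(worker_count)
    ]
    for t in threads:
        t.start()

    # The single distributor: only this loop reads the source of tasks.
    for task in tasks:
        work.put(task)
    for _ in threads:
        work.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return results


results = run_distributor(["a", "b", "c", "d", "e"], worker_count=3)
handled = sorted(task for _, task in results)
```

Because only one component reads the source, each task is handed out once; the cost is that every message passes through that one component.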

Oswin Noetzelmann