0

I am given X RabbitMQ queues. Some of them contain duplicate messages (for example, the same message is stored in both queue A and queue B).

I am trying to achieve one thing: process all the messages from the "input" queues (I made a consumer that connects to these queues), remove duplicate messages on the fly, and send the resulting data to one output queue.

What would be the fastest and most efficient way to do this?

As far as I know, the AMQP message_id property is optional, so I have to compare newly arrived messages against the ones already "seen" to achieve my goal.

Hashing message bodies came to mind, but as I am relatively new to algorithms, I am not sure which hash function to use or what to focus on.

2 Answers

1

I ended up hashing each message body with SHA-1 and storing the hashes of seen messages. Messages that have not been seen are forwarded to the result queue; messages already seen are discarded.
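
For illustration, here is a minimal Python sketch of that idea, not the exact code I used. The names make_deduplicator and is_new are made up for this example, the set of hashes lives in memory only (so it grows without bound and is lost on restart), and the RabbitMQ consumer itself is left out: a plain list stands in for the incoming message bodies.

```python
import hashlib


def make_deduplicator():
    """Return a function that reports whether a message body is new.

    Hypothetical helper for illustration: hashes are kept in an in-memory set.
    """
    seen = set()

    def is_new(body: bytes) -> bool:
        # SHA-1 digest of the raw body; identical bodies give identical digests.
        digest = hashlib.sha1(body).digest()
        if digest in seen:
            return False  # already seen: the caller should discard it
        seen.add(digest)
        return True  # first occurrence: the caller should forward it

    return is_new


is_new = make_deduplicator()

# Stand-in for bodies arriving from several input queues.
incoming = [b"order-1", b"order-2", b"order-1"]

# Only first occurrences would be published to the output queue.
forwarded = [body for body in incoming if is_new(body)]
```

In a real consumer, is_new would be called from the message callback before publishing to the result queue. If several consumers run in parallel, the set of hashes would have to live in some shared store instead of process memory.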

0

If possible, you can convert both messages to JSON and compare them. One post I came across on JSON comparison: How to compare two JSON objects with the same elements in a different order equal?
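
A rough Python sketch of what that comparison could look like, assuming the message bodies are valid JSON. The helper name canonical_json is made up for this example. It re-serializes each body with sorted keys so that two objects differing only in key order compare equal; the order of elements inside arrays still matters here.

```python
import json


def canonical_json(body: bytes) -> str:
    """Hypothetical helper: parse a JSON body and re-serialize it canonically."""
    # sort_keys makes object key order irrelevant; compact separators
    # remove whitespace differences between otherwise equal documents.
    return json.dumps(json.loads(body), sort_keys=True, separators=(",", ":"))


a = b'{"id": 1, "tags": ["x", "y"]}'
b = b'{"tags": ["x", "y"], "id": 1}'

# True when the two bodies differ only in key order or whitespace.
same = canonical_json(a) == canonical_json(b)
```

The canonical string can also be hashed instead of the raw body, so that differently formatted copies of the same JSON message are treated as duplicates.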