I need to choose a new Queue broker for my new project.

This time I need a scalable queue that supports pub/sub, and keeping message ordering is a must.

I read Alexis's comment, in which he writes:

"Indeed, we think RabbitMQ provides stronger ordering than Kafka"

I read the message ordering section in rabbitmq docs:

"Messages can be returned to the queue using AMQP methods that feature a requeue parameter (basic.recover, basic.reject and basic.nack), or due to a channel closing while holding unacknowledged messages...With release 2.7.0 and later it is still possible for individual consumers to observe messages out of order if the queue has multiple subscribers. This is due to the actions of other subscribers who may requeue messages. From the perspective of the queue the messages are always held in the publication order."

If I need to handle messages in order, can I only use RabbitMQ with an exclusive queue for each consumer?

Is RabbitMQ still considered a good solution for ordered message queuing?

durrrr
Bick

4 Answers


Well, let's take a closer look at the scenario you are describing above. I think it's important to paste the documentation immediately prior to the snippet in your question to provide context:

Section 4.7 of the AMQP 0-9-1 core specification explains the conditions under which ordering is guaranteed: messages published in one channel, passing through one exchange and one queue and one outgoing channel will be received in the same order that they were sent. RabbitMQ offers stronger guarantees since release 2.7.0.

Messages can be returned to the queue using AMQP methods that feature a requeue parameter (basic.recover, basic.reject and basic.nack), or due to a channel closing while holding unacknowledged messages. Any of these scenarios caused messages to be requeued at the back of the queue for RabbitMQ releases earlier than 2.7.0. From RabbitMQ release 2.7.0, messages are always held in the queue in publication order, even in the presence of requeueing or channel closure. (emphasis added)

So, it is clear that RabbitMQ, from 2.7.0 onward, is making a rather drastic improvement over the original AMQP specification with regard to message ordering.

With multiple (parallel) consumers, order of processing cannot be guaranteed.
The third paragraph (pasted in the question) goes on to give a disclaimer, which I will paraphrase: "if you have multiple processors in the queue, there is no longer a guarantee that messages will be processed in order." All they are saying here is that RabbitMQ cannot defy the laws of mathematics.

Consider a line of customers at a bank. This particular bank prides itself on helping customers in the order they came into the bank. Customers line up in a queue, and are served by the next of 3 available tellers.

This morning, it so happened that all three tellers became available at the same time, and the next 3 customers approached. Suddenly, the first of the three tellers became violently ill, and could not finish serving the first customer in the line. By the time this happened, teller 2 had finished with customer 2 and teller 3 had already begun to serve customer 3.

Now, one of two things can happen. (1) The first customer in line can go back to the head of the line or (2) the first customer can pre-empt the third customer, causing that teller to stop working on the third customer and start working on the first. This type of pre-emption logic is not supported by RabbitMQ, nor any other message broker that I'm aware of. In either case, the first customer actually does not end up getting helped first - the second customer does, being lucky enough to get a good, fast teller off the bat. The only way to guarantee customers are helped in order is to have one teller helping customers one at a time, which will cause major customer service issues for the bank.

With multiple consumers, it is not possible to ensure that messages get handled in order in every possible case. It doesn't matter whether you have multiple queues, multiple exclusive consumers, different brokers, etc.: there is no way to guarantee a priori that messages are handled in order with multiple consumers. RabbitMQ will, however, make a best effort.
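The teller scenario can be replayed as a small deterministic simulation. This is only a sketch using the Java standard library, not RabbitMQ client code: the `Deque` stands in for a queue that requeues at the head (the behaviour described above for 2.7.0 and later), and the class name, method name, and the fourth message are hypothetical additions for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical simulation of the teller scenario; not RabbitMQ client code.
class RequeueSimulation {

    /** Returns the order in which messages finish processing. */
    static List<Integer> completionOrder() {
        // Messages 1..4 sit in the queue in publication order.
        Deque<Integer> queue = new ArrayDeque<>(List.of(1, 2, 3, 4));
        List<Integer> completed = new ArrayList<>();

        // Three consumers each receive one message, in publication order.
        int heldByA = queue.pollFirst(); // message 1
        int heldByB = queue.pollFirst(); // message 2
        int heldByC = queue.pollFirst(); // message 3

        // Consumer B is fast and finishes message 2 first.
        completed.add(heldByB);

        // Consumer A fails; message 1 is requeued at the head of the queue,
        // so the queue itself is back in publication order: [1, 4].
        queue.addFirst(heldByA);

        // Consumer B is free, so it receives the requeued message 1.
        heldByB = queue.pollFirst();

        // Consumer C finishes message 3 while B is still working on message 1.
        completed.add(heldByC);
        completed.add(heldByB);

        // Whichever consumer is free next handles message 4.
        completed.add(queue.pollFirst());
        return completed;
    }

    public static void main(String[] args) {
        System.out.println(completionOrder()); // prints [2, 3, 1, 4]
    }
}
```

The queue never held the messages out of publication order, yet message 1 finished third. That is the gap between the broker's ordering guarantee and the processing order seen with several consumers.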

starball
theMayer
  • Is there a way to configure rabbit to requeue the messages at the end of the queue instead of the front? – Ryan Walls Mar 27 '14 at 06:15
  • Probably, but what are you trying to achieve and what is the importance of this? – theMayer Apr 24 '14 at 10:09
  • Thanks for pointing out the differences between RabbitMQ < 2.7.0 and >= 2.7.0, that saved my day :-) – Golo Roden Dec 17 '14 at 08:28
  • @Ryan: No you cannot. But there's a workaround: you can clone the message and publish it into the same queue, like a completely new message, then it will go to the end of the queue. In this case the attribute `redelivered` of the message will be `false` instead of `true` like a normal requeue. – Kien Nguyen Feb 05 '15 at 18:14
  • Kafka allows for parallelization with an app level defined partial order by the way of partitions, which is very practical for real tasks. RabbitMQ appears to either offer global order with no parallelization, or no order at all. Whose guarantees are better? ) – Andrey.Kozyrev May 12 '16 at 07:33
  • I'm not sure what is meant by "app-level defined partial order." This seems to be some type of partition lower than the queue level, which would not really make sense to do in Rabbit since a queue can be defined to hold whatever combination of messages make sense. Unless I'm misunderstanding what you mean by that. – theMayer Jun 01 '16 at 16:52
  • Sorry for coming back to a 4 year old answer. Whilst the order of messages through one channel, one exchange, one queue and one channel may be preserved, things can go wrong on the client. For instance the C# client library spawns a thread per received message. With my mean-spirited mind I say, so how does OS scheduling then interfere with the order of processing received messages? Surely there is in effect a race condition between two threads processing two closely timed messages? The client lib won't wait for the first thread to complete before spawning the next. Makes debug hard. Any clues? – bazza Feb 01 '18 at 18:21
  • @bazza - Ask as a new question and I'll give a stab at it :) – theMayer Feb 01 '18 at 19:00
  • @theMayer, thanks for the offer! I've actually gone off and read the manual a bit more, and realised that all the code samples I've seen implement `IBasicConsumer` (the callbacks for which are indeed called concurrently, hence the threads; I don't quite believe the manual's statement about processing order; the tasks might get launched in order, but then it is down to the OS scheduler). Looks like the "pull" API `channel.BasicGet()` is more like it. Now all I need to find is the equivalent of `select()`... – bazza Feb 01 '18 at 23:01
  • It's possible they re-wrote the client implementation. The one I used years ago was horrid, and I hacked it myself (not high enough quality to re-contribute). It would not surprise me if that's what they did because I did the same thing myself. Messaging is not designed for serial processing, plain and simple - so don't count on it when you design your app. – theMayer Feb 02 '18 at 04:52
  • @bazza - the other thing you could do is set auto-ack to false, or pre-fetch to 1. That will ensure serial delivery, though again, it's not a good idea to design serial requirements using a parallel platform structure. – theMayer Feb 02 '18 at 04:55
  • @theMayer It depends; some of the uses I've seen other devs use RabbitMQ for are simply as a tcp socket replacement with no hint of parallel messaging architectures. The other features that RabbitMQ provides (like durability, etc) are attractive all by themselves! Used like that (so not exercising the messaging patterns at all, which is almost a travesty) it's useful to have message order preserved. – bazza Feb 02 '18 at 07:42
  • @theMayer, and thank you for the tips on auto-ack and pre-fetch. – bazza Feb 02 '18 at 07:44
  • I’ll agree with that. At least you have solid rationale. – theMayer Feb 02 '18 at 13:44

Message ordering is preserved in Kafka, but only within partitions rather than globally. If your data need both global ordering and partitions, this does make things difficult. However, if you just need to make sure that all of the same events for the same user, etc... end up in the same partition so that they are properly ordered, you may do so. The producer is in charge of the partition that they write to, so if you are able to logically partition your data this may be preferable.
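As a rough illustration of that idea, the sketch below routes events to partitions by key, so all events for one user land in the same partition in arrival order. It uses only the Java standard library and is not the Kafka client API: `KeyPartitioner`, `partitionFor`, and `route` are hypothetical names, and `String.hashCode()` is a simplified stand-in for whatever hash function a real partitioner uses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical key-based partitioner; a simplified stand-in, not Kafka's implementation.
class KeyPartitioner {

    /** Maps a key to a partition index; the same key always maps to the same partition. */
    static int partitionFor(String key, int numPartitions) {
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    /**
     * Appends each {key, value} event to its key's partition.
     * Arrival order is preserved within each partition, not across partitions.
     */
    static List<List<String>> route(String[][] events, int numPartitions) {
        List<List<String>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (String[] event : events) {
            int p = partitionFor(event[0], numPartitions);
            partitions.get(p).add(event[0] + ":" + event[1]);
        }
        return partitions;
    }

    public static void main(String[] args) {
        String[][] events = {
            {"alice", "login"}, {"bob", "login"},
            {"alice", "purchase"}, {"bob", "logout"}, {"alice", "logout"}
        };
        List<List<String>> partitions = route(events, 3);
        for (int p = 0; p < partitions.size(); p++) {
            System.out.println("partition " + p + ": " + partitions.get(p));
        }
    }
}
```

Each partition can then be read by its own consumer: per-user order is kept, while different users are processed in parallel.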

tyler neely

I think this question mixes up two different things: consumption order and processing order.

Message queues can, to a degree, guarantee that messages are consumed in order. They cannot, however, guarantee the order in which the messages are processed.

The main difference here is that there are some aspects of message processing which cannot be determined at consumption time, for example:

  • As mentioned, a consumer can fail while processing. Here the message was consumed in the correct order, but the consumer failed to process it, which sends it back to the queue. At this point the consumption order is intact, but the processing order is not.

  • If by "processing" we mean that the message has been fully handled and discarded, then consider the case where processing time is not uniform, in other words, one message takes longer to process than another. For example, if message 3 takes longer to process than usual, then messages 4 and 5 might get consumed and finish processing before message 3 does.

So even if you managed to get the message back to the front of the queue (which by the way violates the consumption order) you still cannot guarantee they will also be processed in order.

If you want to process the messages in order:

  1. Have only 1 consumer instance at all times, or a main consumer and several stand-by consumers.
  2. Or don't use a messaging queue and do the processing in a synchronous blocking method, which might sound bad but in many cases and business requirements it is completely valid and sometimes even mission critical.
engma
  • This is accurate information, but there is no practical significance of "consumption order." Message *processing* is what results in a change of state in the system. As messages can be re-queued after "consumption" but ostensibly before "processing", all this deals with is the temporary state of the processor before it is done - which hopefully you don't care about. – theMayer Jan 30 '18 at 17:18
  • I would also add that if you go with option one, you are receiving events from RMQ in passive mode, and you use an event loop like in NodeJS, you need to use a single channel with a prefetch of 1. Otherwise you may end up with multiple messages in flight in parallel, which may be processed at different speeds. – Aalex Gabi Mar 27 '18 at 18:16
  • One way to solve this is to attach an order number to each message and have the consumer(s) keep track of messages and their order in a DB. If consumer A hasn't finished processing message-order-1 for entity-record-id-100, consumer B shouldn't start processing message-order-2 for entity-record-id-100. Consumer B should wait and retry in a loop. – Vikash Jul 28 '21 at 06:43
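The order-number idea from the last comment can be sketched as a small resequencer. Note that this is a variation, not what the comment describes literally: it keeps state in memory rather than in a DB, and it buffers early messages instead of retrying in a loop. The `Resequencer` class, its method, and the assumption that sequence numbers start at 1 per entity are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory resequencer; assumes per-entity sequence numbers starting at 1.
class Resequencer {
    // Next sequence number expected for each entity.
    private final Map<String, Integer> nextExpected = new HashMap<>();
    // Messages that arrived early, keyed by entity and then by sequence number.
    private final Map<String, TreeMap<Integer, String>> pending = new HashMap<>();

    /**
     * Accepts one message and returns the payloads that may now be processed,
     * in sequence order. Returns an empty list if the message arrived early
     * or if its sequence number was already released (a duplicate).
     */
    List<String> accept(String entityId, int seq, String payload) {
        int next = nextExpected.getOrDefault(entityId, 1);
        List<String> ready = new ArrayList<>();
        if (seq < next) {
            return ready; // already processed; ignore the duplicate
        }
        TreeMap<Integer, String> buffer = pending.computeIfAbsent(entityId, k -> new TreeMap<>());
        buffer.put(seq, payload);
        // Release every consecutive message starting at the expected number.
        while (buffer.containsKey(next)) {
            ready.add(buffer.remove(next));
            next++;
        }
        nextExpected.put(entityId, next);
        return ready;
    }

    public static void main(String[] args) {
        Resequencer r = new Resequencer();
        System.out.println(r.accept("100", 2, "second")); // prints []
        System.out.println(r.accept("100", 1, "first"));  // prints [first, second]
    }
}
```

With several consumer processes the state would have to live somewhere shared, which is where the DB in the comment comes in.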

There are proper ways to guarantee the order of messages within RabbitMQ subscriptions.

If you use multiple consumers, they will process messages using a shared ExecutorService. See also ConnectionFactory.setSharedExecutor(...). You could set it to Executors.newSingleThreadExecutor().
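The property being relied on here is that a single-thread executor runs tasks one at a time, in submission order. The sketch below shows only that property with the Java standard library; it does not touch the RabbitMQ client, and `SingleThreadOrdering` and `runInOrder` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo of executor ordering; not RabbitMQ client code.
class SingleThreadOrdering {

    /** Submits n tasks to a single-thread executor and returns the order in which they ran. */
    static List<Integer> runInOrder(int n) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        List<Integer> processed = Collections.synchronizedList(new ArrayList<>());
        for (int i = 0; i < n; i++) {
            final int id = i;
            // Tasks are queued FIFO and executed sequentially by the one worker thread.
            executor.execute(() -> processed.add(id));
        }
        executor.shutdown();
        try {
            if (!executor.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(runInOrder(5)); // prints [0, 1, 2, 3, 4]
    }
}
```

The trade-off is the one discussed in the other answers: ordering is bought by giving up parallel processing.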

If you use one Consumer with a single queue, you can bind this queue using multiple bindingKeys (they may have wildcards). The messages will be placed into the queue in the same order that they were received by the message broker.

For example you have a single publisher that publishes messages where the order is important:

try (Connection connection2 = factory.newConnection();
        Channel channel2 = connection2.createChannel()) {
    // publish messages alternating to two different topics
    for (int i = 0; i < messageCount; i++) {
        final String routingKey = i % 2 == 0 ? routingEven : routingOdd;
        channel2.basicPublish(exchange, routingKey, null, ("Hello" + i).getBytes(UTF_8));
    }
}

You now might want to receive messages from both topics in a queue in the same order that they were published:

// declare a queue for the consumer
final String queueName = channel.queueDeclare().getQueue();

// we bind to queue with the two different routingKeys
final String routingEven = "even";
final String routingOdd = "odd";
channel.queueBind(queueName, exchange, routingEven);
channel.queueBind(queueName, exchange, routingOdd);
channel.basicConsume(queueName, true, new DefaultConsumer(channel) { ... });

The Consumer will now receive the messages in the order that they were published, regardless of the fact that you used different topics.

There are some good 5-Minute Tutorials in the RabbitMQ documentation that might be helpful: https://www.rabbitmq.com/tutorials/tutorial-five-java.html

benez