How to avoid the same queue job being processed more than once when scaled across multiple dynos on Heroku

Question

We have a Node.js application running LoopBack whose main purpose is to process orders received from the client. Currently the entire order process is handled within the single HTTP request that places the order, including taking payment, inserting into the database and sending confirmation emails.

We are finding that this approach, whilst working at the moment, lacks scalability: as the application grows it will potentially need to process thousands of orders per minute. In addition, our order process currently writes only to our own database, but we are now looking at third-party integrations (till systems) whose speed and availability we cannot control.

We also have a potential race condition: we have to assign a 'short code' to each order for easy reference by the client. These need to rotate, so if the starting number is 1 and the maximum is 100, the 101st order must be assigned the number 1. At the moment we look at the previous order and either increment its reference by 1 or reset it to the start. This is fine at our current low traffic, but as we scale it could result in multiple orders being assigned the same reference number.
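For illustration, the rotation rule described above can be written as a pure function of a monotonically increasing sequence number, rather than derived from the previous order. This is only a sketch: `shortCodeFor` is a hypothetical helper, and it assumes the sequence number comes from a single atomic source (for example a database sequence or auto-increment id), which is what removes the read-then-increment race.

```javascript
// Hypothetical helper: map an order sequence number (1, 2, 3, ...)
// onto a rotating short code in the inclusive range [min, max].
function shortCodeFor(sequence, min = 1, max = 100) {
  if (!Number.isInteger(sequence) || sequence < 1) {
    throw new RangeError('sequence must be a positive integer');
  }
  const span = max - min + 1; // how many distinct codes exist
  return min + ((sequence - 1) % span);
}

console.log(shortCodeFor(1));   // 1
console.log(shortCodeFor(100)); // 100
console.log(shortCodeFor(101)); // 1 (wraps back to the start)
```

Because the code depends only on the sequence number, two orders processed concurrently cannot receive the same code unless they were issued the same sequence number.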

Therefore, we want to implement a queue to manage all of this. Our app is currently deployed on Heroku, where we already use a worker process for some of the monthly number crunching our app requires. Having read some of the Heroku articles on implementing a queue (https://devcenter.heroku.com/articles/asynchronous-web-worker-model-using-rabbitmq-in-node, https://devcenter.heroku.com/articles/background-jobs-queueing), it is still not clear how, over multiple worker dynos, we would control the order in which queued items are processed and ensure that the same job is not processed more than once by multiple dynos. The order of processing is not so important, but avoiding repetition is extremely important: if two orders are processed concurrently we run the risk of the above race condition.

So essentially my question is this: how do we avoid the same queue job being processed more than once when scaled across multiple dynos on Heroku?

andreaciri · Answer 1 · 2017-03-01T15:03:14.237

3

What you need is already provided by RabbitMQ, the message broker used by the CloudAMQP add-on for Heroku.

You don't need to worry about a race condition between multiple workers. A job placed onto the queue is stored until a consumer retrieves it. Once a worker has consumed a job from the queue, no other worker will be able to consume it. RabbitMQ manages all such aspects of the message queuing paradigm.
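A minimal sketch of such a worker, assuming the `amqplib` client and a JSON job with an `id` field. The queue name `orders`, the job shape, the `makeIdempotentHandler` helper and the in-memory `processed` set are all illustrative assumptions; `CLOUDAMQP_URL` is assumed to be the connection string configured by the add-on. With manual acknowledgements RabbitMQ can redeliver a message if a worker dies before acknowledging it, so the handler here also skips job ids it has already seen.

```javascript
const QUEUE = 'orders'; // hypothetical queue name

// Wrap a job function so that a redelivered job (same id) is skipped.
// The Set lives in process memory for illustration only; across several
// dynos this would need shared storage, e.g. a unique key in the database.
function makeIdempotentHandler(processOrder, processed = new Set()) {
  return async function handle(job) {
    if (processed.has(job.id)) return false; // duplicate delivery, skip it
    processed.add(job.id);
    try {
      await processOrder(job);
    } catch (err) {
      processed.delete(job.id); // allow a later retry of the failed job
      throw err;
    }
    return true;
  };
}

// Not invoked here: needs the amqplib package and a reachable broker.
async function startWorker(handle, url = process.env.CLOUDAMQP_URL) {
  const amqp = require('amqplib'); // required lazily so the sketch loads without it
  const conn = await amqp.connect(url);
  const ch = await conn.createChannel();
  await ch.assertQueue(QUEUE, { durable: true });
  await ch.prefetch(1); // at most one unacknowledged job per worker
  await ch.consume(QUEUE, async (msg) => {
    if (msg === null) return; // consumer was cancelled by the broker
    try {
      await handle(JSON.parse(msg.content.toString()));
      ch.ack(msg); // acknowledge only after the work succeeded
    } catch (err) {
      ch.nack(msg, false, true); // requeue so the job can be retried
    }
  }, { noAck: false });
}
```

Each worker dyno would call `startWorker(makeIdempotentHandler(processOrder))`, where `processOrder` stands in for the application's own order logic.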

A couple of links useful for your project:

edited Mar 01 '17 at 15:03

answered Mar 01 '17 at 13:58


See http://stackoverflow.com/questions/35946177/node-js-message-queue-on-heroku/36029289#36029289 for instructions on exactly how to set this up with node.js on Heroku. – Yoni Rabinovitch Mar 01 '17 at 16:26
This answer does not solve the "exactly once" delivery required by OP. – idbehold Mar 01 '17 at 18:09
@idbehold The queue ensures the "exactly once" delivery asked for. Tom, rotation of the 3 digit code is problematic with a decimal encoding: you can only have 999 theoretical open orders at a time. A slightly better option would be to use a different base encoding. For instance, if you used a base58 encoding (http://stackoverflow.com/a/18949941/673882), you'd increase that number to over 195,000 combinations using only three characters. If you mean those reference numbers are stored permanently to refer to the internal order number, that solution just isn't going to work. – Nathan Loyer Mar 08 '17 at 20:46
@NathanLoyer [RabbitMQ offers either "at-least-once" or "at-most-once" delivery](https://www.rabbitmq.com/reliability.html). No distributed messaging system can guarantee "exactly-once" delivery. – idbehold Mar 08 '17 at 21:40
@idbehold I see the distinction, but the original post ends with "So essentially my question is this; how do we avoid the same queue job being processed more than once when scaled across multiple dynos on Heroku?" It sounds like "at-most-once" is the closest practical solution given the inferred level of technical ability. If a guaranteed "exactly once" is mandatory, it seems that more of an architectural solution is needed. – Nathan Loyer Mar 08 '17 at 21:52
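The base58 suggestion in the comments above can be sketched as follows. The alphabet is the Bitcoin-style one, which is an assumption here and may differ from the one in the linked answer; `toBase58` is a hypothetical helper that encodes a non-negative integer as a fixed-width code.

```javascript
// Bitcoin-style base58 alphabet (omits 0, O, I and l to avoid look-alikes).
const ALPHABET = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz';

// Encode n as a fixed-width base58 string, left-padded with the zero digit '1'.
function toBase58(n, width = 3) {
  const base = ALPHABET.length;
  const capacity = base ** width; // number of distinct codes of this width
  if (!Number.isInteger(n) || n < 0 || n >= capacity) {
    throw new RangeError(`n must be an integer in [0, ${capacity - 1}]`);
  }
  let out = '';
  for (let i = 0; i < width; i++) {
    out = ALPHABET[n % base] + out; // least significant digit first, prepended
    n = Math.floor(n / base);
  }
  return out;
}

console.log(58 ** 3);          // 195112 distinct three-character codes
console.log(toBase58(0));      // "111"
console.log(toBase58(195111)); // "zzz"
```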
