
I was trying to use ZMQ PUSH-PULL to build a distributed task processing system. This was easy to do using JMS in Java with a queue and listeners: listeners which are free take a message off the queue and execute it. Once the queue is distributed across nodes, this acts like a load balancer.

With ZMQ (using Python; I don't want to use Celery for now), I was trying out PUSH and PULL with workers that have different processing times. However, tasks are handed out in strictly round-robin fashion: each task goes to the next worker in turn, irrespective of whether that worker is free or not.

Is there any way of simulating a distributed queue with ZMQ patterns, so that I can have a pool of workers 'polling' the queue on each node, and whichever worker is free pulls the next message from the queue and processes it?

Alex Punnen
  • Yes. There are many examples of exactly what you're looking for [in the guide](http://zguide.zeromq.org/page:all), I highly suggest you read it. – Jason Jul 07 '15 at 14:58
  • I have been reading it for some days now and trying things out (see https://github.com/alexcpn/DisProcessor too), and it is working. The problem is the blind round robin, that is, the load balancing. – Alex Punnen Jul 08 '15 at 06:48
  • You don't want round robin at all, you want a free worker "requesting" work when it's ready, and the broker "replying" to the first worker that requested work when it has some... make sure you're reading [chapter 4, reliable request/reply patterns](http://zguide.zeromq.org/page:all#reliable-request-reply), start with the [simple pirate pattern](http://zguide.zeromq.org/page:all#Basic-Reliable-Queuing-Simple-Pirate-Pattern) and keep reading from there until you see the pattern that most closely fits your scenario. – Jason Jul 08 '15 at 14:36

1 Answer


As pointed out by 0MQ founder Pieter Hintjens in this answer, the PUSH-PULL mechanism is not a load balancer, but rather a simple round robin distributor. That's a typo in the docs that is still there.
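To see the round-robin behaviour for yourself, here is a minimal sketch using pyzmq. The endpoint name `inproc://tasks` and the function name `round_robin_demo` are made up for the illustration, and everything runs in one thread only to keep the demo short. All tasks are pushed before any "worker" reads anything, yet they are still dealt out evenly, which shows that PUSH does not look at whether a peer is busy.

```python
import zmq


def round_robin_demo(num_workers, num_tasks):
    """Push tasks to several PULL sockets and return what each one received."""
    ctx = zmq.Context()
    push = ctx.socket(zmq.PUSH)
    push.bind("inproc://tasks")  # hypothetical endpoint name

    # Connect every PULL socket before the first task is sent,
    # so that all of them take part in the distribution.
    pulls = []
    for _ in range(num_workers):
        sock = ctx.socket(zmq.PULL)
        sock.connect("inproc://tasks")
        pulls.append(sock)

    # Nobody is receiving yet: PUSH hands the tasks out anyway.
    for i in range(num_tasks):
        push.send(f"task{i}".encode())

    received = []
    for sock in pulls:
        got = []
        # Drain whatever was queued for this socket (100 ms poll timeout).
        while sock.poll(100):
            got.append(sock.recv())
        received.append(got)

    for sock in pulls:
        sock.close()
    push.close()
    ctx.term()
    return received


if __name__ == "__main__":
    for index, tasks in enumerate(round_robin_demo(2, 6)):
        print(index, [t.decode() for t in tasks])
```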

That said, for the load balancing pattern you need to add a broker in the middle of your architecture. As pointed out by Jason in the comments, this is well explained in the official guide. There are also examples in Python.

[Image: diagram of the load balancing broker]

The main idea is to have the workers send a small "READY" message to the broker whenever they are free to receive more jobs. The broker, in turn, keeps "pointers" to free workers in a queue. When it receives a new job request from a client, it forwards the request to the first free worker in the queue and pops that worker from the queue. As you can see in the picture above, the broker uses ROUTER sockets in order to avoid blocking behavior and to get proper load balancing. A small additional detail is that the broker does not poll the clients when there are no free workers in the queue.
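The idea above can be sketched roughly as follows with pyzmq. This is a simplified adaptation of the load balancing broker from the guide, not a copy of it: it assumes threads and `inproc` endpoints instead of separate processes, a single DEALER client that submits all jobs up front, and a `STOP` message for shutdown. The endpoint names, the `STOP` message and the function names are all made up for the illustration.

```python
import threading
import time

import zmq

FRONTEND = "inproc://lb-frontend"  # hypothetical endpoint names
BACKEND = "inproc://lb-backend"


def worker(ctx, name, delay, log):
    """REQ worker: announce READY, then handle one job at a time."""
    sock = ctx.socket(zmq.REQ)
    sock.setsockopt(zmq.IDENTITY, name)
    sock.connect(BACKEND)
    sock.send(b"READY")
    while True:
        frames = sock.recv_multipart()
        if frames == [b"STOP"]:  # shutdown message, an assumption of this sketch
            break
        client_id, _, job = frames
        time.sleep(delay)  # simulate uneven processing time
        log.append((name, job))
        # The reply doubles as "I am free again" for the broker.
        sock.send_multipart([client_id, b"", b"done:" + job])
    sock.close()


def client(ctx, jobs, replies):
    """DEALER client: submit every job up front, then collect the replies."""
    sock = ctx.socket(zmq.DEALER)
    sock.connect(FRONTEND)
    for job in jobs:
        sock.send_multipart([b"", job])
    for _ in jobs:
        _, reply = sock.recv_multipart()
        replies.append(reply)
    sock.close()


def run_demo(jobs, delays):
    """Run the broker until all jobs are answered; return (replies, log)."""
    ctx = zmq.Context()
    frontend = ctx.socket(zmq.ROUTER)
    backend = ctx.socket(zmq.ROUTER)
    frontend.bind(FRONTEND)
    backend.bind(BACKEND)

    log, replies = [], []
    threads = [threading.Thread(target=worker, args=(ctx, name, delay, log))
               for name, delay in delays.items()]
    threads.append(threading.Thread(target=client, args=(ctx, jobs, replies)))
    for thread in threads:
        thread.start()

    free_workers = []  # queue of identities of idle workers
    pending = len(jobs)
    poller = zmq.Poller()
    poller.register(backend, zmq.POLLIN)  # clients are polled only when a worker is free
    while pending or len(free_workers) < len(delays):
        events = dict(poller.poll())
        if backend in events:
            message = backend.recv_multipart()
            worker_id, payload = message[0], message[2:]
            if not free_workers:
                poller.register(frontend, zmq.POLLIN)
            free_workers.append(worker_id)
            if payload != [b"READY"]:
                client_id, _, reply = payload
                frontend.send_multipart([client_id, b"", reply])
                pending -= 1
        if frontend in events:
            client_id, _, job = frontend.recv_multipart()
            worker_id = free_workers.pop(0)  # first free worker gets the job
            backend.send_multipart([worker_id, b"", client_id, b"", job])
            if not free_workers:
                poller.unregister(frontend)

    for worker_id in free_workers:
        backend.send_multipart([worker_id, b"", b"STOP"])
    for thread in threads:
        thread.join()
    frontend.close()
    backend.close()
    ctx.term()
    return replies, log


if __name__ == "__main__":
    demo_jobs = [f"job{i}".encode() for i in range(8)]
    _, handled = run_demo(demo_jobs, {b"fast": 0.01, b"slow": 0.2})
    for name, job in handled:
        print(name.decode(), job.decode())
```

With one fast and one slow worker, the fast one should normally end up handling most of the jobs, because a job is only handed to a worker that has reported itself free. The exact split depends on timing.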

This is the simplest way I am aware of for implementing a load balancing pattern with ZeroMQ. It is not exactly like "polling" for new jobs in the queue, but I think this is what you need. Also please beware that this is really the simplest way: as is, it is not reliable at all and it does not scale well. If you also need reliability, I suggest you thoroughly read Chapter 4 of the official guide.

As a side note, maybe you should seriously consider Celery for this task. I am really in love with ZeroMQ, however this is exactly the kind of thing that Celery is very good at, and in my opinion it is not as difficult to learn as one might think.

lec00q
  • I have also moved on to Celery for the prototype. I can't comment on the answer right now as I have forgotten the ZeroMQ parts; I need to check and get back to you. – Alex Punnen Jan 20 '16 at 13:36
  • Dear @AlexPunnen, have you ever checked it again? Happy to help anyway. Good luck! – lec00q Mar 29 '17 at 07:41