I have requests coming in for different samples (s1, s2, ...) that must be processed serially per sample, i.e. only one s1 request can be processed at a time. I have N worker services that can process the requests. How can I implement an RPC-queue pattern so that the requests for each sample are consumed one at a time, while the calculations for different samples are still distributed across the workers?

I would like to implement this with RabbitMQ because of its simplicity and clustering capabilities, but I'm willing to consider other solutions as well.

Here is a picture to illustrate the problem (with two workers):

                               worker 1 
                            +-----------+
                            |           |
 input queue          +---->|           |-------+
+--------------+      |     |           |       |
|              |      |     +-----------+       |
| s1,s2,s1,s1  |------+                         |
|              |      |        worker 2         |
+--------------+      |     +-----------+       |
                      |     |           |       |
 output queue         +---->|           |-------+
+--------------+            |           |       |
|              |            +-----------+       |
|(s1,s2,s1,s1) |<-+                             |
|              |  +-----------------------------+
+--------------+
Fdr
  • Just add an extra layer of indirection, the solution to most problems :) Insert extra queues between the input queue and the workers that store only the type of object that must be processed serially. – Hans Passant Dec 30 '13 at 12:44
  • @HansPassant Can you expand on that idea a bit? So I would declare a new queue per sample? Can I use some RabbitMQ feature here, or should I implement a service that reads the input queue and routes the requests? – Fdr Dec 30 '13 at 15:01
  • Yes, no, yes :) Creating different message queues would make it really trivial. – Hans Passant Dec 30 '13 at 15:10
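
The indirection suggested in the comments can be sketched with nothing but the Python standard library. This is only an illustration under assumptions, not the commenter's actual design: instead of one queue per sample, it hashes each sample id onto one private queue per worker, so every request for a given sample lands in the same queue and is handled in arrival order by a single worker. The names `shard_for`, `worker`, `run`, and `NUM_WORKERS` are hypothetical, and the "calculation" is a placeholder. With RabbitMQ, a comparable layout would presumably use a direct exchange with the sample (or shard) id as the routing key and one single-consumer queue per key.

```python
import queue
import threading
import zlib

NUM_WORKERS = 2  # assumption: matches the two workers in the picture


def shard_for(sample_id: str, num_workers: int) -> int:
    """Map a sample id to a worker index deterministically."""
    return zlib.crc32(sample_id.encode("utf-8")) % num_workers


def worker(inbox: queue.Queue, output: queue.Queue, name: str) -> None:
    """Consume one private queue; its requests are handled strictly in order."""
    while True:
        request = inbox.get()
        if request is None:  # sentinel: no more work for this worker
            break
        sample_id, seq = request
        # Placeholder for the real calculation: just record who handled what.
        output.put((sample_id, seq, name))


def run(requests):
    """Route (sample_id, seq) requests to per-worker queues and collect results."""
    inboxes = [queue.Queue() for _ in range(NUM_WORKERS)]
    output = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(inboxes[i], output, f"worker-{i + 1}"))
        for i in range(NUM_WORKERS)
    ]
    for t in threads:
        t.start()
    # Router: the extra layer between the input queue and the workers.
    for request in requests:
        inboxes[shard_for(request[0], NUM_WORKERS)].put(request)
    for inbox in inboxes:
        inbox.put(None)
    for t in threads:
        t.join()
    results = []
    while not output.empty():
        results.append(output.get())
    return results


if __name__ == "__main__":
    incoming = [("s1", 0), ("s2", 0), ("s1", 1), ("s1", 2)]
    for sample_id, seq, handled_by in run(incoming):
        print(sample_id, seq, handled_by)
```

The trade-off of hashing is that two busy samples may end up on the same worker while another worker sits idle; a queue per sample avoids that, at the cost of having to manage many queues and to ensure that each one has only one active consumer.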

2 Answers

This is trivial task-queue processing, provided the tasks for different samples do not interfere with one another.

The ZeroMQ guide discusses this case as well as somewhat more complex setups. See its formal behaviour model for Divide and Conquer (the simpler case) at http://zguide.zeromq.org/page:all#Divide-and-Conquer (figures courtesy of ZeroMQ/iMatix).

Just for inspiration, the guide also shows a slightly more complex, extended approach with a SIG_KILL add-on.


N.B.: I have no affiliation with ZeroMQ or with iMatix. However, after many projects that work well thanks in part to the ZeroMQ abstraction and architecture, my opinion is that high-performance, scalable, low-latency distributed systems can only benefit from it.

user3666197

Hey, have you checked out https://storm.incubator.apache.org? It is implemented mainly in Clojure and Java, I believe, but it can run components written in other languages, including Python.

Iron.io can host your queues and run distributed worker patterns on our platform in any language. IronWorker is also backed by a task queue, which makes this fairly easy.

Hotel Tonight used the term ETL (extract, transform, load) for passing and transforming data through a pipeline:

http://engineering.hoteltonight.com/ruby-etl-with-ironworker-and-redshift

(I work for Iron.io; I just wanted to put some resources out there.)

Stephen Nguyen