How ZMQ works between 2 different machines

Question

I am not sure I understand fundamentally how ZMQ (or any message queue) knows how to communicate between two servers that otherwise don't know anything about each other.

For example, using the request/reply pattern:

the requester will bind to a host and port like so:

var requester = zmq.socket('req');
requester.bind('tcp://*:5555', function (err) {
    callback(err);
});

in another node.js process on another server, I use the connect function:

var replier = zmq.socket('rep');
replier.connect('tcp://127.0.0.1:5555', function (err) {
    callback(err);
});

But what I don't understand is how the requester knows where to send the message if the replier is in a different process on a totally different server pretty much anywhere?

Aren't you mixing up the requester and reply? looking at your code, the requester is listening and the reply is connecting — dmaij, May 07 '15 at 19:33

Jason · Accepted Answer · 2015-05-08T15:47:21.900

You already figured out one issue in the comments - connect() is synchronous, and you attempted to use it asynchronously.

In node.js, I would suggest that you both bind() and connect() synchronously (bindSync() for the bind). You get little benefit from running this one-time start-up action asynchronously, and doing it synchronously makes the code clearer. If you're building up and tearing down sockets mid-process, and you honestly have good reason to do so, then you can ignore this advice, but node gives you good reason to do this only once and use the same sockets for the lifetime of the process.

As for how two different servers can find each other, your example will fail:

// this tells the socket to listen to all incoming connections on port 5555
// but it does not create a connection to any other machine or process
requester.bindSync('tcp://*:5555');

... and on the other machine...

// this tells the socket to connect to a bound socket on the same machine
// it will not find a socket on another machine
replier.connect('tcp://127.0.0.1:5555');

So, you must either swap the bind() and connect() calls and point your requester at the other machine's address, as in the following:

// change 111.222.33.44 to the IP address or DNS name of your other machine
requester.connect('tcp://111.222.33.44:5555');

... and on the other machine...

replier.bindSync('tcp://*:5555');

... or, change your connect() call to specify the IP address of the first machine, rather than the loopback address.
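For illustration, here is a minimal sketch of the first option as two small functions. It assumes the `zmq` npm module is passed in as an argument; the function names `startReplier` and `startRequester` are hypothetical, and only `socket()`, `bindSync()`, `connect()`, `on('message')` and `send()` are taken from the module.

```javascript
// Hypothetical helpers; `zmq` is assumed to be the module from require('zmq').

// Run on the persistent machine (the "server"): listen on every interface.
function startReplier(zmq, port) {
  var replier = zmq.socket('rep');
  replier.bindSync('tcp://*:' + port);
  // A REP socket answers each request before it can receive the next one.
  replier.on('message', function (msg) {
    replier.send('reply to ' + msg.toString());
  });
  return replier;
}

// Run on the other machine (the "client"): point at the replier's address.
function startRequester(zmq, host, port) {
  var requester = zmq.socket('req');
  // connect() is synchronous, so no callback is passed here.
  requester.connect('tcp://' + host + ':' + port);
  return requester;
}
```

On the first machine you would call something like startReplier(require('zmq'), 5555), and on the second startRequester(require('zmq'), '111.222.33.44', 5555), replacing the address with that of the replier's machine.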

What follows is an evaluation of which side you should connect() or bind() on, since I feel the other advice is not complete.

It doesn't matter which side you bind() or connect() on, so long as you bind() on the persistent side (your "server") and connect() on the transient side (your "client"). If they are equally persistent, then the next way to choose is to bind() on the side that "owns" the data, the way a server does, and connect() on the side that "wants" the data, the way a client does.

This is why, traditionally, you'll bind() the REP socket and connect() the REQ socket, as the REQ socket "wants" the data it is requesting, and the REP socket "owns" the data it's sending back. Similarly, you'll bind() a PUB socket, since it "owns" the data it's publishing, and you'll connect() a SUB socket, since it "wants" the data it's subscribed to.

This is all just a rule of thumb. It's perfectly possible for a SUB socket to be more persistent than its companion PUB socket, or for a REQ socket to "own" the data it's sending to the REP socket. And in many cases, you can choose either side with zero consequences anyway, but these are useful rules to follow to have some clarity about what's going on.
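The rule of thumb can be written down as a small decision function. This is only a sketch: the name `chooseBindSide` and the `persistent` / `ownsData` fields are made up for illustration and are not part of any ZMQ API.

```javascript
// Hypothetical helper encoding the rule of thumb above.
// Each peer is described as { name, persistent: boolean, ownsData: boolean }.
function chooseBindSide(a, b) {
  // First criterion: the more persistent peer binds.
  if (a.persistent !== b.persistent) {
    return a.persistent ? a.name : b.name;
  }
  // Tie-breaker: the peer that "owns" the data binds.
  if (a.ownsData !== b.ownsData) {
    return a.ownsData ? a.name : b.name;
  }
  // Otherwise either side works; return the first one for determinism.
  return a.name;
}
```

For two equally persistent peers, a REP socket that owns the data and a REQ socket that wants it, this picks the REP side to bind(), matching the traditional arrangement.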

excellent answer. frankly, I think there should be an async version of connect( )...you can then be sure the other service/server is available — Alexander Mills, May 08 '15 at 17:23

score 1 · Answer 2 · answered May 07 '15 at 19:58

I think you are confused about which is the requester and which is the replier or, more appropriately, which one is the client and which one is the server.

The server is the side that binds, i.e. socket.bind('tcp://*:5555'). There is nothing magical about this. That asterisk does not mean multicast or discovering servers on some network. It means bind, i.e. listen, on all network devices of the machine.

The client is your probably misnamed reply.connect('tcp://127.0.0.1:5555 ....

The client knows where the server is, just like in HTTP. (Hint: it's on the same machine :)).
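To make the difference between the two address forms concrete, here is a small sketch that classifies the host part of a tcp:// endpoint string. The function `describeTcpEndpoint` and its `scope` labels are hypothetical and not part of ZeroMQ; it only inspects the string.

```javascript
// Hypothetical helper: classify the host part of a tcp://host:port endpoint.
function describeTcpEndpoint(endpoint) {
  var match = /^tcp:\/\/(.+):(\d+)$/.exec(endpoint);
  if (!match) {
    throw new Error('not a tcp://host:port endpoint: ' + endpoint);
  }
  var host = match[1];
  var port = Number(match[2]);
  var scope;
  if (host === '*') {
    scope = 'all-interfaces'; // used with bind(): listen on every network device
  } else if (host === 'localhost' || /^127\./.test(host)) {
    scope = 'same-machine';   // loopback: never leaves the local machine
  } else {
    scope = 'specific-host';  // a specific address or name
  }
  return { host: host, port: port, scope: scope };
}
```

With the addresses from the question, 'tcp://*:5555' comes out as all-interfaces and 'tcp://127.0.0.1:5555' as same-machine, which is why the connect() only works when both processes run on one machine.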

ZeroMQ is very simple. It is not even really a message queue. It does not have auto discovery or a central broker, so you have to do the bookkeeping of which is the server and which is the client yourself (e.g. the Majordomo pattern).

answered May 07 '15 at 19:58

Adam Gent

are you equating the server with the requester and the replier with the client? Or the reverse? – Alexander Mills May 07 '15 at 20:04
I renamed the socket vars to replier and requester – Alexander Mills May 07 '15 at 20:06
@AlexMills The bind is the server. You can clearly see even in the official documentation / code example that I linked to it is. The client is simply connecting to the server which is on the local machine. The server is listening on all network devices including the loopback (127.0.0.1). The client is simply connecting to the loopback. – Adam Gent May 07 '15 at 20:10
adam how can you explain this? http://stackoverflow.com/questions/6024003/why-doesnt-zeromq-work-on-localhost – Alexander Mills May 07 '15 at 21:05
my current hypothesis is that I need a zmq.context – Alexander Mills May 07 '15 at 21:16
I figured out the problem. socket.bind( ) seems to fire a callback, but socket.connect( ) was not firing a callback. This was messing up my code, since no callback was fired, my code just waited. – Alexander Mills May 07 '15 at 21:29
I am going to submit an issue on github, socket.connect should fire the callback if you pass it one. – Alexander Mills May 07 '15 at 21:31
also, perhaps with the pub/sub pattern socket.bind and socket.connect are the reverse of req/rep pattern, confusing to say the least – Alexander Mills May 07 '15 at 21:43
I'm curious what you are planning to use zeromq for. IMHO I see most people pick zeromq when they really should pick something else like accessing the data repository directly or using a real message queue or even just using plain HTTP. The question is: Do you really have a firehose of data coming in that you need to do all the legwork that zeromq requires? – Adam Gent May 08 '15 at 12:57
adam at first my office was going to use redis pubsub for this requirement, but a restriction made that impossible because a library called twemproxy doesn't support redis's pubsub features. ZMQ seemed like a good choice in lieu of the discovery that we couldn't use redis pubsub. the req/reply pattern is what we really need at the moment. I don't have enough background in this to say why ZMQ isn't a real message queue. – Alexander Mills May 08 '15 at 17:34
so, adam, the 2 primary paradigms that I discovered in my limited MQ research was that some message queues were distributed and some were centralized. I wanted one that was decentralized/distributed and ZMQ fits that bill. Redis pubsub of course is centralized. If i am not mistaken, this the most important way to differentiate MQs besides perhaps raw performance which seems to be a related problem. Please correct me if I am mistaken. – Alexander Mills May 08 '15 at 17:43
You're more or less correct. ZMQ is pretty much a "build your own MQ" toolset. There are more complete (and opinionated) out of the box options, to Adam's point, that are probably the right choice when you want what they offer, or close enough to it. I went down the ZMQ path for the same reason you've chosen it, the decentralized/distributed nature. – Jason May 08 '15 at 21:37
at this point we just need point2point communication, we don't need a broker. zmq I suppose can be considered brokerless if it's out of the box. having a broker and keeping that running 24/7 is actually more of a headache than p2p. ZMQ to the rescue. thanks for the info guys. I also didn't find ZMQ to be that difficult to install using homebrew for OSX or yum for CentOS. – Alexander Mills May 08 '15 at 23:51
Actually a broker is far easier to deal with since it basically provides automatic elasticity, discovery, load balancing, QoS and general consistency at the cost of partition weakness otob. But then again you're dealing with req/repl which has a whole bunch of other CAP issues that even a broker can't help with. Fire and forget is far easier than request/reply and is generally less resource intensive even for nonblocking platforms (you may not run out of threads but you will run out of something if the replies are too slow) – Adam Gent May 09 '15 at 13:01
It's easier if those are things you need, where the broker provides them. It's a specific, if broad, use case, but it's not universal, and trying to fit it to everything is an anti-pattern. – Jason May 11 '15 at 13:36
