
I'm using ZeroMQ with the NACK-Oriented Reliable Multicast (NORM) norm:// protocol. The documentation contains only a Python example, so here is my C++ code:


PUB Sender :

string sendHost         = "norm://2,127.0.0.1:5556";// <NormNodeId>,<addr:port>
string tag              = "MyTag";
string sentMessage      = "HelloWorld";
string fullMessage      = tag + sentMessage;

zmq::context_t *context = new zmq::context_t( 20 );

zmq::socket_t publisher( *context, ZMQ_PUB );
zmq_connect(  publisher, sendHost.c_str() );

zmq_send(     publisher,
              fullMessage.c_str(),
              fullMessage.size(),
              0
              );

SUB Receiver :

char   message[256];
string receiveHost      = "norm://1,127.0.0.1:5556";// <NormNodeId>,<addr:port>
string tag              = "MyTag";

zmq::context_t *context = new zmq::context_t( 20 );

zmq::socket_t   subscriber( *context, ZMQ_SUB );
zmq_bind(       subscriber, receiveHost.c_str() );
zmq_setsockopt( subscriber, ZMQ_SUBSCRIBE, tag.c_str(), tag.size() );

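// zmq_recv() returns the message size in bytes, or -1 on failure
int bytesReceived = zmq_recv( subscriber,
                              message,
                              255,          // leave room for the terminating '\0'
                              0
                              );

// zmq_recv() does not null-terminate, and may report more bytes than were stored
message[ bytesReceived < 0 ? 0 : ( bytesReceived < 255 ? bytesReceived : 255 ) ] = '\0';

cout << bytesReceived << endl;
cout << message << endl;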

The problem I'm facing is that, according to the documentation, .bind() and .connect() are interchangeable.

In my case they both do a .bind(), which causes ZeroMQ to throw an error saying the second bind fails with an "Address already in use" error.

user229044
Said A. Sryheni

1 Answer

... they both do a bind, which causes ZeroMQ to throw an error saying the second bind fails

Yes, failing here is the correct behaviour.

The first .bind() "takes ownership" of the port and this is an exclusive role.
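To see that exclusivity outside of ZeroMQ, here is a minimal sketch using plain UDP sockets (NORM runs over UDP). It is not ZeroMQ or NORM code, and the helper names `bind_udp_loopback` and `bound_port` are made up for this illustration. It assumes a POSIX system and that neither socket sets SO_REUSEADDR or SO_REUSEPORT.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cerrno>
#include <cstdint>

// Hypothetical helper: open a UDP socket and try to bind it to 127.0.0.1:<port>.
// Returns the socket descriptor on success, or -errno on failure.
int bind_udp_loopback( uint16_t port )
{
    int fd = socket( AF_INET, SOCK_DGRAM, 0 );
    if ( fd < 0 ) return -errno;

    sockaddr_in addr{};
    addr.sin_family      = AF_INET;
    addr.sin_port        = htons( port );
    addr.sin_addr.s_addr = htonl( INADDR_LOOPBACK );

    if ( bind( fd, reinterpret_cast<sockaddr*>( &addr ), sizeof( addr ) ) < 0 ) {
        int err = errno;            // save errno before close() can overwrite it
        close( fd );
        return -err;
    }
    return fd;
}

// Hypothetical helper: report the port the kernel assigned to a bound socket
// (useful after binding to port 0). Returns 0 on failure.
uint16_t bound_port( int fd )
{
    sockaddr_in addr{};
    socklen_t   len = sizeof( addr );
    if ( getsockname( fd, reinterpret_cast<sockaddr*>( &addr ), &len ) < 0 ) return 0;
    return ntohs( addr.sin_port );
}
```

Calling `bind_udp_loopback( 0 )` once and then `bind_udp_loopback( bound_port( first ) )` should make the second call return `-EADDRINUSE`, which is the same "Address already in use" condition reported in the question.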

The interchangeability of { .bind() | .connect() } is to be understood so that it does not matter which side .bind()-s and which one .connect()-s.

Until now, I have not seen anyone read this property as meaning that both sides could .connect() ( to an Access Point that nobody has .bind()-exposed ), let alone that both could .bind() an already "occupied" port ( when both reside on the same localhost ). If the two sides sit on different hosts and each of them .bind()-s, each ends up with a .connect()-ready port, and both then remain in silent solitude ( forever ), as there is ( and will be ) no attempt to make any .connect()-ion go live and operational.

No, you need just one .bind(), which from that moment on can handle 0+ future .connect()-requests arriving to establish a live PUB/SUB channel, for any respective <transport-class> protocol, including the newly added norm://.
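The "one .bind(), any number of .connect()-s" pattern can be sketched with plain TCP sockets. This is only an analogy on loopback, not ZeroMQ or norm:// code; the helper names `listen_loopback` and `connect_many` are made up for this illustration and a POSIX system is assumed.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdint>
#include <vector>

// Hypothetical helper, the single "bind" side: listen on 127.0.0.1 on a port
// chosen by the kernel. Returns the listening descriptor (or -1) and stores
// the chosen port in `port`.
int listen_loopback( uint16_t& port )
{
    int fd = socket( AF_INET, SOCK_STREAM, 0 );
    if ( fd < 0 ) return -1;

    sockaddr_in addr{};
    addr.sin_family      = AF_INET;
    addr.sin_port        = htons( 0 );
    addr.sin_addr.s_addr = htonl( INADDR_LOOPBACK );
    socklen_t len        = sizeof( addr );

    if ( bind( fd, reinterpret_cast<sockaddr*>( &addr ), sizeof( addr ) ) < 0 ||
         listen( fd, 16 ) < 0 ||
         getsockname( fd, reinterpret_cast<sockaddr*>( &addr ), &len ) < 0 ) {
        close( fd );
        return -1;
    }
    port = ntohs( addr.sin_port );
    return fd;
}

// Hypothetical helper, the "connect" sides: each client opens its own socket
// and connects to the one bound port. Returns how many connections the
// listener accepted.
int connect_many( int listener, uint16_t port, int clients )
{
    sockaddr_in addr{};
    addr.sin_family      = AF_INET;
    addr.sin_port        = htons( port );
    addr.sin_addr.s_addr = htonl( INADDR_LOOPBACK );

    std::vector<int> fds;            // everything opened here gets closed below
    int accepted = 0;
    for ( int i = 0; i < clients; ++i ) {
        int c = socket( AF_INET, SOCK_STREAM, 0 );
        if ( c < 0 ) break;
        fds.push_back( c );
        if ( connect( c, reinterpret_cast<sockaddr*>( &addr ), sizeof( addr ) ) < 0 ) break;
        int s = accept( listener, nullptr, nullptr );
        if ( s < 0 ) break;
        fds.push_back( s );
        ++accepted;
    }
    for ( int fd : fds ) close( fd );
    return accepted;
}
```

With a listener from `listen_loopback( port )`, `connect_many( listener, port, 3 )` should report three accepted connections, and `connect_many( listener, port, 0 )` zero: the bound side stays valid whether or not anyone connects.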

Anyways, welcome norm:// to the Family of ZeroMQ protocols.


Confused ?

You may enjoy a further five-second read about the main conceptual differences in [ ZeroMQ hierarchy in less than a five seconds ], or other posts and discussions here.

user3666197
  • Thanks for your reply. What I meant is that the code in its current form fails for me. I added some debugging statements and here is what I got: when I run the receiver first, the bind succeeds. After that I run the sender, and it fails on the connect statement saying that the address is already in use. Why does it say so? I mean, it's a connect statement, so why is it trying to bind? And what is the correct fix for this problem? – Said A. Sryheni Jun 08 '18 at 09:49
  • Some "hanging"-orphan ( that was not successfully dismounted from the port# ) might remain blocking any next attempt to access the port-related resources. If in extreme need, reboot the system so as to be sure the ports get reincarnated and free again. ( Have lost many hairs on re-wrapping code into fused-protection structures alike **`try: except: finally:`** because otherwise the crashed code might get the whole system locked until reboot ) Back to your comment - **what was the value of *`errno`* received right upon a failed attempt to `.connect()` ?** `{ EINVAL | EPROTONOSUPPORT | ... }` ? – user3666197 Jun 08 '18 at 10:32
  • First I run the receiver, which contains the **bind** call. This bind is successful, and it returns the code 0 indicating that everything went fine. After that I start the sender. When the sender reaches its **connect** call, it stops, throwing the following error: `Proto Error: ProtoSocket::Bind() bind() error: Address already in use` `Proto Fatal: NormSession::Open() error: rx_socket.Bind() error` `Address already in use (src/session_base.cpp:681)` `Aborted (core dumped)` and the program stops. This is weird because the call is **connect** while the error is showing **bind** – Said A. Sryheni Jun 08 '18 at 10:52
  • So I don't get an **`errno`**, because the program throws an error and stops. – Said A. Sryheni Jun 08 '18 at 11:11
  • Understood. Thanks for your kind cooperation, Said. This shall not happen in case a host-`PUB` `.bind()`-s and another host-`SUB` `.connect()`-s. – user3666197 Jun 08 '18 at 12:20
  • I'm sorry, I don't follow. Do you mean that I have to run the sender and the receiver on two different hosts? Currently I'm running both of them on the same machine and host. I just have 2 threads, where one of them is the sender and the second is the receiver. I synchronize them to control which one goes before the other. I tried to let the sender `PUB` **bind**, and the receiver `SUB` **connect**, but unfortunately I got the exact same error. – Said A. Sryheni Jun 08 '18 at 12:30
  • Yes, that is exactly what was meant to test / validate the rejected "2nd" `.bind()` in situation, where each of the `PUB/SUB` roles resides on a different host. – user3666197 Jun 08 '18 at 12:38
  • Isn't there any way to keep both the `PUB` and `SUB` on the same host? I tried using ZMQ with TCP on the same host and everything worked fine. But when using ZMQ with NORM the problem arises. – Said A. Sryheni Jun 08 '18 at 12:42
  • What I want is to use ZMQ with NORM in a multi-threaded environment, so that all the threads are on the same host. Is this impossible to do? – Said A. Sryheni Jun 08 '18 at 12:44
  • The intention is clear and fair, yet the proposal was to test / validate a certain behaviour. Ok, let's make several steps at once - try `SUB` side to make an explicitly mapped `.connect( "norm://<NormNodeId>,<sub_IP_Physical_Address>;localhost:5556" )` as NORM-documentation permits it to get mapped. – user3666197 Jun 08 '18 at 13:00
  • Excuse me if this should have been clearer to me, but could you give me an example of what to set the `sub_IP_Physical_Address` to? I'm not sure I totally understand what you mean by that. – Said A. Sryheni Jun 08 '18 at 13:14
  • Whatever physical interface address, which is present on the SUB-side host. – user3666197 Jun 08 '18 at 13:24
  • Well, the best next step is to define a `try{ ... connect( ... ) }catch{ ... }` to check the value of **`errno`** and report that as an issue to `norm://` maintainers. – user3666197 Jun 08 '18 at 13:37
  • The problem is that `catch` doesn't catch any exception. Also I just changed the connect into `cout << zmq_connect(subscriber, receiveHost.c_str()) << endl`. 0 has been printed indicating a successful connect. After that the application terminates immediately. I added a sleep command after `connect`, but the application doesn't sleep, it still terminates. So somehow I'm getting a successful code, but the application fails immediately after that printing the error to the console without performing any further commands. – Said A. Sryheni Jun 08 '18 at 13:43
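A note on why `catch` sees nothing here: the log ends in `Aborted (core dumped)` and points at `src/session_base.cpp:681`, which reads like an assertion failing inside the library rather than a C++ exception being thrown. If that reading is right, the process is killed by SIGABRT, and no `try`/`catch` can intercept that. The sketch below is plain POSIX C++, not ZeroMQ code, and the helper name `abort_in_try_kills_child` is made up for this illustration. It runs `abort()` inside a `try` block in a child process and checks how the child ended.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <csignal>
#include <cstdlib>

// Hypothetical helper: fork a child that calls abort() inside try/catch.
// Returns true if the child was terminated by SIGABRT, i.e. the catch
// block never got a chance to run.
bool abort_in_try_kills_child()
{
    pid_t pid = fork();
    if ( pid < 0 ) return false;

    if ( pid == 0 ) {                 // child process
        try {
            std::abort();             // what a failed assertion typically ends up calling
        } catch ( ... ) {
            _exit( 0 );               // not reached: abort() raises a signal, it does not throw
        }
        _exit( 0 );                   // not reached either
    }

    int status = 0;                   // parent: wait and inspect how the child ended
    if ( waitpid( pid, &status, 0 ) < 0 ) return false;
    return WIFSIGNALED( status ) && WTERMSIG( status ) == SIGABRT;
}
```

This would also fit the `0` printed by `zmq_connect()` just before the crash, assuming the actual session setup happens afterwards on a background I/O thread, so the call itself has nothing to report yet.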