
The basic ZeroMQ PUB/SUB C# samples work for me if I start the publisher first, but not if I start the subscriber first. In that case the subscriber starts but never receives any data. From what I have read, I should be able to start the processes in either order.

I am using the ZeroMQ 4.1.0.26 package from nuget in .NET 4.6, x64 apps. These are running on Windows. I am running both apps on the same machine.

Here is the code I am running (a simplified version of the sample from the ZeroMQ tutorial).

Subscriber:

    static void Main(string[] args)
    {
        var endpoint = "tcp://127.0.0.1:5556";

        // Socket to talk to server
        using (var context = new ZContext())
        using (var subscriber = new ZSocket(context, ZSocketType.SUB))
        {
            Console.WriteLine("I: Connecting to {0}…", endpoint);
            subscriber.Connect(endpoint);

            // Subscribe to zipcode
            string zipCode = "90210 ";
            Console.WriteLine("I: Subscribing to zip code {0}…", zipCode);
            subscriber.Subscribe(zipCode);

            while(true)
            {
                using (var replyFrame = subscriber.ReceiveFrame())
                {
                    string reply = replyFrame.ReadString();
                    Console.WriteLine(reply);
                }
            }
        }
    }

Publisher:

    static void Main(string[] args)
    {
        using (var context = new ZContext())
        using (var publisher = new ZSocket(context, ZSocketType.PUB))
        {
            var address = "tcp://*:5556";
            Console.WriteLine("I: Publisher.Bind'ing on {0}", address);
            publisher.Bind(address);

            // Initialize random number generator
            var rnd = new Random();
            while (true)
            {
                // Get values that will fool the boss
                int zipcode = 90210;
                int temperature = rnd.Next(-55, +45);

                // Send message to all subscribers
                var update = string.Format("{0:D5} {1}", zipcode, temperature);
                using (var updateFrame = new ZFrame(update))
                {
                    publisher.Send(updateFrame);
                }

                Thread.Sleep(1000);
            }
        }
    }

Edit

Following suggestions in proposed answers below I tried:

  • Using an explicit IP address: this made no difference
  • Removing the subject filtering: this made no difference
  • Creating new AnyCPU (instead of x64) projects: this made no difference
  • Trying other languages: this was interesting!

Using Python equivalent publishers and subscribers:

  • The Python subscriber works when started before the publisher, when connecting to either the Python publisher or the C# publisher.
  • The C# subscriber does not work when started before the publisher, regardless of whether it is connecting to the Python or the C# publisher.

So it looks like there is something wrong with the C# subscriber code.

QUESTION:

  • Is there something wrong with my sample code ( latest version below )?

  • Or is it a problem with the ZeroMQ .NET library?

Here is the Python subscriber which worked correctly:

    import sys
    import zmq

    #  Socket to talk to server
    context = zmq.Context()
    socket  = context.socket( zmq.SUB )

    print( "Python: Collecting updates from weather server" )
    socket.connect( "tcp://localhost:5556" )

    socket.setsockopt_string( zmq.SUBSCRIBE, "" )

    while True:
        string = socket.recv_string()
        zipcode, temperature = string.split()
        print( zipcode + " " + temperature )

Here is the latest equivalent ( non-working ) version of the C# subscriber:

    static void Main(string[] args)
    {
        // Socket to talk to server
        using (var context = new ZContext())
        using (var socket = new ZSocket(context, ZSocketType.SUB))
        {
            Console.WriteLine("C#: Collecting updates from weather server");
            socket.Connect("tcp://localhost:5556");
            socket.Subscribe("");

            while (true)
            {
                using (var replyFrame = socket.ReceiveFrame())
                {
                    string reply = replyFrame.ReadString();
                    Console.WriteLine(reply);
                }
            }
        }
    }
user3666197
Richard Shepherd
  • netmq (another port of zeromq to .net) does this correctly, by the way. I'd open an issue here: https://github.com/zeromq/clrzmq4 and see what they say; your code looks totally valid to me. – Evk Nov 05 '17 at 20:43
  • does it work if you subscribe to empty string? – somdoron Nov 06 '17 at 05:14
  • @somdoron Hi somdoron, the "subscribe-to-anything" mode was tested by the OP and reported not to work in C#, whereas the Python client worked well (ref. the comments on root-cause isolation tests below; I hope Richard was aware that removing the topic filter actually requires an explicit subscription to the empty ""-string). So far the issue seems to be narrowed down to the version of the ZeroMQ C# binding Richard is using, as demonstrated by the working / not-working contrast between the Python and C# SUB implementations. – user3666197 Nov 06 '17 at 10:09
  • Well, ZeroMQ/clrzmq4 is just a binding around ZeroMQ/libzmq; You should try to replace amd64/libzmq.dll/.so with your version of the library. I suspect you're running Windows, where the latest update was sadly made to libzmq 4.2.x, which is actually a beta version. Please do make ZeroMQ/zeromq4-1 (or copy the one from your Python binding) and try again... – metadings Nov 06 '17 at 14:32

1 Answer

If you follow the flow-chart re-published here step by step, the PUB/SUB problem gets highlighted.

Recent (end of 2017) re-engineering of the v4.x API has changed some design approaches (whether the topic filter is processed on the SUB side, as it was since v2.1+, or, as in the newer API versions, on the PUB side, which shifts where the filtering overhead, egress buffering and traffic load are concentrated), yet the PUB/SUB order of operations seems to remain the same (ref. the flow-chart mentioned above).


Ex-post root-cause isolation testing:

Python SUB works correctly, my C# one does not. Not clear WHY:

Thanks for the active re-testing of the code; it helped to move a few steps forward in isolating the root cause.

Now let's focus on the WHY:

ZeroMQ typically has two abstraction layers. The first is the internal ZeroMQ layer, which implements the core facilities of the framework as defined in the ZeroMQ protocol RFC specifications. This is the part responsible for making all RFC-defined interactions compliant and cross-compatible, so that they work in the manner ZeroMQ documents.

The second layer is a "foreign"-language wrapper / API binding. It lets languages that are not native to the internal ZeroMQ code call the services exposed by the ZeroMQ API, so that these "foreign" worlds get a chance to use what the library implements inside the internal ZeroMQ layer.

So, using the Python SUB probe (which worked as defined), the discrepancy was isolated to the C# language binding, where the SUB probe still did not work.

As Dijkstra noted about testing, the fact that some tests work as expected does not mean there are no other bugs in the system, so there might be further trouble with the binding. Still, the positive result from the Python SUB client showed that the ZeroMQ core services are not to blame for the error.


A1: No,

your code seems ok.
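
The topic filter in particular can be ruled out on paper: ZeroMQ compares a subscription against the leading bytes of each message, so both "90210 " and the empty string should accept the publisher's updates. A minimal pure-Python sketch of that prefix rule follows; the `matches` helper is hypothetical, written only for illustration, and is not part of any ZeroMQ binding.

```python
def matches(subscriptions, message):
    """Return True if any subscription is a byte prefix of the message.

    Hypothetical helper that mirrors the prefix rule ZeroMQ applies to
    SUB-side subscriptions; it does not call ZeroMQ itself.
    """
    return any(message.startswith(prefix) for prefix in subscriptions)


# Same message layout the publisher in the question produces: "90210 23"
update = "{0:05d} {1}".format(90210, 23).encode()

print(matches([b"90210 "], update))  # True: filter used in the first C# sample
print(matches([b""], update))        # True: empty subscription accepts everything
print(matches([b"10001 "], update))  # False: a different zip code is filtered out
```

Since both filters used in the question accept the update, the missing messages are unlikely to be a filtering problem.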

A2: Yes,

as explained above, this is the main suspect at the moment.

Filing a report with the C# binding maintainers, documenting in full detail the non-conformant behaviour observed and re-tested above, is the best next step towards getting the C# binding fixed.

user3666197
  • (Not the OP.) I followed the chart and ended at "See explanation of slow joiners in text". I didn't find that in your answer text, but I know what it's about. But in this case the SUB misses not just _some_ messages that the publisher might have sent before it connected; it misses _all_ messages (the publisher sends 1 message every second), including those sent after the publisher bound. Why is that? – Evk Nov 05 '17 at 18:06
  • OP here: I agree with the comment above from Evk, ie that I don't receive any messages. So I don't think this is just a slow-joiner problem (which I am happy to accept and deal with in other ways). – Richard Shepherd Nov 05 '17 at 18:09
  • **Step 0:** Best start by getting all the slow-joiner details from Pieter HINTJENS' book "Code Connected, Volume 1", available in **pdf here >>>** http://hintjens.wdfiles.com/local--files/main%3Afiles/cc1pe.pdf **Step 1:** Set up the .bind()/.connect() to operate on an explicit IP address, so as not to depend on a wildcard translation. Next, to isolate a root cause, eliminate the topic filter -- subscribe to all messages on the SUB side. **Step 2:** To isolate the C# binding, next test a .connect() from another node as well ( be it Python or another language of choice ) to prove proper message delivery. – user3666197 Nov 05 '17 at 18:44
  • Thank you for the comments user3666197 - I have done the experiments you suggest and edited my question. The summary is: my Python subscriber works correctly, my C# one does not. It's not clear to me why, though! – Richard Shepherd Nov 05 '17 at 20:05
  • Once another language binding's code ( here Python ) works, my suspicion would be that the original language binding does not meet the requirement of providing a truly neutral ZeroMQ API mediation. **What happens if a non-blocking `.poll()` is used to see whether a `{ "2" | "" }`-subscribed client actually got into the delivery scheme and received any message in its incoming queue buffer?** Similarly, you may set up a parallel **`PAIR/PAIR`** socket archetype on another `port#` and thus validate another part of the chain of dependence on the `.ReceiveFrame()` method. – user3666197 Nov 05 '17 at 20:51
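
The poll-based check suggested in the last comment could look roughly like this in Python with pyzmq, which the working Python subscriber above already uses. This is a sketch only: the names `probe_subscriber_first` and `free_tcp_port`, the port selection and the timing values are assumptions made for illustration. It connects a SUB socket before the PUB socket is bound, then polls with a timeout instead of blocking in a receive call.

```python
import socket

import zmq


def free_tcp_port():
    """Ask the OS for a currently unused local TCP port (illustrative helper)."""
    with socket.socket() as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def probe_subscriber_first(attempts=20, poll_ms=250):
    """Connect SUB before PUB is bound; return the first message seen, or None."""
    endpoint = "tcp://127.0.0.1:%d" % free_tcp_port()
    ctx = zmq.Context()
    sub = ctx.socket(zmq.SUB)
    pub = ctx.socket(zmq.PUB)
    # Do not let unsent messages delay shutdown.
    sub.setsockopt(zmq.LINGER, 0)
    pub.setsockopt(zmq.LINGER, 0)
    received = None
    try:
        # Subscriber first: nothing is listening on the endpoint yet.
        sub.connect(endpoint)
        sub.setsockopt_string(zmq.SUBSCRIBE, "")

        # Publisher second, as in the failing start-up order.
        pub.bind(endpoint)

        for i in range(attempts):
            pub.send_string("90210 %d" % i)
            # poll() returns a non-zero event mask once a message is queued.
            if sub.poll(timeout=poll_ms):
                received = sub.recv_string()
                break
    finally:
        sub.close()
        pub.close()
        ctx.term()
    return received


if __name__ == "__main__":
    print(probe_subscriber_first())
```

If a probe like this returns a message while the C# subscriber in the same start-up order stays silent, that points again at the binding (or the libzmq build it ships with) rather than at the PUB/SUB pattern itself.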