
Note: This question comes fairly close to mine, but I could use a working example of the proposed solution, and maybe ZeroMQ offers some magic I just don't know about.

Currently I react to exceptions in blocking ZeroMQ calls like this:

try {
    zmq::poll(&items, number, timeout);
} catch (zmq::error_t &ex) {
    if (ex.num() != EINTR) {
        throw;
    }
}
...

My intention is to rethrow all caught exceptions except those triggered by an interrupted system call, which I can usually ignore (e.g. SIGPROF) by simply restarting zmq::poll.

In case of SIGINT (CTRL-C) I want to proceed differently (e.g. also rethrow, or terminate a loop).

Currently my best bet is to install a signal handler listening for SIGINT, but since ZeroMQ catches the signal on its own, I'd prefer a more sophisticated approach.

frans
  • I grepped through the code and it doesn't look like ZMQ actually catches signals except for the z/OS operating system. `EINTR` is one of the values for `errno`, indicating that an underlying system call failed (and can possibly be retried.) Comments in the `NEWS` file seem to indicate that ZMQ stopped catching signals because they couldn't do it in a way that was compatible with other software that themselves were expecting to catch `SIGINT`. – scooter me fecit Mar 23 '16 at 22:37
  • BTW -- quick test: Check if the return value from `signal()` is `SIG_DFL` or `SIG_IGN` when you install your `SIGINT` handler. If so, that's a strong indication that ZMQ isn't installing its own handler. – scooter me fecit Mar 23 '16 at 22:41
  • May I ask in this case, what is the source of the signal? Signals are a way of communicating between processes or threads. ZeroMQ is also a way of communicating between processes or threads. It's far cleaner to stick to just one way, otherwise you run into problems like this. If the signals are yours then I would replace them with ZMQ sockets. If the source of the signals is external and unavoidable, that's unfortunate; the suggestions from @ScottM sound sensible. – bazza Mar 24 '16 at 06:43
  • @bazza: Wrong type of signal; it's an overloaded term in this context. The OP means Unix signals, i.e., exceptional situations, such as `SIGINT`, `SIGSEGV`, etc. Those aren't used to communicate across threads. – scooter me fecit Mar 24 '16 at 22:44
  • @ScottM I know the OP was referring to Unix signals, and they are used for communication of information. The meaning of SIGINT and most other signals is only non-arbitrary if you don't install a handler. And, are you suggesting that you can't have thread specific signal masks? Regardless, it's far easier to have a separate ZMQ socket acting as a command channel instead of using signals (such as SIGINT) to impose a disruptive command infrastructure on top of ZMQ's Actor model framework. The OP is trying (unnecessarily) to use SIGINT for such a purpose. – bazza Mar 25 '16 at 08:39
  • @bazza: I'm not trying to use signals as a way to communicate, but I want to react to `CTRL-C` appropriately: in this case `EINTR` means "abort"; in all other cases I want to ignore it and restart the aborted command (`zmq::poll` in my case). So in other words: I just want to react to `CTRL-C`, preferably without having to install a signal handler (because I'm working on a library and don't want to modify the general signal handling strategy) – frans Mar 25 '16 at 11:50
  • Hello @frans, if your library is not controlling the signal handling strategy then you'll need the application to not mask off SIGINT. You'd still be restricting the application's use of signals. Also you would have to handle EINTR everywhere in your code (zmq_recv, zmq_send, etc.), not just on zmq_poll (signals are asynchronous, they don't wait for your call to zmq_poll). It would be better if your "abort" command were a message delivered down a ZMQ socket included in the set of sockets being polled. You could include stdin in the poll too to abort on a keypress. – bazza Mar 25 '16 at 16:06
  • @frans, @bazza: The main issue that you are going to run into is throwing exceptions from a signal handler. `EINTR` just means that the system call was interrupted. It doesn't necessarily mean that a `signal` handler executed -- you still have to catch `SIGINT`, potentially tweak some state (hopefully atomically), then raise the C++ exception. Raising an exception in a `signal` handler is not recommended and very much undefined. Basically, your approach is unlikely feasible. – scooter me fecit Mar 25 '16 at 17:34
  • @frans: Do Not Try to raise a C++ exception from inside a Unix or Linux `signal` handler. Seriously. You can do this on Windows because the Structured Exception Handler (SEH) machinery is designed to do this. Unix `signal` is not. – scooter me fecit Mar 25 '16 at 17:38
  • @ScottM, frans doesn't want to use a signal handler at all (see frans' latest comment above), and I'm advocating avoiding the use of signals altogether to achieve the required result. Frans' code snippet is not in a signal handler, it's at the top of a loop. But yes, in general it's not a good idea to throw things from a signal handler. – bazza Mar 25 '16 at 19:15
  • @bazza: I realize that @frans wants to avoid the unavoidable, but it can't be avoided for `SIGINT` handling. My suggestion (caveat) is not to try and throw an exception from inside the signal handler. ZMQ is no help in this use case. @frans's other problem is that `signal` is being conflated with C++ exceptions; they are two very different and incompatible capabilities. – scooter me fecit Mar 26 '16 at 03:45
  • @frans: It doesn't look like `libzmq` catches `SIGINT`, so your current solution looks like the best solution. – scooter me fecit Mar 26 '16 at 03:47

3 Answers

If I read the original poster's question correctly, @frans asks if there's a way to re-throw certain exceptions where the C++ exception contains the EINTR error code, except those generated by certain signals. @frans currently has a SIGINT signal handler and wonders if there's a cleaner way.

There are two different things in play here, POSIX signal() handling and C++ exceptions, and the question is about how they interact:

  • POSIX signals are not related to C++ exception handling.
  • zmq::error_t is a C++ exception generated by zmq::poll() as the result of a system call returning EINTR.

TL;DR answer: No, there's no cleaner way.

libzmq does not appear to install its own signal handlers, but it will throw a zmq::error_t containing EINTR if an underlying system call is interrupted (i.e., poll() returned -1 and errno was copied into the zmq::error_t exception). This could mean that a POSIX signal was delivered and a process-specific handler ran, but there are other reasons.

POSIX/Unix/Linux/BSD signal() is an operating system facility that indicates that something unusual happened down in the kernel. Processes have the option of installing their own handler to recover from the situation, e.g., SIGINT and SIGQUIT handlers to close file descriptors, do various types of cleanup, etc. These handlers are in no way related to C++ exception handling.

Other caveat: DO NOT throw C++ exceptions from inside a POSIX/Unix/Linux/BSD signal handler. This was previously discussed in this SO topic.
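As a rough illustration of that caveat, the handler can limit itself to setting a flag, and the exception can be thrown later from ordinary code, for example right after a blocking call has reported EINTR. This is a minimal sketch using only POSIX calls; the names g_interrupted, on_sigint, install_sigint_handler and throw_if_interrupted are made up for the example and are not part of libzmq or cppzmq.

```cpp
#include <signal.h>
#include <stdexcept>

// Written by the handler. volatile sig_atomic_t is the object type that
// may safely be written from inside a signal handler.
volatile sig_atomic_t g_interrupted = 0;

// The handler only records that SIGINT arrived. It never throws.
extern "C" void on_sigint(int) { g_interrupted = 1; }

// Installs the handler without SA_RESTART, so that interrupted blocking
// system calls fail with EINTR instead of being restarted silently.
bool install_sigint_handler() {
    struct sigaction sa {};
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGINT, &sa, nullptr) == 0;
}

// Called from ordinary code (never from the handler), e.g. in the catch
// block that has just seen EINTR. Throwing here is well defined.
void throw_if_interrupted() {
    if (g_interrupted) {
        throw std::runtime_error("interrupted by SIGINT");
    }
}
```

In the loop from the question, throw_if_interrupted() would be called inside the catch block once ex.num() == EINTR has been established; any other EINTR would fall through and restart zmq::poll.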

scooter me fecit

OK, so there are three ways of doing this: one clean, two messy.

Clean

The clean way runs something like this:

zmq::socket_t inSock;
zmq::pollitem_t pollOnThese[2];
int quitLoop = 0;

// <code to connect inSock to something>

pollOnThese[0].socket = NULL;       // This item is not polling a ZMQ socket
pollOnThese[0].fd = 0;              // Poll on stdin. 0 is the fd for stdin
pollOnThese[0].events = ZMQ_POLLIN;

// The poll item wants the underlying socket handle, not the address of the C++ wrapper
pollOnThese[1].socket = static_cast<void*>(inSock);    // This item polls inSock
pollOnThese[1].fd = 0;              // This field is ignored because socket isn't NULL
pollOnThese[1].events = ZMQ_POLLIN;

while (!quitLoop)
{
   zmq::poll(pollOnThese, 2);       // No timeout given: block until something is readable

   if (pollOnThese[0].revents & ZMQ_POLLIN)
   {
      // A key has been pressed, read it (read() comes from <unistd.h>)
      char c;
      read(0, &c, 1);

      if (c == 'c')
      {
         quitLoop = 1;
      }
   }
   if (pollOnThese[1].revents & ZMQ_POLLIN)
   {
      // Handle inSock as required
   }
}

Of course this means your "abort" is no longer the user pressing CTRL-C, they just press the key 'c', or indeed any program that can write to this stdin can send a 'c'. Alternatively, you could also add another ZMQ socket as a command channel. There's no tolerance to signals at all here, but I've always found that mixing use of signals with Actor model programming is a very awkward thing to do. See below.

Messy

The messy way using signals looks something like this:

zmq::socket_t inSock;
zmq::pollitem_t pollOnThis;
zmq::message_t message;
volatile sig_atomic_t quitLoop = 0;   // Written by the signal handler, hence volatile sig_atomic_t

// <code to connect inSock to something>

// <install a signal handler that handles SIGINT by setting quitLoop to 1>

// The poll item wants the underlying socket handle, not the address of the C++ wrapper
pollOnThis.socket = static_cast<void*>(inSock);    // This item polls inSock
pollOnThis.fd = 0;              // This field is ignored because socket isn't NULL
pollOnThis.events = ZMQ_POLLIN;

while (!quitLoop)
{
   pollOnThis.revents = 0;      // Don't act on a stale result if poll is interrupted
   try 
   {
      zmq::poll(&pollOnThis, 1);
   }
   catch (zmq::error_t &ex) 
   {
      if (ex.num() != EINTR) 
      {
         throw;
      }
   }

   if ((pollOnThis.revents & ZMQ_POLLIN) && !quitLoop)
   {
      // Handle inSock as required
      bool keepReading = true;
      do
      {
         try
         {
            inSock.recv(&message);
            keepReading = false;
         }
         catch (zmq::error_t &ex) 
         {
            if (ex.num() != EINTR) 
            {
               throw;
            }
            else
            {
               // We also may want to test quitLoop here because the signal, 
               // being asynchronous, may have been delivered mid recv()
               // and the handler would have set quitLoop.
               if (quitLoop)
               {
                  // Abort
                  keepReading = false;
               }
               else
               {
                  // Some other signal interrupted things
                  // What to do? Poll has said a message can be read without blocking.
                  // But does EINTR mean that that message has been partially read?
                  // Has one of the myriad of system calls that underpin recv() aborted?
                  // Is the recv() restartable? Documentation doesn't say. 
                  // Have a go anyway.
                  keepReading = true;
               }
            } // if
         } // catch
      }
      while (keepReading);

      // <repeat this loop for every single zmq::recv and zmq::send>
   }
}
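If you do go down this road, the repeated retry loop can at least be factored out once. The following is a sketch only: retry_on_eintr is a made-up helper, and fake_error is a stand-in for zmq::error_t so that the snippet compiles without cppzmq (the only thing it relies on is a num() member). It does not answer the question raised in the comments above of whether an interrupted recv() or send() is actually safe to restart.

```cpp
#include <cerrno>
#include <csignal>

// Hypothetical stand-in for zmq::error_t, so the sketch is self-contained.
// Only num() is used by the helper below.
struct fake_error {
    int code;
    int num() const { return code; }
};

// Runs `call` until it completes.
//  - An Error whose num() is EINTR triggers a retry, unless `quit` is set.
//  - Any other Error is rethrown to the caller.
// Returns true if `call` completed, false if the quit flag ended the loop.
template <class Error, class Call>
bool retry_on_eintr(Call&& call, const volatile std::sig_atomic_t& quit) {
    for (;;) {
        try {
            call();
            return true;
        } catch (const Error& ex) {
            if (ex.num() != EINTR) {
                throw;          // not an interruption: propagate unchanged
            }
            if (quit) {
                return false;   // the handler asked us to stop: abort
            }
            // Some other signal interrupted the call: have another go.
        }
    }
}
```

With cppzmq the call site would then look roughly like retry_on_eintr<zmq::error_t>([&] { inSock.recv(&message); }, quitLoop), with the same caveats as the hand-written loop.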

Hybrid (Also Messy)

In the ZeroMQ guide documentation here they're sort of blending the two ideas, by having a signal handler write to a pipe and including the pipe in the zmq_poll, much as I have done above with stdin.

This is a common trick to turn asynchronous signals into synchronous events.
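For readers who haven't seen the trick, here is a minimal sketch of it using plain POSIX poll() rather than zmq_poll, so it builds without ZeroMQ. It is not the code from the guide; g_wake_pipe, wake_on_signal, install_wake_pipe and wait_for_signal are names invented for this example. With ZeroMQ, the read end of the pipe would go into a poll item with socket set to NULL and fd set to the descriptor, just like stdin above.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <unistd.h>

// [0] is the read end (polled by the loop), [1] the write end (used by the handler).
int g_wake_pipe[2] = {-1, -1};

// The handler only writes one byte; write() is async-signal-safe.
extern "C" void wake_on_signal(int) {
    const int saved_errno = errno;      // don't disturb the interrupted code
    const char byte = 1;
    ssize_t ignored = write(g_wake_pipe[1], &byte, 1);
    (void)ignored;                      // a full pipe already means "wake up"
    errno = saved_errno;
}

// Creates the pipe and installs the handler for `signo`.
bool install_wake_pipe(int signo) {
    if (pipe(g_wake_pipe) != 0) return false;
    // Non-blocking on both ends: the handler must never block on a full
    // pipe, and draining must stop once the pipe is empty.
    for (int fd : g_wake_pipe) {
        const int flags = fcntl(fd, F_GETFL);
        if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) return false;
    }
    struct sigaction sa {};
    sa.sa_handler = wake_on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(signo, &sa, nullptr) == 0;
}

// Waits up to timeout_ms and returns true if the signal has arrived.
// The signal now shows up as a readable descriptor, i.e. as a normal event.
bool wait_for_signal(int timeout_ms) {
    pollfd item{};
    item.fd = g_wake_pipe[0];
    item.events = POLLIN;
    int rc;
    do {
        rc = poll(&item, 1, timeout_ms);   // the handler itself may interrupt poll()
    } while (rc == -1 && errno == EINTR);
    if (rc <= 0 || !(item.revents & POLLIN)) return false;
    char buf[16];
    while (read(g_wake_pipe[0], buf, sizeof buf) > 0) {}   // drain queued bytes
    return true;
}
```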

However there's no hint at all as to whether any of the zmq routines can be restarted. They're simply using it as a way to initiate a clean shutdown having abandoned any recv() or send() that are in progress. If in order to cleanly abort you need to finish off a sequence of recv() and send() routines then there's no guarantee that that is possible if signals are being used.

Forgive me if I'm wrong, but it feels like you are re-purposing SIGINT away from meaning "terminate the whole program immediately". If so, it's likely that you'll want to resume your communications loop at a later time, in which case who knows whether any of the recv() or send() calls are resumable after the arrival of your signal. There's a whole bunch of system calls that zmq will be using. Many of them are not restartable under certain circumstances, and there's no telling how ZMQ has used these calls (short of reading their source code).

Conclusion

Basically I would simply steer clear of using signals altogether, especially as you don't want to install your own handler. By using stdin or a pipe or another ZMQ socket (or indeed all three) as an 'abort' channel included in zmq_poll() you would be providing a simple and effective means to abort your loop, and there would be no complications resulting from its use.

bazza

I ran into a similar problem and have stared at this question and its answers for a long time, hoping a solution would magically appear if I stared hard enough.

In the end, I went with the approach detailed in this article.

The bottom line is that we need a ppoll or rather pselect based ZMQ polling mechanism (I implemented one, but I still need to put it in a separate library). These p- variants take a signal mask as an extra argument. They set the signal mask, run the regular poll/select and then reset the signal mask to its previous state. This means that outside ppoll/pselect you can block all signals you are expecting (blocking them will cause them to get queued by the OS, they won't get lost) and then only unblock them during poll/select.

What this effectively does is add a "signal socket" to your poller. The poller will return either normally (with the number of ready sockets, or 0 on timeout) when an actual socket has something for you, or with -1 and EINTR when it is interrupted by one of the signals you allowed through with your signal mask. You can use a signal handler to set some flag.

Now, right after your poll call, when you would normally go through the sockets that are now readable/writeable, you first check the flag you set from your signal handler. The signals are blocked everywhere outside of ppoll/pselect, so you know with certainty that you need not check this flag anywhere else than here. This allows a very clean, compact and robust handling of EINTRs.
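To make the mechanism concrete, here is a bare-bones sketch with plain POSIX pselect on an ordinary file descriptor. It is not my ZMQ poller, and block_sigint, wait_readable, note_sigint and g_got_sigint are names made up for this example. In a multi-threaded program you would use pthread_sigmask instead of sigprocmask.

```cpp
#include <cerrno>
#include <signal.h>
#include <sys/select.h>
#include <time.h>

// Flag set by the handler; checked only right after the wait returns.
volatile sig_atomic_t g_got_sigint = 0;

extern "C" void note_sigint(int) { g_got_sigint = 1; }

// Installs the handler and blocks SIGINT for normal execution.
// wait_mask receives the mask to use during the wait: the previous mask
// with SIGINT guaranteed to be unblocked.
bool block_sigint(sigset_t* wait_mask) {
    struct sigaction sa {};
    sa.sa_handler = note_sigint;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGINT, &sa, nullptr) != 0) return false;

    sigset_t blocked;
    sigemptyset(&blocked);
    sigaddset(&blocked, SIGINT);
    if (sigprocmask(SIG_BLOCK, &blocked, wait_mask) != 0) return false;
    sigdelset(wait_mask, SIGINT);
    return true;
}

// Waits until fd is readable (pass fd < 0 to wait for a signal or timeout only).
// Returns 1 if fd is readable, 0 on timeout, and -1 on error; -1 with
// errno == EINTR means a signal allowed by wait_mask was delivered.
// pselect swaps in wait_mask, waits, and restores the old mask atomically,
// so a SIGINT that was queued while blocked is delivered here and nowhere else.
int wait_readable(int fd, const sigset_t& wait_mask, long timeout_sec) {
    fd_set readable;
    FD_ZERO(&readable);
    if (fd >= 0) FD_SET(fd, &readable);
    timespec timeout{};
    timeout.tv_sec = timeout_sec;
    return pselect(fd + 1, &readable, nullptr, nullptr, &timeout, &wait_mask);
}
```

The main loop then calls wait_readable, checks g_got_sigint first, and only afterwards looks at which descriptors are ready.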

Some caveats: for this to work, you cannot use any other blocking calls, so all your sends and recvs should be non-blocking (ZMQ_DONTWAIT flag).

Another thing is what you already mentioned: in case of a library, the user should be free to install their own signal handler. So probably you'd need to instruct the user on how to properly handle signals in this way if they want to use your library in a robust way. I think it should be feasible if you expose the flag (or a function to flip it) to the user so that they can call it from their own signal handler.

egpbos