I want to build a system that has the following architecture:

+------------------+          +------------------+
| App1. 0mq client | <------> | App2. 0mq server |
+------------------+          +------------------+

where
App2 is a ZeroMQ server and it's a black box,
and
App1 is a ZeroMQ client, but it is in fact a frontend server. The frontend server will process some requests from the clients and then will communicate with the App2 server.

Given that:

  1. At any point in time, either of the "servers" can go down or be restarted.
  2. I want to be able to start either app, even if the other one is not running.
  3. If App1 is started when App2 is down, I want to know when App2 is up.

Is it possible to implement 3. using only ZeroMQ builtins, or do I need a different mechanism to notify App1 that App2 is up?

user3666197
Victor Dodon
  • Does App2 really need to notify App1? Wouldn't it be sufficient if App1 polled/checked on a regular basis whether App2 is back up again? – alk Jul 01 '16 at 09:53
  • Well, first let's agree that *"server"* and *"client"* roles are **not meaningful abstractions for modern distributed systems**. What matters more is to state explicitly whether a "black box" means you have zero chance to add to or modify App2 only, or whether the whole host is locked against any additions that one might propose to help with (1), (2) and (3), ok? If one has a reasonable degree of freedom, ( *almost* ) anything can be done with ZeroMQ tools. – user3666197 Jul 01 '16 at 11:13
  • Victor, are you ready to decide on the value of the answers provided and accept the best one? This is the way StackOverflow works, isn't it? – user3666197 Nov 18 '16 at 16:07
  • Still not decided, @Victor, when to click an Accept at the best Answer provided :o)? – user3666197 Sep 25 '20 at 18:41

2 Answers

2

Item 3: using pure ZeroMQ built-ins

Fig.1: Why it is wrong to use a naive REQ/REP

               XTRN_RISK_OF_FSA_DEADLOCKED ~ {  NETWORK_LoS
                                         :   || NETWORK_LoM
                                         :   || SIG_KILL( App2 )
                                         :   || ...
                                         :      }
                                         :
[App1]      ![ZeroMQ]                    :    [ZeroMQ]              ![App2] 
code-control! code-control               :    [code-control         ! code-control
+===========!=======================+    :    +=====================!===========+
|           ! ZMQ                   |    :    |              ZMQ    !           |
|           ! REQ-FSA               |    :    |              REP-FSA!           |
|           !+------+BUF> .connect()|    v    |.bind()  +BUF>------+!           |
|           !|W2S   |___|>tcp:>---------[*]-----(tcp:)--|___|W2R   |!           |
|     .send()>-o--->|___|           |         |         |___|-o---->.recv()     |
| ___/      !| ^  | |___|           |         |         |___| ^  | |!      \___ |
| REQ       !| |  v |___|           |         |         |___| |  v |!       REP |
| \___.recv()<----o-|___|           |         |         |___|<---o-<.send()___/ |
|           !|   W2R|___|           |         |         |___|   W2S|!           |
|           !+------<BUF+           |         |         <BUF+------+!           |
|           !                       |         |                     !           |
|           ! ZMQ                   |         |   ZMQ               !           |
|           ! REQ-FSA               |         |   REP-FSA           !           |
~~~~~~~~~~~~~ DEADLOCKED in W2R ~~~~~~~~ * ~~~~~~ DEADLOCKED in W2R ~~~~~~~~~~~~~
|           ! /\/\/\/\/\/\/\/\/\/\/\|         |/\/\/\/\/\/\/\/\/\/\/!           |
|           ! \/\/\/\/\/\/\/\/\/\/\/|         |\/\/\/\/\/\/\/\/\/\/\!           |
+===========!=======================+         +=====================!===========+

Fig.2: How to implement requirement Item 3, using pure ZeroMQ builtins.

App1 can check for App2's message without blocking, using either App1.PULL.recv( ZMQ.NOBLOCK ) or App1.PULL.poll( 0 ).

[App1]      ![ZeroMQ]
code-control! code-control           
+===========!=======================+
|           !                       |
|           !+----------+           |         
|     .poll()|   W2R ___|.bind()    |         
| ____.recv()<----o-|___|-(tcp:)--------O     
| PULL      !|      |___|           |   :   
|           !|      |___|           |   :   
|           !|      |___|           |   :   
|           !+------<BUF+           |   :     
|           !                       |   :                           ![App2]
|           !                       |   :     [ZeroMQ]              ! code-control
|           !                       |   :     [code-control         ! once gets started ...
|           !                       |   :     +=====================!===========+
|           !                       |   :     |                     !           |
|           !                       |   :     |         +----------+!           |
|           !                       |   :     |         |___       |!           |
|           !                       |   :     |         |___| <--o-<.send()____ |
|           !                       |   :<<-------<tcp:<|___|   W2S|!      PUSH |
|           !                       |   :    .connect() <BUF+------+!           |
|           !                       |   :     |                     !           |
|           !                       |   :     |                     !           |
+===========!=======================+   :     +=====================!===========+
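
The layout in Fig.2 could be sketched roughly as follows with pyzmq. This is a minimal illustration, not a definitive implementation: the endpoint, the `APP2_UP` payload, and all function names are assumptions made up for the example, and `inproc://` is used only so the sketch runs inside one process (a real deployment would use a `tcp://` endpoint). It also assumes App2 can be modified, or wrapped, to send one message when it starts.

```python
import zmq

# Assumption: stand-in endpoint; across hosts this would be something like "tcp://*:5555".
ENDPOINT = "inproc://app2-presence"


def app1_open_presence_socket(ctx, endpoint=ENDPOINT):
    """App1 side: bind a PULL socket. This succeeds whether or not App2 is running."""
    pull = ctx.socket(zmq.PULL)
    pull.bind(endpoint)
    return pull


def app1_check_app2(pull, timeout_ms=0):
    """Return App2's announcement if one has arrived, else None.

    With timeout_ms=0 the call returns immediately, as in .poll( 0 ) in Fig.2.
    """
    if pull.poll(timeout_ms, zmq.POLLIN):
        return pull.recv()
    return None


def app2_announce(ctx, endpoint=ENDPOINT, payload=b"APP2_UP"):
    """App2 side: once started, connect a PUSH socket and send one announcement."""
    push = ctx.socket(zmq.PUSH)
    push.connect(endpoint)
    push.send(payload)
    return push


if __name__ == "__main__":
    ctx = zmq.Context()
    pull = app1_open_presence_socket(ctx)
    print("before App2 starts:", app1_check_app2(pull))
    push = app2_announce(ctx)
    print("after App2 starts:", app1_check_app2(pull, timeout_ms=1000))
    push.close(linger=0)
    pull.close(linger=0)
    ctx.term()
```

App1 would call `app1_check_app2(pull)` from its own event loop alongside its other work. Because the PULL side binds and the PUSH side connects, neither application needs the other to be running when it starts.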
user3666197
1

I am not an expert, but I once implemented a similar platform.

An additional layer of signaling (REQ/REP) from App2->App1 could do that.

Every time App2 comes online, a message should be sent to App1.

A separate thread in App1 would be able to receive this msg from App2 anytime.
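
A rough sketch of that idea with pyzmq might look like the following. Everything named here (`start_presence_listener`, `announce_up`, the `APP2_UP` and `ACK` payloads, the endpoint passed in) is hypothetical and chosen only for illustration. The bounded `poll()` timeouts and the close-on-timeout of the REQ socket are my own additions, meant to keep either side from blocking forever if the peer disappears mid-exchange.

```python
import threading
import zmq


def start_presence_listener(ctx, endpoint, on_up, stop_event):
    """App1 side: run a REP socket in a background thread, calling on_up(msg) per announcement."""
    ready = threading.Event()

    def run():
        rep = ctx.socket(zmq.REP)
        rep.bind(endpoint)
        ready.set()
        try:
            while not stop_event.is_set():
                # Wait at most 100 ms so the stop flag is re-checked regularly.
                if rep.poll(100, zmq.POLLIN):
                    on_up(rep.recv())
                    rep.send(b"ACK")
        finally:
            rep.close(linger=0)

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    ready.wait(5)  # give the listener a moment to bind before returning
    return thread


def announce_up(ctx, endpoint, timeout_ms=1000):
    """App2 side: send one announcement; return the ACK, or None if none arrives in time."""
    req = ctx.socket(zmq.REQ)
    req.setsockopt(zmq.LINGER, 0)
    req.connect(endpoint)
    try:
        req.send(b"APP2_UP")
        if req.poll(timeout_ms, zmq.POLLIN):
            return req.recv()
        return None  # no ACK: the socket is discarded below rather than reused
    finally:
        req.close()
```

App1 would pass a callback such as `seen.append` (or one that sets its own "App2 is up" flag) as `on_up`, and App2 would call `announce_up(...)` once at startup, retrying with a fresh socket if it returns `None`.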

user3666197
  • With all due respect, `REQ/REP` is a risky pattern, as it fails to detect, let alone resolve, a mutual deadlock of its internal **F**inite-**S**tate-**A**utomata. Pieter HINTJENS wrote countless posts about this **unavoidable** risk, which is always present in use-cases over an intrinsically unreliable transport class. – user3666197 Jul 02 '16 at 18:13