
According to the MPI implementation of Storm, the workers manage connections to other workers and maintain a mapping from task to task. The transfer function takes a task id and a tuple, serializes the tuple, and puts it onto a "transfer queue".
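Roughly, as a sketch of what I understand that transfer step to be (not Storm's actual code; the queue and the names below are made up for illustration):

```python
# Hypothetical sketch of the "transfer" step described above: take a task id
# and a tuple, serialize the tuple, and put the pair onto a "transfer queue".
import pickle
import queue

transfer_queue = queue.Queue()      # stands in for the worker's transfer queue

def transfer(task_id, tup):
    payload = pickle.dumps(tup)     # serialize the tuple
    transfer_queue.put((task_id, payload))

# A sender thread would later drain transfer_queue and push each payload over
# the connection that the worker maintains for that destination task id.
transfer(7, ("word", 42))
```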

The question is whether there is a way to organise scheduling such that, at a given time, certain tasks of an operator communicate only with certain tasks of the following operator, according to the application's topology (could ZeroMQ possibly do something like this?).

Nicole

1 Answer


Q : "If there is a way to organise scheduling, such that certain tasks of an operator communicate to only certain tasks of the following operator at a given time according to the application’s topology ( could ZeroMQ possibly do something like this? )."

Obviously it could,
it does allow smart & flexible creation of signalling/messaging meta-plane(s) infrastructure(s), and it has kept improving itself in doing this for about the last 12+ years.
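As a minimal, hedged sketch (assuming pyzmq is installed; the inproc endpoint and the topic names below are made up, not anything Storm or ZeroMQ prescribes), topic-filtered PUB/SUB already lets a downstream task receive only from the upstream tasks it is wired to:

```python
# Minimal pyzmq sketch: each upstream task publishes under its own topic,
# and a downstream task subscribes only to the upstream tasks it cares about.
import time
import zmq

ctx = zmq.Context.instance()

pub = ctx.socket(zmq.PUB)                  # the upstream operator's side
pub.bind("inproc://operator-a")

sub = ctx.socket(zmq.SUB)                  # one downstream task
sub.connect("inproc://operator-a")
sub.setsockopt(zmq.SUBSCRIBE, b"task-3")   # listen only to upstream task 3

time.sleep(0.1)                            # let the subscription propagate

pub.send_multipart([b"task-1", b"dropped for this subscriber"])
pub.send_multipart([b"task-3", b"delivered: serialized tuple"])

topic, payload = sub.recv_multipart()
print(topic, payload)                      # only the task-3 message arrives

ctx.destroy()
```

ROUTER/DEALER sockets with explicit peer identities achieve the same selective wiring for request-style traffic; the point is only that ZeroMQ leaves the routing topology entirely in your hands.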

The URL attached in @HristoIlliev's comment details that Apache-Storm itself reports to already use the ZeroMQ-layer for its own services (as of ver. 0.8.0; unfortunately, almost all of the implementation (source-code) links there are already dead):

  • The implementation for distributed mode uses ZeroMQ code
  • The implementation for local mode uses in-memory Java queues (so that it's easy to use Storm locally without needing to get ZeroMQ installed) code ...
  • Tasks listen on an in-memory ZeroMQ port for messages from the virtual port code

So the topology-related part of your question relates to a decision that was already made on this subject in the "outer" Apache-Storm architecture.

Tasks are responsible for message routing. A tuple is emitted either to a direct stream (where the task id is specified) or a regular stream. In direct streams, the message is only sent if that bolt subscribes to that direct stream. In regular streams, the stream grouping functions are used to determine the task ids to send the tuple to.
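For illustration only (this is not Storm's API, just a sketch of the routing idea above), a stream grouping can be pictured as a function from an emitted tuple to the list of target task ids:

```python
# Illustrative sketch of stream groupings: map an emitted tuple to the task
# ids of the subscribing bolt that should receive it.
import random

def shuffle_grouping(target_tasks, tup):
    # regular stream, shuffle grouping: any one downstream task will do
    return [random.choice(target_tasks)]

def fields_grouping(target_tasks, tup, key):
    # regular stream, fields grouping: same key value -> same downstream task
    # (a real system would use a stable hash rather than Python's hash())
    return [target_tasks[hash(tup[key]) % len(target_tasks)]]

def direct_grouping(task_id):
    # direct stream: the emitter itself names the exact receiving task id
    return [task_id]

tasks_of_next_bolt = [4, 5, 6]
print(fields_grouping(tasks_of_next_bolt, {"word": "storm"}, key="word"))
print(shuffle_grouping(tasks_of_next_bolt, {"word": "storm"}))
print(direct_grouping(5))
```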

MPI has done the same for the HPC-focused computing ecosphere ever since FORTRAN jobs started to run on the first distributed HPC infrastructures. Because most HPC-computing problems were "simply" scaled onto larger footprints of computing hardware, the MPI focus was more on the efficiency of such uniform scaling, and it thus never visited the opposite corner: adaptive, almost ad-hoc setups of message-passing infrastructure and layered topologies of specialised ZeroMQ Scalable Formal Communication Archetype Patterns. Each of the tools thus focuses on different factors.

If you feel you want to read a bit more on ZeroMQ, this answer might help you quickly understand the core underlying concepts.

user3666197
  • I think the MPI in the question is not the MPI you are writing about. – Hristo Iliev Nov 09 '20 at 20:20
  • By the way, since you've mentioned it, MPI did not appear with the first distributed memory computers. It is a much later effort to unify the various incompatible vendor message passing libraries in order to enable source code portability. Other similar efforts exist, such as PVM which predates MPI by about five years. PVM unfortunately died c. 2009, but it supports fault tolerance (FT), which is becoming very important as HPC systems are growing ever larger and moving into the relatively unreliable cloud. MPI still doesn't have FT. – Hristo Iliev Nov 09 '20 at 20:53
  • I remember such wild tools as Rob Pike's Plan9 / Inferno and the early days of PVM :o) Time goes so, so fast. The recent "cloudy" direction of HPC actually quite fascinates me, due to the awfully (incomparably) large / uncontrolled latencies and the hard-to-manage (if managed at all) transfer-channel capacities, load-prioritisation, load-balancing and topologies. Sure, wider & cheaper "resources" might seem attractive for growing HPC-loads, yet the costs of uncertainty and the costs of re-engineering (performance-motivated, already super-optimised) code for a cloudy future seem rather high, don't they? – user3666197 Nov 09 '20 at 21:06
  • That's why alternatives to MPI (such as DAGs and stream processing) are growing in popularity while MPI is staying entrenched in its very niche segment of HPC. – Hristo Iliev Nov 09 '20 at 21:09