13

I am having a look at the pipes 3.0 package for stream processing. The tutorial is very well done and very clear, except that I cannot wrap my head around the "zip and merge" section.

My goal is to combine pipes a bit like ArrowChoice allows to do:

  • I have a unique producer of Either a a
  • I would like to apply a first pipe to Left values and another one to Right values
  • I would then like to merge the results, and continue piping


+----------+                   +------+ - filterLeft ->  pipe1 -> +------------+ 
| producer | - (Either a a) -> | fork |                           | mergeD (?) |
+----------+                   +------+ - filterRight -> pipe2 -> +------------+

I define fork like in the tutorial:

fork () = 
    runIdentityP . hoist (runIdentityP . hoist runIdentityP) $ forever $ do
        a <- request ()
        lift $ respond a
        lift $ lift $ respond a

oddOrEven x = if odd x then Left x else Right x
producer = fromListS [1..0] >-> mapD oddOrEven
isLeft (Left _) = True
isLeft (Right _) = False
isRight = not . isLeft
filterLeft = filterD isLeft
filterRight = filterD isRight
pipe1 = mapD (\x -> ("seen on left", x))
pipe2 = mapD (\x -> ("seen on right", x))

p1 = producer >-> fork    

The problem is that I cannot make the types right. The tutorial seems only to show how to run the inner (lifted) pipe chain as a self contained session, but I would like to be able to reinject its values to the pipe, not just apply an effect on them. I of course tried to follow the types, but they get a bit hairy very quickly.

Does Anyone can help me on this ? Thanks in advance.

(PS: an example of this kind of topology would be a nice addition to the tutorial, or even better a section on how to emulate the Control.Arrow stuff using pipes)

Davorak
  • 7,362
  • 1
  • 38
  • 48
LeMiz
  • 5,554
  • 5
  • 28
  • 23
  • Hm, not enough time right now to answer fully, but here's some food for thought: http://hpaste.org/80426 – Dan Burton Jan 07 '13 at 14:17
  • Are you looking for bidirectionally along both branches or is this strictly a one way flow left to right? – Davorak Jan 07 '13 at 17:34
  • @Dan Burton, thanks for the pastie. I am not really at ease with Kleisli stuff, and that was in fact the reason I am looking at pipes ! everything looked much simpler with it ! – LeMiz Jan 07 '13 at 20:50
  • @Davorak, I do not need bidirectionnality at all, the flow is a direct acyclic graph (essentially, I am just trying to build stateful filters) – LeMiz Jan 07 '13 at 20:52

1 Answers1

15

The pipe abstraction does not support diamond topologies or any form of Arrow-like behavior. This is not an API issue, but rather there is no correct or well-defined behavior for such a scenario.

To explain why, allow me to simplify your diagram to the following one:

          +----+
          | pL |
+----+ => +----+ => +----+
| p1 |              | p2 |
+----+ => +----+ => +----+
          | pR |
          +----+

Imagine we are at the p1 pipe and we respond to pL. If you remember the tutorial, the proxy laws require that every respond blocks until upstream. That means that p1 cannot regain control until pL requests again. So at this point we have:

  • p1 blocked waiting for a request from pL

However, suppose that pL does not request yet and instead responds with a value of its own to p2. So now we have:

  • p1 blocked waiting for a request from pL
  • pL blocked waiting for a request from p2

Now suppose that p2 instead requests from pR. The proxy laws say that p2 cannot regain control until pR responds again. Now we have:

  • p1 blocked waiting for a request from pL
  • pL blocked waiting for a request from p2
  • p2 blocked waiting for a respond from pR

Now what happens when pR requests a value from p1? If we consult our list of blocks, p1 is still blocked waiting for a request from pL, so it is in no shape to receive a request from pR. There is no correct way to "tie the knot", so to speak, even if pL and pR shared the same request signature.

More generally, the proxy laws ensure the following two invariants:

  • Every pipe "upstream" of the active pipe will be blocked on a respond
  • Every pipe "downstream" of the acive pipe will be blocked on a request

Cycles or diamonds break these invariants. This is why the tutorial very briefly remarks in passing that cyclic topologies do not "make sense".

You can see why diamonds break this invariant in the example I just gave you. When p1 had control it was upstream of pR, which would imply pR was blocked on a request. However, when p2 gained control it was downstream of pR, which would imply pR was blocked on a respond. This leads to a contradiction, because pR couldn't have changed yet since control flowed through pL and not pR to get to p2.

Machines

So there are two solutions to your problem. one solution is to just inline your desired splitting behavior into a single pipe. You define a pE pipe that combines the behavior of pL and pR into a single pipe.

The more elegant solution to this problem is something in the style of Edward's machines. You define a more restricted abstraction that is less powerful than proxies that supports ArrowChoice, you do your arrow-ish stuff within the domain of that abstraction, and then when you are done you upgrade it to proxies.

If you squint, you could pretend that there is a category of currently available coroutine abstractions in Haskell that is a partial order. Coroutines abstractions are the objects, and an arrow from coroutine abstraction C1 to coroutine abstraction C2 means that you can embed coroutines of type C1 in coroutines of type C2 (i.e. C1 is an improper subset of C2).

In this partial order, proxies would probably be the terminal object, meaning that you can think of proxies as the assembly language of coroutines. Following the analogy of assembly language, proxies provide less guarantees, but you can embed more restrictive coroutine abstractions (i.e. higher-level languages) within proxies. These higher-level languages provide greater restrictions which enables more powerful abstractions (i.e. an Arrow instance).

If you want a trivial example of this, consider one of the simplest coroutine abstractions: the Kleisli arrow:

newtype Kleisli m a b = Kleisli { runKleisli :: a -> m b }

instance Category (Kleisli m) where
    id = Kleisli return
    (Kleisli f) . (Kleisli g) = Kleisli (f <=< g)

Kleisli arrows are definitely more restrictive than proxies, but because of this restriction they support an Arrow instance. So whenever you need an Arrow instance you write your code using Kleisli arrows, and combine it using Arrow notation, and then when you are done, you can "compile" that higher-level Kleisli code to the proxy assembly code using mapMD:

kleisliToProxy :: (Proxy p) => Kleisli m a b -> () -> Pipe p a b m r
kleisliToProxy (Kleisli f) = mapMD f

This compilation obeys the functor laws:

kleisliToProxy id = idT

kleisliToProxy (f . g) = kleisliToProxy f <-< kleisliToProxy g

So if your branching code can be written in terms of Kleisli arrows, then use Kleisli arrows for that section of the code and then compile it down to proxies when you are done. Using this trick, you can compile multiple coroutine abstractions down to the proxy abstraction to mix them.

Gabriella Gonzalez
  • 34,863
  • 3
  • 77
  • 135
  • Thanks for the great and visual explanation. I remarqued the "en passant" remark that cycles were impossible but I did not realized forking created a loop because of directionality, because I wrongly believed that typing as pipes would enforce mono directionallity. – LeMiz Jan 07 '13 at 20:53
  • On the machines part, my primary concern is about being able to reuse / compose the components as much as possible, because for the applications I am considering (statistical estimators on real time data), the filters can get very complicated quite quickly. I now realise that my dream would be that I would be able to say: "this is a pipe !" "this is a proxy !" and so on, that everything would compose nicely and that the type system would prove *for me* that there is no cycle in the flow graph... – LeMiz Jan 07 '13 at 21:06
  • @LeMiz I understand. When I wrote the proxy type I made many sub-interfaces (producers, pipes, etc.) work so that you wouldn't have to define conversion functions. However, there are some interfaces that require even a great restriction of the proxy type that cannot be accomplished simply by choosing the right type parameters. The issue has nothing to do with bidirectionality as even monodirectional pipes have the same problem. Rather, the issue is that the flow of control can never loop on itself. It may only retrace its own steps. – Gabriella Gonzalez Jan 07 '13 at 21:13
  • I understand, and writing the forking parts of the flow with Kleisli arrows would be quite acceptable. But let me play the Devil's advocate for 1 minute: why shouldn't I use this for every part of the flow if I don't need bidirectionality (except for taking profit of the nice functions of the API, and most important of all, of the clear semantics and language ?) – LeMiz Jan 07 '13 at 21:23
  • 1
    On the contrary, if you can write it completely in terms of Kleisli arrows you should! It implies that you don't need the full power of the proxy abstraction. If I may make an analogy: why use a monadic function when an ordinary function would suffice? The same applies here. Use the smallest abstraction that fits your problem. As nice as it is to reason about proxies, it is even easier to reason about Kleisli arrows, so I would actually encourage to write as much as your logic in Kleisli arrows as possible. – Gabriella Gonzalez Jan 07 '13 at 21:26
  • @GabrielGonzalez It's not actually true that this limitation exists with monodirectional pipes as well. See my blog post: http://paolocapriotti.com/blog/2012/02/04/monoidal-instances-for-pipes/ and my implementation of "multi-channel" pipes in pipes-core. – Paolo Capriotti Jan 15 '13 at 01:53
  • @PaoloCapriotti I remember that post. I'll revisit again, but I vaguely remember that I had difficulty proving the coproduct version of the arrow laws. – Gabriella Gonzalez Jan 15 '13 at 03:49
  • 3
    Gabriel has since revised his opinion and support for diamond topologies are offer in [pipes-3.2.0 and up through use of arrow-choice](http://www.haskellforall.com/2013/03/pipes-32-listt-codensity-arrowchoice.html). An example can be found in a question on [haskell-pipes-and-branching](http://stackoverflow.com/a/16206098/128583) – Davorak Apr 29 '13 at 18:31
  • Yeah, I saw that, and I am happy. I think that it would be useful to update this for the simpler API of pipes-4.0. For the record, I am trying to use pipes in the context of the random monad. My goal is to have a nice combinatory library for stochastic processeS. – LeMiz Sep 09 '13 at 07:16
  • @LeMiz I will eventually revise this. The reason I haven't done so yet is that I've actually discovered that push-based pipes seem to form both an `Arrow` and `ArrowChoice`. However, I'm trying to prove that they satisfy the relevant laws, first. – Gabriella Gonzalez Sep 10 '13 at 03:36
  • Do you mean that the push category would be an arrow but not the pull one ? Is there not a nice symmetry between the two ? anyway, keep up the great work ! – LeMiz Sep 10 '13 at 06:47
  • They are symmetric, but when you reflect a push-based `Pipe` you get a pull-based `CoPipe`. That's why they are not easily interchangeable. – Gabriella Gonzalez Sep 14 '13 at 02:44