1

Hi I am working with akka streams along with akka-stream-kafka. I am setting up a Stream with the below setup:

Source (Kafka) --> | Akka Actor Flow | --> Sink (MongoDB)

Actor Flow basically by Actors that will process data, below is the hierarchy:

                                      System
                                         | 
                                     Master Actor  
                                      /       \
                          URLTypeHandler     SerializedTypeHandler
                             /       \                   |
                     Type1Handler   Type2Handler     SomeOtherHandler

So Kafka has the message, I write up the consumer and run it in atMostOnceSource configuration and use

Consumer.Control control =
            Consumer.atMostOnceSource(consumerSettings, Subscriptions.topics(TOPIC))
                    .mapAsyncUnordered(10, record -> processAccessLog(rootHandler, record.value()))
                    .to(Sink.foreach(it -> System.out.println("FinalReturnedString--> " + it)))
                    .run(materializer);

I've used a print as a sink initially, just to get the flow running.

and the processAccessLog is defined as:

private static CompletionStage<String> processAccessLog(ActorRef handler, byte[] value) {

    handler.tell(value, ActorRef.noSender());

    return CompletableFuture.completedFuture("");
}

Now, from the definition ask must be used when an actor is expecting a response, makes sense in this case since I want to return values to be written in the sink.

But everyone (including docs), mention to avoid ask and rather use tell and forward, an amazing blog is written on it Don't Ask, Tell.

In the blog he mentions, in case of nested actors, use tell for the first message and then use forward for the message to reach the destination and then after processing directly send the message back to the root actor.

from the blog

Now here is the problem,

  1. How do I send the message from D back to A, such that I can still use the sink.
  2. Is it good practice to have open ended streams? e.g. Streams where Sink doesn't matter because the actors have already done the job. (I don't think it is recommend to do so, seems flawed).
Ramón J Romero y Vigil
  • 17,373
  • 7
  • 77
  • 125
iam.Carrot
  • 4,976
  • 2
  • 24
  • 71

2 Answers2

2

ask is Still the Right Pattern

From the linked blog article, one "drawback" of ask is:

blocking an actor itself, which cannot pick any new messages until the response arrives and processing finishes.

However, in akka-stream this is the exact feature we are looking for, a.k.a. "back-pressure". If the Flow or Sink are taking a long time to process data then we want the Source to slow down.

As a side note, I think the claim in the blog post that the additional listener Actor results in an implementation that is "dozens times heavier" is an exaggeration. Obviously an intermediate Actor adds some latency overhead but not 12x more.

Elimination of Back-Pressure

Any implementation of what you are looking for would effectively eliminate back-pressure. An intermediate Flow that only used tell would continuously propagate demand back to the Source regardless of whether or not your processing logic, within the handler Actors, was completing its calculations at the same speed that the Source is generating data. Consider an extreme example: what if your Source could produce 1 million messages per second but the Actor receiving those messages via tell could only process 1 message per second. What would happen to that Actor's mailbox?

By using the ask pattern in an intermediate Flow you are purposefully linking the speed of the handlers and the speed with which your Source produces data.

If you are willing to remove back-pressure signaling, from the Sink to the Source, then you might as well not use akka-stream in the first place. You can have either back-pressure or non-blocking messaging, but not both.

Ramón J Romero y Vigil
  • 17,373
  • 7
  • 77
  • 125
  • Thanks for the great explanation, what I ended up doing was I processed my stream using `.mapAsyncUnordered(10, record -> ask(ActorA, record.value()`, followed with `forward` from `ActorA -> ActorB -> ActorC` and ActorC replies to stream using `getSender().tell(myResponse, ActorRef.noSender());`.And then leverage `Sink` for all outward data store That should still propagate back pressure right? (as per my understanding) – iam.Carrot Nov 20 '18 at 19:11
  • @iam.Carrot You are welcome. As long as there is an ask involved then yes, back pressure is propagated. – Ramón J Romero y Vigil Nov 20 '18 at 20:51
  • just a quick thing, is it relevant to perform a `ActorRef.ask()` at `Sink` as well if you have a whole actor system to handle `Sink`? or would it provide backpressure even if we have `tell` – iam.Carrot Nov 23 '18 at 08:49
1

Ramon J Romero y Vigil is right but I will try to extend the response.

1) I think that the "Don't ask, tell" dogma is mostly for Actor systems architecture. Here you need to return a Future so the stream can resolve the processed result, you have two options:

  • Use ask
  • Create an actor per event and pass them Promise so a Future will be complete when this actor receives the data (you can use the getSender method so D can send the response to A). There is no way to send a Promise or Future in a message (The are not Serialisable) so the creation of this short living actors can not be avoided.

At the end you are doing mostly the same...

2) It's perfectly fine to use an empty Sink to finalise the stream (indeed akka provides the Sink.ignore() method to do so).

Seems like you are missing the reason why you are using streams, they are cool abstraction to provide composability, concurrency and back pressure. In the other hand, actors can not be compose and is hard to handle back pressure. If you don't need this features and your actors can have the work done easily you shouldn't use akka-streams in first place.

RoberMP
  • 1,306
  • 11
  • 22