Should all helper methods be an actor of their own? How do I make the separation?
In addressing these questions, it's helpful to think about what an actor is. Derek Wyatt, in his book Akka Concurrency, draws an analogy between an actor and a person (the following is an excerpt from the freely available fourth chapter of his book):
Your day-to-day world is full of concurrency. You impose it on yourself
as well as the people around you, and they impose it on you. The real-world
equivalents of critical sections and locks as well as synchronized methods
and data are all naturally handled by yourself and the people in your world.
People manage this by literally doing only one thing at a time. We like to
pretend that we can multi-task, but it’s simply not true. Anything meaningful
that we do requires that we do just that one thing. We can pause that task
and resume it later, switch it out for something else to work on and then
return to it, but actually doing more than one thing at a time just isn’t in our wheelhouse.
So what if we want to do more than one thing at a time? The answer is
pretty obvious: we just use more than one person. There’s not much in the
world that we’ve benefited from that wasn’t created by a gaggle of talented
people.
This is why actors make our application development more intuitive and
our application designs easier to reason about: they’re modeled after our
day-to-day lives.
And later in the same chapter, he writes:
Actors only do one thing at a time; that’s the model of concurrency. If you
want to have more than one thing happen simultaneously, then you need to
create more than one actor to do that work. This makes pretty good sense,
right? We’ve been saying all along that actor programming draws a lot on
your day-to-day life experiences. If you want work done faster, put more
people on the job.
In designing an actor system, it's a good principle to assign the major parts of the system to actors that have distinct responsibilities. For example, in an ETL pipeline, one might have:
- An actor that subscribes to a queue for new data.
- An actor that parses the raw data.
- An actor that filters parsed data for pieces of information in which the users are interested.
- An actor that saves the results to a database.
- An actor that publishes the results to a queue to which users can subscribe.
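As a rough illustration of that decomposition, here is a minimal sketch of three of those stages wired together. It deliberately does not use Akka: `MiniActor` is a hypothetical stand-in (a single-threaded mailbox that handles one message at a time), and the parsing and filtering rules (integers, keep even numbers) are made up purely for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical pipeline: parser -> filter -> saver, one "actor" per responsibility.
final class EtlPipelineSketch {
    static List<Integer> run(List<String> rawRecords) throws InterruptedException {
        // Stands in for the database; only the saver writes to it.
        List<Integer> saved = Collections.synchronizedList(new ArrayList<>());

        MiniActor<Integer> saver = new MiniActor<>(saved::add);
        MiniActor<Integer> filter = new MiniActor<>(n -> {
            if (n % 2 == 0) saver.tell(n); // made-up rule: users only want even numbers
        });
        MiniActor<String> parser = new MiniActor<>(s -> filter.tell(Integer.parseInt(s.trim())));

        for (String record : rawRecords) parser.tell(record);

        // Stop upstream stages first so downstream mailboxes can drain.
        parser.stop();
        filter.stop();
        saver.stop();
        return new ArrayList<>(saved);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(List.of(" 1", "2 ", "3", "4")));
    }
}

// Not an Akka actor: a single-threaded executor used as a mailbox,
// so each instance processes exactly one message at a time.
final class MiniActor<T> {
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();
    private final Consumer<T> behavior;

    MiniActor(Consumer<T> behavior) { this.behavior = behavior; }

    void tell(T message) { mailbox.execute(() -> behavior.accept(message)); }

    void stop() throws InterruptedException {
        mailbox.shutdown(); // already queued messages are still processed
        mailbox.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

Each stage knows only the next stage's mailbox, which is what lets you replace or scale one stage without touching the others. In a real Akka system, the actor classes and supervision would take the place of `MiniActor`.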
Let's say that the parser actor uses a helper method. Whether this method should instead be encapsulated in its own actor depends on what the method does. It's a good idea to break down tasks into sub-tasks, but at some point a sub-task becomes too small to justify being its own actor.
Returning to the person analogy, let's visualize each stage in the pipeline as a person. Let's assume that the parser person needs a sharpened pencil to do his work and that currently the parser gets the pencil himself. Would it make sense to hire a lowly intern (borrowing another illustration from Wyatt's book) whose sole job is to sharpen a pencil and give it to the parser person when requested? If a large inflow of data came in, would it suffice to hire more parser persons without hiring any interns? It probably wouldn't be efficient to have 30 pencil-sharpening interns for 10 parsers. In other words, we probably wouldn't need to scale the number of interns independently of the number of parsers. If we did need to do so, that would signal that the intern's job is significant enough to justify hiring the intern.
To summarize this subjective food for thought:
- Decompose your system into parts with clear and distinct responsibilities. Assign each part to an actor.
- If necessary, decompose each part into sub-tasks. Assign each sub-task to an actor. A useful test for whether further decomposition is worthwhile is whether you would ever need to scale the sub-task independently of the part that uses it.
- Functionality that is too small to be its own actor can be placed inside helper methods.
Where should the helper methods be?
A couple of ideas:
- If a helper method is relevant only to a certain actor, declare it inside the actor as you're doing.
- If a helper method could be used in different kinds of actors, declare it in a utility class. Alternatively, declare it as a static method within an actor.
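To make the two placements concrete, here is a small sketch in plain Java. All names (`ParserSketch`, `RecordUtils`, `splitFields`, `normalize`) are hypothetical, and `ParserSketch` is an ordinary class standing in for an actor; in Akka, `onMessage` would correspond to the actor's message-handling logic.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical stand-in for a parser actor (not an Akka class).
final class ParserSketch {
    // Called once per incoming message.
    List<String> onMessage(String rawLine) {
        return splitFields(RecordUtils.normalize(rawLine));
    }

    // Relevant only to the parser, so it stays private inside it.
    private List<String> splitFields(String line) {
        return Arrays.stream(line.split(","))
                .map(String::trim)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(new ParserSketch().onMessage("  A, b ,C "));
    }
}

// Hypothetical utility class for helpers shared by several kinds of actors.
final class RecordUtils {
    private RecordUtils() {} // static helpers only, never instantiated

    static String normalize(String raw) {
        return raw.trim().toLowerCase(Locale.ROOT);
    }
}
```

Because both helpers are pure functions with no mutable state, calling them from inside an actor does not weaken the one-message-at-a-time guarantee, and they can be unit tested without starting an actor system.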