How to evenly balance processing many simultaneous tasks?

Question

PROBLEM

Our processing service serves UI, API, and internal clients and listens for commands from Kafka. A few API clients might create a lot of generation tasks (one task is N messages) in a short time. With Kafka, we can't control how commands are distributed, because each command goes to a partition that is consumed by a single processing instance (aka worker). As a result, UI requests can wait too long while API requests are being processed.

In an ideal implementation, we would handle all tasks evenly, regardless of their size, with the capacity of the processing service distributed among all active tasks. Even if the cluster is heavily loaded, a newly arrived task should be able to start processing almost immediately, or at least before all the other tasks have finished.

SOLUTION

Instead, we want an architecture with separate queues per combination of customer and endpoint. This architecture gives us much better isolation, as well as the ability to dynamically adjust throughput on a per-customer basis.

On the side of the producer

the task comes from the client
immediately create a queue for this task
send all messages to this queue

On the side of the consumer

in one process, you constantly update the list of queues
in other processes, you follow this list and consume for example 1 message from each queue
scale consumers
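
The producer and consumer steps above can be modelled in memory as a rough sketch. This is not tied to Kafka or RabbitMQ; the class name `FairScheduler` and its methods are hypothetical and only illustrate the "one queue per task, one message from each queue per pass" idea.

```python
from collections import OrderedDict, deque


class FairScheduler:
    """In-memory model of per-task queues drained round-robin (hypothetical)."""

    def __init__(self):
        # task_id -> queue of pending messages, kept in arrival order
        self._queues = OrderedDict()

    def submit(self, task_id, messages):
        # Producer side: create a queue for the task and send all messages to it.
        self._queues.setdefault(task_id, deque()).extend(messages)

    def drain(self, batch=1):
        # Consumer side: take up to `batch` messages from each queue per pass.
        while self._queues:
            for task_id in list(self._queues):
                queue = self._queues[task_id]
                for _ in range(min(batch, len(queue))):
                    yield task_id, queue.popleft()
                if not queue:
                    del self._queues[task_id]  # drop queues of finished tasks


scheduler = FairScheduler()
scheduler.submit("api-task", [1, 2, 3, 4])  # large task arrives first
scheduler.submit("ui-task", ["a"])          # small task arrives later
order = list(scheduler.drain())
```

In this model the small UI task is served on the first pass instead of waiting behind all four API messages. A real deployment would still need the queue list to be shared between consumer processes, which this sketch does not cover.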

QUESTION

Is there a common solution to this kind of problem, using RabbitMQ or any other tooling? Historically, we have used Kafka on the project, so an approach that uses Kafka would be ideal, but we can use any technology for the solution.

Try reading about Apache Pulsar. It has various advantages over Kafka, one of them being automatic load balancing. See: https://dzone.com/articles/5-more-reasons-to-choose-apache-pulsar-over-kafka#:~:text=Pulsar%20does%20broker%20load%20balancing%20automatically%20for%20you.&text=usage%20of%20brokers%20will%20move,broker%20load%20balancing%20with%20Kafka. - aru_sha4, Jun 16 '20 at 06:18

score 2 · Answer 1 · answered Jun 16 '20 at 18:32

Why not use Spark to execute the messages within the task? What I'm thinking is that each worker creates a Spark context that then parallelizes the messages. The function that is mapped can be chosen based on which Kafka topic the user is consuming. I suspect, however, that your queues might hold tasks containing a mixture of messages (UI, API calls, etc.), which would result in a more complex mapping function. If you're not using a standalone cluster and are using YARN or something similar, you can change the queueing method that the Spark master uses.

score 0 · Answer 2 · answered Jun 11 '20 at 08:54

As I understand the problem, you want to isolate requests per customer using dynamically allocated queues, which would allow each customer's tasks to be executed independently. The problem looks similar to the head-of-line blocking issue in networking.

Dynamically allocating queues is difficult. It can also lead to an explosion in the number of queues, which can be a burden on the infrastructure. Also, some queues could be empty or carry very little load. RabbitMQ won't help here; it is a queue with a different protocol than Kafka.

One alternative is to use a custom partitioner in Kafka that looks at the partition load and balances the tasks based on that load. This works if the tasks are independent in nature and no state store is maintained in the worker.
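
As a rough illustration of the selection logic only: a real Kafka partitioner is a producer-side plugin and would need some way of obtaining the load figures, which is not shown here. The function names and the `lag_by_partition` mapping (partition id to number of unprocessed messages) are assumptions made for this sketch.

```python
def pick_partition(lag_by_partition):
    """Return the partition with the smallest backlog; ties go to the lowest id."""
    if not lag_by_partition:
        raise ValueError("no partitions available")
    return min(sorted(lag_by_partition), key=lambda p: lag_by_partition[p])


def assign(messages, lag_by_partition):
    """Place each message on the currently least-loaded partition."""
    lag = dict(lag_by_partition)  # work on a copy, leave the caller's data alone
    placement = []
    for message in messages:
        partition = pick_partition(lag)
        placement.append((message, partition))
        lag[partition] += 1  # account for the message that was just placed
    return placement


# Hypothetical backlog: partition 0 is busy, partition 1 is idle.
placement = assign(["m1", "m2", "m3"], {0: 5, 1: 0, 2: 1})
```

Because messages of one task end up on different partitions, this only makes sense under the condition stated above: tasks are independent and workers keep no per-task state.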

The other alternative would be to load balance at the customer level. In this case, you select a dedicated set of predefined queues for a set of customers: customers with certain IDs are served by a particular set of queues. The downside is that some queues can have less load than others. This solution is similar to virtual output queuing in networking.
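
A minimal sketch of such a mapping, assuming a fixed list of queue names (the names in `QUEUES` and the function `queue_for` are hypothetical) and a stable hash of the customer ID:

```python
import hashlib

# Hypothetical, predefined queue names; the real set depends on the deployment.
QUEUES = ["tasks-0", "tasks-1", "tasks-2", "tasks-3"]


def queue_for(customer_id, queues=QUEUES):
    """Map a customer ID to one of the predefined queues, stably across runs."""
    # hashlib is used instead of the built-in hash(), which is randomized
    # per process for strings and would send a customer to different queues.
    digest = hashlib.sha256(str(customer_id).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(queues)
    return queues[index]


assignments = {cid: queue_for(cid) for cid in ["customer-1", "customer-2", "customer-3"]}
```

The same customer always lands on the same queue, so one heavy customer can only delay the customers that share its queue. As noted above, the load across queues can still be uneven.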

score 0 · Answer 3 · answered Jun 11 '20 at 09:04

My understanding is that partitioning the messages does not ensure an even load balance. I think you should avoid overengineering and building custom logic on top of the Kafka partitioner, and instead think about a good partitioning key that will allow you to use Kafka efficiently.
