AWS Lambda Supports Parallelization Factor for Kinesis and DynamoDB Event Sources. But its not supported for MSK. Can we create a reserved concurrency of the Lambda function and would it help to concurrently consume from MSK topic
4 Answers
TL'DR Connecting lambda to kafka cluster using aws::event-source-mapping is limited to the amount of partitions you are having in the topics
I had the experience to setup a poc of
Custom Kafka Cluster Topic (1 Partition) > EventSourceMapping > Lambda
and after opening a discussion with AWS it looks like it is a limitation
Another approach that I didn't try is to setup a lambda sink (kafka connect) and setup a tasks.max
which seems like it can solve this issue
https://docs.confluent.io/kafka-connectors/aws-lambda/current/overview.html#lambda-sink-multiple-tasks

- 618
- 4
- 11
-
1but can we force a lambda to run 1 consumer per partition instead of sometimes running one consumer form many partitions and, thus, having a lower throughtput? – pbsb Nov 26 '22 at 04:25
-
1@pbsb Good question. How do you configure the lambda to spin up a new instance for each topic partition? Mine is just spinning up one consumer for the whole topic (6 partitions) – Arran Duff Jul 11 '23 at 12:29
The specifics in documentation is pretty sparse. I also was looking for this, the only thing I've found is from this: https://amazonmsk-labs.workshop.aws/en/msklambda/tpschemareg/overview.html
In it they read from MSK and post to Kinesis so that lambda can process in parallel. It seems like the MSK event source is there mainly for migration if true. Only one consumer is pretty limiting.
Maybe someone who experimented more can leave a better answer.
krishwin's comment at the bottom of this article also seems to indicate this. https://dev.to/danieljameskay/triggering-lambda-functions-from-amazon-msk-316o
A better option might be a AWS lambda sink connector. It looks like it will run a lambda process up to number of partitions:

- 26
- 3
Lambda has auto scaling to control the concurrency. You usually do not need set the concurrency unless you have specific need. https://aws.amazon.com/about-aws/whats-new/2022/01/aws-lambda-auto-scaling-msk-apache-kafka/

- 9
- 1
-
Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Mar 02 '22 at 08:08