3

AWS Lambda supports a parallelization factor for Kinesis and DynamoDB event sources, but it's not supported for MSK. Can we set reserved concurrency on the Lambda function, and would that help it consume concurrently from an MSK topic?

dvlpr
  • 311
  • 3
  • 17

4 Answers

1

TL;DR: Connecting Lambda to a Kafka cluster through an event source mapping limits concurrency to the number of partitions in the topic.

I set up a PoC of

Custom Kafka Cluster Topic (1 Partition) > EventSourceMapping > Lambda

and after opening a discussion with AWS, it looks like this is a limitation.
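To make the limitation concrete, here is a rough model of the ceiling, assuming at most one consumer per partition and that reserved concurrency (if set) only caps the function further. The function name `max_parallel_invocations` is made up for illustration; it is not an AWS API.

```python
from typing import Optional


def max_parallel_invocations(partition_count: int,
                             reserved_concurrency: Optional[int] = None) -> int:
    """Upper bound on concurrent invocations for one topic (simplified model).

    Assumption: the event source mapping runs at most one consumer per
    partition, and reserved concurrency can only lower that ceiling.
    """
    if partition_count < 1:
        raise ValueError("partition_count must be at least 1")
    if reserved_concurrency is None:
        # No reserved concurrency configured: partitions are the only cap.
        return partition_count
    if reserved_concurrency < 0:
        raise ValueError("reserved_concurrency cannot be negative")
    return min(partition_count, reserved_concurrency)


# The PoC above: 1 partition, so reserving 50 does not add parallelism.
print(max_parallel_invocations(1, reserved_concurrency=50))
# A 6-partition topic with reserved concurrency of 2 is throttled to 2.
print(max_parallel_invocations(6, reserved_concurrency=2))
```

In other words, under this model reserved concurrency can reduce parallelism but cannot raise it above the partition count; adding partitions is what lifts the ceiling.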

Another approach, which I haven't tried, is to set up a Lambda sink connector (Kafka Connect) and raise tasks.max, which looks like it could solve this issue: https://docs.confluent.io/kafka-connectors/aws-lambda/current/overview.html#lambda-sink-multiple-tasks
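A minimal sketch of what that connector config could look like, built as the JSON body you would POST to the Kafka Connect REST API. I haven't run this: the property names follow my reading of the linked Confluent docs, so verify them there. The helper `lambda_sink_config`, the topic and the function name are hypothetical, and other settings (region, credentials, licensing) are left out.

```python
import json


def lambda_sink_config(topic: str, function_name: str,
                       partition_count: int, requested_tasks: int) -> dict:
    """Build a connector definition with tasks.max capped at the partition count.

    Sink tasks beyond the number of partitions would sit idle, so the
    requested value is clamped to the range [1, partition_count].
    """
    tasks = max(1, min(requested_tasks, partition_count))
    return {
        "name": f"lambda-sink-{topic}",
        "config": {
            # Property names as I read them in the Confluent docs (unverified).
            "connector.class": "io.confluent.connect.aws.lambda.AwsLambdaSinkConnector",
            "tasks.max": str(tasks),
            "topics": topic,
            "aws.lambda.function.name": function_name,
            "aws.lambda.invocation.type": "sync",
        },
    }


# Hypothetical topic and function names, 6 partitions, 10 tasks requested.
definition = lambda_sink_config("orders", "process-orders", 6, 10)
print(json.dumps(definition, indent=2))
```

Here the requested 10 tasks are clamped to 6, since each partition is read by only one task at a time.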

Omer Shacham
  • 618
  • 4
  • 11
  • 1
    But can we force a Lambda to run one consumer per partition, instead of sometimes running one consumer for many partitions and thus getting lower throughput? – pbsb Nov 26 '22 at 04:25
  • 1
    @pbsb Good question. How do you configure the lambda to spin up a new instance for each topic partition? Mine is just spinning up one consumer for the whole topic (6 partitions) – Arran Duff Jul 11 '23 at 12:29
0

The documentation is pretty sparse on specifics. I was also looking for this, and the only thing I've found is this workshop: https://amazonmsk-labs.workshop.aws/en/msklambda/tpschemareg/overview.html

In it, they read from MSK and post to Kinesis so that Lambda can process in parallel. If that is the intended pattern, it seems the MSK event source is there mainly for migration. Only one consumer is pretty limiting.

Maybe someone who experimented more can leave a better answer.

krishwin's comment at the bottom of this article also seems to indicate this. https://dev.to/danieljameskay/triggering-lambda-functions-from-amazon-msk-316o

A better option might be an AWS Lambda sink connector. It looks like it will run Lambda processes in parallel, up to the number of partitions.

Russell
  • 26
  • 3
-1

You can set ParallelApplyThreads as more than 1 in the TargetMetadata on the task settings.

Check this document.

hey
  • 137
  • 1
  • 5
-1

Lambda has auto scaling for MSK event sources, which controls the concurrency. You usually do not need to set the concurrency yourself unless you have a specific need. https://aws.amazon.com/about-aws/whats-new/2022/01/aws-lambda-auto-scaling-msk-apache-kafka/
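If you want to see how the scaling actually distributes partitions, one option is to log which partitions arrive in each invocation. This is only a sketch: it assumes the Kafka event shape where `event["records"]` maps a "topic-partition" key to a list of records with `topic`, `partition` and a base64-encoded `value`, and the sample event at the bottom is made up.

```python
import base64
from collections import Counter


def handler(event, context):
    """Count records per topic-partition in one invocation and log the result."""
    counts = Counter()
    for records in event.get("records", {}).values():
        for record in records:
            counts[(record["topic"], record["partition"])] += 1
            value = record.get("value")
            # Values are base64-encoded; tombstone records may carry no value.
            payload = base64.b64decode(value) if value else b""
            # ... process payload here ...
    summary = {f"{topic}-{part}": n for (topic, part), n in sorted(counts.items())}
    print(summary)  # ends up in the function's CloudWatch logs
    return summary


# Local smoke test with a made-up event covering two partitions.
sample = {
    "records": {
        "orders-0": [
            {"topic": "orders", "partition": 0, "offset": 1,
             "value": base64.b64encode(b"a").decode()},
            {"topic": "orders", "partition": 0, "offset": 2,
             "value": base64.b64encode(b"b").decode()},
        ],
        "orders-1": [
            {"topic": "orders", "partition": 1, "offset": 7,
             "value": base64.b64encode(b"c").decode()},
        ],
    }
}
handler(sample, None)
```

If every invocation reports all partitions, a single consumer is reading the whole topic; if invocations report disjoint partitions, the consumers have scaled out.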
