8

I have already read some questions about Kinesis shards and multiple consumers, but I still don't understand how it works.

My use case: I have a Kinesis stream with just one shard. I would like to consume this shard using different Lambda functions, each of them independently. In effect, each Lambda function would have its own shard iterator.

Is it possible to set up multiple (stream-based) Lambda consumers reading from the same stream/shard?

p.magalhaes

5 Answers

6

Hey Mr Magalhaes, I believe the following picture should answer some of your questions.

[Figure: Processing Streams: Lambda]

So, to clarify: you can set multiple Lambdas as consumers on a Kinesis stream, but the Lambdas will block each other on processing. If your stream has only one shard, it will only have one concurrent Lambda.

David Webster
4

If you have one Kinesis stream, you can connect as many Lambda functions as you want, each through its own event source mapping.

All functions run simultaneously and fully independently of each other, and they are invoked whenever new records arrive in the stream. The number of shards does not matter.
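As a rough sketch of what "one event source mapping per function" could look like with boto3: the stream ARN and function names below are hypothetical, and `StartingPosition` and `BatchSize` are just example values. The request builder is kept separate from the AWS call so it can be inspected without credentials.

```python
from typing import Dict, List


def build_mapping_requests(stream_arn: str, function_names: List[str],
                           starting_position: str = "LATEST",
                           batch_size: int = 100) -> List[Dict[str, object]]:
    """Build one create_event_source_mapping request per Lambda function."""
    return [
        {
            "EventSourceArn": stream_arn,
            "FunctionName": name,
            "StartingPosition": starting_position,
            "BatchSize": batch_size,
        }
        for name in function_names
    ]


def attach_consumers(mapping_requests: List[Dict[str, object]]) -> None:
    """Send the requests to AWS. Needs boto3 and valid credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("lambda")
    for request in mapping_requests:
        client.create_event_source_mapping(**request)


# Hypothetical ARN and function names, for illustration only.
STREAM_ARN = "arn:aws:kinesis:us-east-1:123456789012:stream/example-stream"
mapping_requests = build_mapping_requests(STREAM_ARN, ["consumer-a", "consumer-b"])
for request in mapping_requests:
    print(request["FunctionName"], "->", request["EventSourceArn"])
```

Both requests point at the same stream ARN; only `FunctionName` differs, so each function gets its own mapping. Calling `attach_consumers(mapping_requests)` would actually create them.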

oneschilling
3

For a single Lambda function: "For Lambda functions that process Kinesis or DynamoDB streams the number of shards is the unit of concurrency. If your stream has 100 active shards, there will be at most 100 Lambda function invocations running concurrently. This is because Lambda processes each shard’s events in sequence." [https://docs.aws.amazon.com/lambda/latest/dg/scaling.html]

But there is no limit on how many different Lambda consumers you can attach to a Kinesis stream.
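A toy model may help picture this. It is not how Lambda is implemented; the `Shard` and `Consumer` classes are made up for illustration. The point is only that each consumer keeps its own read position, so one consumer falling behind does not move the other's position.

```python
from typing import List


class Shard:
    """Toy stand-in for one Kinesis shard: an append-only list of records."""

    def __init__(self) -> None:
        self.records: List[str] = []

    def put(self, record: str) -> None:
        self.records.append(record)


class Consumer:
    """Toy consumer that tracks its own read position in the shard."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.position = 0  # index of the next record this consumer will read

    def poll(self, shard: Shard, batch_size: int) -> List[str]:
        """Read up to batch_size records in order and advance only this consumer."""
        batch = shard.records[self.position:self.position + batch_size]
        self.position += len(batch)
        return batch


shard = Shard()
for i in range(5):
    shard.put(f"record-{i}")

fast = Consumer("fast")
slow = Consumer("slow")

fast_batch = fast.poll(shard, batch_size=5)  # reads everything available
slow_batch = slow.poll(shard, batch_size=2)  # reads only the first two records
print(fast.name, fast.position, slow.name, slow.position)
```

Within a single consumer the records are still read strictly in order, which matches the quoted "each shard's events in sequence" behaviour for one function.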

flare
2

Short answer:

Yes it will work, and will work concurrently.

Long answer:

Each shard in a Kinesis stream has 2 MiB/sec of read throughput: https://docs.aws.amazon.com/streams/latest/dev/building-consumers.html

If you have multiple applications (in your case, Lambdas), they will share the throughput. A description taken from the link above:

Fixed at a total of 2 MiB/sec per shard. If there are multiple consumers reading from the same shard, they all share this throughput. The sum of the throughput they receive from the shard doesn't exceed 2 MiB/sec.

If you write less than 1 MiB/sec of data, you should be able to support two "applications" with a single shard.

In general, if you have Y shards and X applications, it should work properly as long as your total write throughput is less than 2 MiB/sec * Y / X and the data is spread evenly across shards.

If you require each "application" to have 2 MiB/sec to itself, you can enable "Consumers with Enhanced Fan-Out", which fans out the stream and gives each application a dedicated 2 MiB/sec per shard (instead of sharing the throughput).
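The rule of thumb above can be written out as a small calculation. This is only a sketch of the arithmetic, assuming the shared 2 MiB/sec per-shard read limit quoted earlier and evenly distributed data; the function name is made up.

```python
READ_LIMIT_MIB_PER_SEC = 2.0  # shared read limit per shard, from the quoted docs


def max_write_rate(shards: int, applications: int) -> float:
    """Total write rate (MiB/sec) below which every application can read all data.

    Each application must read the full stream, so the shared read budget of
    2 MiB/sec * shards is divided by the number of applications.
    """
    if shards < 1 or applications < 1:
        raise ValueError("shards and applications must both be at least 1")
    return READ_LIMIT_MIB_PER_SEC * shards / applications


one_shard_two_apps = max_write_rate(shards=1, applications=2)
four_shards_two_apps = max_write_rate(shards=4, applications=2)
print(one_shard_two_apps, four_shards_two_apps)
```

For one shard and two applications this gives 1 MiB/sec, which is the figure used in the single-shard example above.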

This is described in the following link: https://docs.aws.amazon.com/streams/latest/dev/introduction-to-enhanced-consumers.html

In Amazon Kinesis Data Streams, you can build consumers that use a feature called enhanced fan-out. This feature enables consumers to receive records from a stream with throughput of up to 2 MiB of data per second per shard. This throughput is dedicated, which means that consumers that use enhanced fan-out don't have to contend with other consumers that are receiving data from the stream.
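A minimal sketch of wiring one Lambda function to an enhanced fan-out consumer with boto3, assuming the clients are created by the caller. The consumer naming scheme and the `LATEST` starting position are arbitrary choices for illustration. A newly registered consumer may take a moment to become active, so a real script would likely need to wait before creating the mapping.

```python
from typing import Any


def attach_with_fan_out(kinesis_client: Any, lambda_client: Any,
                        stream_arn: str, function_name: str) -> str:
    """Register a dedicated stream consumer and map the function to it."""
    response = kinesis_client.register_stream_consumer(
        StreamARN=stream_arn,
        ConsumerName=f"{function_name}-consumer",  # hypothetical naming scheme
    )
    consumer_arn = response["Consumer"]["ConsumerARN"]

    # For enhanced fan-out, the mapping points at the consumer ARN
    # rather than at the stream ARN.
    lambda_client.create_event_source_mapping(
        EventSourceArn=consumer_arn,
        FunctionName=function_name,
        StartingPosition="LATEST",
    )
    return consumer_arn


# Example usage (needs boto3 and AWS credentials; the ARN is hypothetical):
#   import boto3
#   attach_with_fan_out(
#       boto3.client("kinesis"),
#       boto3.client("lambda"),
#       "arn:aws:kinesis:us-east-1:123456789012:stream/example-stream",
#       "consumer-a",
#   )
```

Calling this once per function would give each function its own registered consumer, and therefore its own dedicated read throughput per shard.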

Tomer
1

Yes, no problem with this!

The number of shards doesn't limit the number of consumers a stream can have. In your case, it will just limit the number of concurrent invocations of each Lambda. This means that each consumer can have at most as many concurrent executions as there are shards.

See this doc for more details.

DaMaill
  • 2
    It still doesn't make sense to me. If I have just one shard and multiple functions consuming from the same shard, I cannot have them running concurrently, right? So when a new record arrives on the stream, the Lambdas will be called one after another. – p.magalhaes Apr 14 '17 at 14:10
  • 1
    @p.magalhaes yes, you are right. Basically, if you have only one shard, you will not have Lambda functions working in parallel for the same consumer. You have to consume the shard fast enough; you cannot have multiple threads consuming the same shard for the same consumer. See this as an example: http://stackoverflow.com/a/34509567/2836435 . The same applies to AWS Lambda with Kinesis. – SerhatCan Apr 16 '17 at 13:48
  • 3
    I think that this is wrong. With 1 shard and 2 different consumers, both consumers are invoked and run at the same time when a new record arrives. To be sure, I tried it with 2 functions that each run for 100s when a message arrives, and the 2 functions ran simultaneously. – DaMaill Apr 19 '17 at 14:09