I would like to read data from a DynamoDB stream in Python. The alternatives I have found so far are:

  1. Use the low-level DynamoDB Streams API functions (as described here): This solution seems almost impossible to maintain in a production environment, because the application has to track the state of each shard, among other things.

  2. Use the KCL library designed for reading Kinesis streams: The Python version of the library seems unable to read from a DynamoDB stream.
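To make the first option concrete, here is a minimal sketch of a single pass over a stream using the low-level calls `describe_stream`, `get_shard_iterator`, and `get_records` from boto3's `dynamodbstreams` client. The function name `read_stream_once` and the stop-on-first-empty-page rule are assumptions made for illustration. It deliberately leaves out checkpointing, parent/child shard ordering, and pagination of `describe_stream`, which are the parts that make this approach hard to run in production.

```python
def read_stream_once(client, stream_arn):
    """Collect the records currently readable from each shard of a stream.

    `client` is expected to behave like boto3.client("dynamodbstreams").
    Illustrative only: no checkpoints, no shard lineage handling, and only
    the first page of shards returned by describe_stream is visited.
    """
    records = []
    description = client.describe_stream(StreamArn=stream_arn)["StreamDescription"]
    for shard in description["Shards"]:
        iterator = client.get_shard_iterator(
            StreamArn=stream_arn,
            ShardId=shard["ShardId"],
            ShardIteratorType="TRIM_HORIZON",  # start at the oldest available record
        )["ShardIterator"]
        while iterator:
            page = client.get_records(ShardIterator=iterator)
            if not page["Records"]:
                # Assumption for this sketch: stop at the first empty page.
                # A long-running reader would keep polling open shards instead.
                break
            records.extend(page["Records"])
            iterator = page.get("NextShardIterator")
    return records
```

With real credentials this would be called as `read_stream_once(boto3.client("dynamodbstreams"), stream_arn)`, where `stream_arn` is the ARN of the table's stream.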

What are the options for reliably processing DynamoDB streams in Python? (Links to examples would be super helpful.)

PS: I have considered using a Lambda function to process the DynamoDB stream, but for this task I would like to read the stream in an application, because it has to interact with other components, which cannot be done via a Lambda function.

Shay Ashkenazi
Ashish
  • Curious as to which solution you went with? I'm currently facing the same dilemma and am considering implementing my logic in Java using KCL. – Matt Fortier Jun 20 '16 at 08:29
  • Facing this exact issue as well. Any updates @Ashish? – Avihoo Mamka May 04 '17 at 12:19
  • Any updates on this? I am considering going the lambda trigger route and having it call a flask server that will handle the stream data. Any thoughts on that? – Peter Tao Nov 28 '18 at 22:54
  • @PeterTao It's a good idea to use Lambda with DynamoDB streams, but I would recommend that you just read the data and send it on via SQS. – Rafael Marques Feb 07 '20 at 13:01

1 Answer

I would still suggest using Lambda. The setup is easy and robust (retries, batching, and downtime are all easy to manage).

Then, from your Lambda invocation, you can send the data to your existing program in whatever way is convenient (including, but not limited to: SNS, SQS, a webhook on a custom server, or a custom pub/sub service you own).
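As a rough illustration of the SQS variant, a handler along these lines could forward each stream record to a queue that your existing application polls. This is a sketch under assumptions: the `QUEUE_URL` environment variable, the message layout, and the optional `send` parameter (added only so the function can be exercised without AWS) are all made up for this example. `NewImage` is only present when the stream's view type includes new images.

```python
import json
import os


def build_messages(event):
    """Turn a DynamoDB Streams Lambda event into one JSON string per record."""
    messages = []
    for record in event.get("Records", []):
        change = record.get("dynamodb", {})
        body = {
            "event_id": record.get("eventID"),
            "event_name": record.get("eventName"),  # INSERT, MODIFY or REMOVE
            "keys": change.get("Keys"),
            "new_image": change.get("NewImage"),  # None if the view type omits it
        }
        messages.append(json.dumps(body))
    return messages


def handler(event, context, send=None):
    """Lambda entry point: forward every stream record to SQS."""
    if send is None:
        # Imported lazily so the module can be loaded without boto3 in tests.
        import boto3

        sqs = boto3.client("sqs")
        queue_url = os.environ["QUEUE_URL"]  # hypothetical configuration name

        def send(body):
            sqs.send_message(QueueUrl=queue_url, MessageBody=body)

    messages = build_messages(event)
    for message in messages:
        send(message)
    return {"forwarded": len(messages)}
```

Your application then reads from the queue at its own pace, while Lambda takes care of shard tracking and retries on the stream side.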

aherve