I have a use case where I need to add a TTL column to an existing table. The table currently holds more than 2 billion records.
Is there an existing solution built for this, or is EMR the path forward?
DynamoDB does not support update operations that span primary key boundaries: every write targets a single item identified by its primary key. For reading data, the only operation that spans partition boundaries is a Scan.
So, unfortunately, the only way to add an attribute (DynamoDB is a document database, so there is no such concept as columns) to all items in a table is to actually execute a write (a PutItem or UpdateItem) for each item.
If your table has about 2 billion items in it, that will be 2 billion writes.
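As a rough illustration of what "one write per item" looks like in practice, here is a minimal sketch using boto3. It scans only the key attributes and issues an UpdateItem per item rather than a full Put, so the rest of each item does not have to be read and rewritten. The attribute name `ttl`, the 90-day retention, and the `backfill` helper with its `table_name` and `key_names` arguments are assumptions made for the example, not anything taken from your table.

```python
import time

TTL_ATTRIBUTE = "ttl"                # assumed attribute name
RETENTION_SECONDS = 90 * 24 * 3600   # assumed retention period (90 days)


def ttl_epoch(now_seconds, retention_seconds=RETENTION_SECONDS):
    """Expiry time as integer epoch seconds, the format DynamoDB TTL expects."""
    return int(now_seconds) + retention_seconds


def build_update(key, expires_at, ttl_attribute=TTL_ATTRIBUTE):
    """Build UpdateItem arguments that set the TTL attribute only if it is absent."""
    return {
        "Key": key,
        "UpdateExpression": "SET #ttl = :ttl",
        "ConditionExpression": "attribute_not_exists(#ttl)",
        # A placeholder is used because TTL is a DynamoDB reserved word.
        "ExpressionAttributeNames": {"#ttl": ttl_attribute},
        "ExpressionAttributeValues": {":ttl": expires_at},
    }


def backfill(table_name, key_names, segment=0, total_segments=1):
    """Scan one segment of the table and write the TTL attribute item by item."""
    import boto3  # imported here so the helpers above work without AWS access
    from botocore.exceptions import ClientError

    table = boto3.resource("dynamodb").Table(table_name)
    names = {f"#k{i}": name for i, name in enumerate(key_names)}
    scan_kwargs = {
        "ProjectionExpression": ", ".join(names),  # fetch key attributes only
        "ExpressionAttributeNames": names,
        "Segment": segment,
        "TotalSegments": total_segments,
    }
    while True:
        page = table.scan(**scan_kwargs)
        for item in page["Items"]:
            key = {name: item[name] for name in key_names}
            try:
                table.update_item(**build_update(key, ttl_epoch(time.time())))
            except ClientError as err:
                # Items that already carry a TTL fail the condition; skip them.
                if err.response["Error"]["Code"] != "ConditionalCheckFailedException":
                    raise
        if "LastEvaluatedKey" not in page:
            break
        scan_kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]
```

Running several workers with different `segment` values and the same `total_segments` uses Scan's parallel scan feature, which is probably the only realistic way to get through a table of this size. The conditional write makes the job safe to restart, at the price of still consuming write capacity for items that are skipped.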
Of course, you can use EMR with Hive to connect to the table and execute a SQL-style update to add the TTL attribute, but it will still translate into 2 billion individual PutItem requests, so it will either take a while or be quite costly to run.
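To put "take a while or be quite costly" into numbers, a back-of-the-envelope estimate can help. The sketch below relies on the fact that a standard write consumes one write capacity unit per 1 KB of item size, rounded up. The function name `backfill_estimate`, the 1 KB average item size, and the 10,000 WCU budget are assumptions for illustration; plug in your own figures.

```python
import math


def backfill_estimate(item_count, avg_item_kb, wcu_budget):
    """Return (total write units, hours) for rewriting every item once.

    wcu_budget is the write capacity per second you are willing to spend
    on the backfill (an assumed input, not a DynamoDB setting).
    """
    wcu_per_write = max(1, math.ceil(avg_item_kb))  # 1 WCU per 1 KB, rounded up
    total_wcu = item_count * wcu_per_write
    hours = total_wcu / wcu_budget / 3600
    return total_wcu, hours


total, hours = backfill_estimate(2_000_000_000, avg_item_kb=1, wcu_budget=10_000)
print(total, round(hours, 1))  # 2000000000 55.6
```

So with 1 KB items and 10,000 writes per second sustained, the backfill would run for a bit over two days. Larger items multiply the write units accordingly, and the same total applies whether the writes come from EMR or from your own script.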
If the reason you would like to add the TTL is that you are trying to delete a significant number of the items in the table, perhaps a better approach would be to create a new table, copy the records you need there with the TTL attribute already set, and then delete the old table.
Copying a table like this is not directly supported, but you are in luck: this AWS blog post was recently published and covers the process in depth.