Is it possible to delete items from a DynamoDB table without specifying partition or sort keys? I have numerous entries in a table with different partition and sort keys and I want to delete all the items where a certain attribute does not exist.

AWS CLI or boto3/python solutions are welcome.

rosstripi

2 Answers

To delete a large number of items from a table, you first need to query or scan for them, and then delete them using the BatchWriteItem or DeleteItem operation.

Query combined with BatchWriteItem is better in terms of performance and cost, so if this is a job that runs frequently, it is better to add a global secondary index on the attribute you need to check for deletion. However, you need to call BatchWriteItem iteratively for a large number of items, since Query returns paginated results.

Otherwise, you can scan the table and call DeleteItem for each matching item.
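The scan-then-delete approach could look roughly like the sketch below. It assumes `table` is a boto3 DynamoDB Table resource; the function name `delete_items_missing_attribute` is made up for illustration. It filters with `attribute_not_exists`, follows `LastEvaluatedKey` to page through the scan, and hands the deletes to boto3's `batch_writer`, which groups them into batches.

```python
def delete_items_missing_attribute(table, attribute_name):
    """Scan `table` for items lacking `attribute_name` and delete them.

    `table` is assumed to be a boto3 Table resource, for example
    boto3.resource("dynamodb").Table("my-table") (hypothetical name).
    Returns the number of delete requests issued.
    """
    # Names of the partition key and, if the table has one, the sort key.
    key_names = [k["AttributeName"] for k in table.key_schema]

    # Placeholders avoid clashes with DynamoDB reserved words.
    names = {f"#k{i}": name for i, name in enumerate(key_names)}
    names["#attr"] = attribute_name

    scan_kwargs = {
        "FilterExpression": "attribute_not_exists(#attr)",
        # Only the key attributes are needed to issue a delete.
        "ProjectionExpression": ", ".join(f"#k{i}" for i in range(len(key_names))),
        "ExpressionAttributeNames": names,
    }

    deleted = 0
    with table.batch_writer() as batch:
        while True:
            page = table.scan(**scan_kwargs)
            for item in page["Items"]:
                batch.delete_item(Key={name: item[name] for name in key_names})
                deleted += 1
            last_key = page.get("LastEvaluatedKey")
            if last_key is None:
                break  # no more pages
            scan_kwargs["ExclusiveStartKey"] = last_key
    return deleted
```

Note that the filter is applied after the items are read, so the scan still reads the whole table.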

Check this Stack Overflow question for more insight.

Ashan

It is worth trying the EMR Hive integration with DynamoDB. It allows you to write SQL queries against a DynamoDB table. Hive supports the DELETE statement, and Amazon has implemented a DynamoDB connector. I am not sure whether this would integrate perfectly, but it is worth a try. Here is how to work with DynamoDB using EMR Hive.

Another option is to use a parallel scan. Get all the items from DynamoDB that match a filter expression, and delete each one of them. Here is how to do scans using the boto client.
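A parallel scan might be sketched as follows. This assumes `client` is a low-level boto3 DynamoDB client (clients are generally safe to share across threads, unlike resources); `scan_segment` and `parallel_scan` are illustrative names, not part of boto3. Each worker scans one segment using the `Segment` and `TotalSegments` parameters of `Scan`.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_segment(client, table_name, segment, total_segments, attribute_name):
    """Return every item in one scan segment that lacks `attribute_name`."""
    kwargs = {
        "TableName": table_name,
        "Segment": segment,
        "TotalSegments": total_segments,
        "FilterExpression": "attribute_not_exists(#attr)",
        "ExpressionAttributeNames": {"#attr": attribute_name},
    }
    items = []
    while True:
        page = client.scan(**kwargs)
        items.extend(page["Items"])
        if "LastEvaluatedKey" not in page:
            return items  # this segment is exhausted
        kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]


def parallel_scan(client, table_name, attribute_name, total_segments=4):
    """Scan all segments concurrently and return the combined matches."""
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        futures = [
            pool.submit(scan_segment, client, table_name, seg,
                        total_segments, attribute_name)
            for seg in range(total_segments)
        ]
        # Results are collected in segment order.
        return [item for future in futures for item in future.result()]
```

The items come back in DynamoDB's typed format (for example `{"pk": {"S": "a"}}`), so their key attributes can be passed to the low-level delete calls unchanged.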

To speed up the process, you can batch-delete items using the BatchWriteItem method. Here is how to do this in boto.

Notice that BatchWriteItem has the following limitations:

BatchWriteItem can write up to 16 MB of data, which can comprise as many as 25 put or delete requests.
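Working within the 25-request limit could look like this sketch. It assumes `client` is a low-level boto3 DynamoDB client and `keys` is a list of primary keys in DynamoDB's typed format; `batch_delete`, `chunked`, and the retry settings are illustrative choices. Requests the service reports back in `UnprocessedItems` are resent with a short backoff.

```python
import time

MAX_BATCH = 25  # BatchWriteItem accepts at most 25 put/delete requests per call


def chunked(seq, size):
    """Yield consecutive slices of `seq` with at most `size` elements."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


def batch_delete(client, table_name, keys, max_retries=5):
    """Delete `keys` from `table_name` in batches of up to 25 requests."""
    for chunk in chunked(keys, MAX_BATCH):
        pending = {table_name: [{"DeleteRequest": {"Key": key}} for key in chunk]}
        for attempt in range(max_retries + 1):
            response = client.batch_write_item(RequestItems=pending)
            # Throttled requests come back here and must be resent.
            pending = response.get("UnprocessedItems") or {}
            if not pending:
                break
            time.sleep(min(0.05 * 2 ** attempt, 1.0))  # simple exponential backoff
        else:
            raise RuntimeError("some deletes were still unprocessed after retries")
```

With the higher-level Table resource, `table.batch_writer()` does this chunking and resending for you.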

Keep in mind that scans are expensive: a scan consumes RCUs for every item DynamoDB reads in your table, not just for the items it returns. So you either need to read the data slowly or provision very high RCUs for the table.
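One way to "read slowly" is to pace the scan by the capacity each page reports. The sketch below assumes a low-level boto3 DynamoDB client; `throttled_scan`, `max_rcu_per_second`, and `page_limit` are hypothetical names. It asks for `ReturnConsumedCapacity="TOTAL"` and sleeps between pages so the average consumption stays near the target.

```python
import time


def throttled_scan(client, table_name, max_rcu_per_second,
                   page_limit=100, sleep=time.sleep):
    """Yield scanned items while pacing reads to roughly the target RCU rate."""
    kwargs = {
        "TableName": table_name,
        "Limit": page_limit,  # maximum number of items evaluated per request
        "ReturnConsumedCapacity": "TOTAL",
    }
    while True:
        page = client.scan(**kwargs)
        yield from page["Items"]
        if "LastEvaluatedKey" not in page:
            return  # whole table has been read
        kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]
        used = page.get("ConsumedCapacity", {}).get("CapacityUnits", 0.0)
        if used:
            # Spread the RCUs just consumed over enough time to stay
            # at or below the target rate before requesting the next page.
            sleep(used / max_rcu_per_second)
```

For example, a page that consumed 5 RCUs with a target of 10 RCUs per second leads to a pause of half a second before the next request.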

It's OK to do this operation infrequently, but you can't do it as part of a web-server request if your table is of a decent size.

Ivan Mushketyk