I am trying to delete multiple records from a DynamoDB table using the partition key and sort key. My current approach takes a lot of time to delete the records. It is implemented with the AWS SDK for Java, as shown below.

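    // Types are from com.amazonaws.services.dynamodbv2.document and its spec/utils subpackages (AWS SDK for Java v1).
    // `table` is assumed to be a static Table field already initialized for tableName.
    public static ItemCollection<QueryOutcome> getKeyDataFromTable(String tableName, HashMap<String, String> keyValue) {
        // Query every item that shares the given partition key
        QuerySpec spec = new QuerySpec()
                .withKeyConditionExpression("partitionKey = :v_id")
                .withValueMap(new ValueMap().withString(":v_id", keyValue.get("partitionKey")));
        return table.query(spec);
    }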



    public void deleteItems() {
        String tableName = "tableName";
        String partitionKey = "partitionKey";
        HashMap<String, String> keyValue = new HashMap<>();
        keyValue.put("partitionKey", partitionKey);
        ItemCollection<QueryOutcome> keyDataFromTable = getKeyDataFromTable(tableName, keyValue);
        // Collect the sort key of every item in the partition
        Iterator<Item> iterator = keyDataFromTable.iterator();
        List<String> sortKeyList = new ArrayList<>();
        while (iterator.hasNext()) {
            Item item = iterator.next();
            sortKeyList.add(item.getString("sortKey"));
        }
        // Delete the items one at a time (one DeleteItem request per item)
        for (String sKey : sortKeyList) {
            DeleteItemSpec deleteItemSpec = new DeleteItemSpec()
                    .withPrimaryKey("partitionKey", partitionKey, "sortKey", sKey);
            table.deleteItem(deleteItemSpec);
        }
    }

Is there a better-performing way to do this dynamically?

NoSQLKnowHow
user2401547
  • Use [batchWriteItem](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/batch-operation-document-api-java.html#JavaDocumentAPIBatchWrite) to delete multiple items in a single API call – jarmod May 06 '22 at 23:28
  • @jarmod: I am confused about how to write a batchWriteItem call for this requirement. Could you please help me with an example for the requirement above? – user2401547 May 06 '22 at 23:31
  • I would search for examples (see [here](https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/java_dynamodb_code_examples.html) or at the earlier link) or just try to write the code. – jarmod May 06 '22 at 23:38
  • And please format your code while you're at it – Ermiya Eskandary May 07 '22 at 00:11

1 Answer

Unfortunately, DynamoDB does not have an API for deleting an entire partition (i.e., all the items which share a specific partition key). I say "unfortunately" because similar NoSQL databases such as Cassandra and Scylla do have the ability to delete an entire partition, but DynamoDB never implemented it.

So, as you noted, you have no choice but to use a Query to retrieve the list of sort keys in this partition, and then delete the items one by one.

You can optimize this operation a bit:

  1. You can ask Query to return only the sort key of each item instead of the full item (using a ProjectionExpression).

    This saves network bandwidth and client-side work, but it won't save you any money: the Query's cost depends on the full size of the items, not on the part it returns. This is less of a problem than it appears, because the cost of the deletes will be significantly higher than the cost of the query anyway (a delete counts as a write, which is up to 40 times more expensive than a read).

  2. Instead of sending individual DeleteItem requests sequentially, which is very slow because of the request latency (you always wait for one delete to finish before starting the next), you can send many deletes in parallel. You can do this in the client or, even better, use the server's BatchWriteItem, which can carry up to 25 deletions in a single request. This can really speed up your deletions.
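
To make the batching idea concrete, here is a minimal sketch of the client-side bookkeeping it requires. BatchWriteItem accepts at most 25 put or delete requests per call and may report some of them as unprocessed, so the keys have to be split into chunks and the leftovers sent again. The class `BatchDeleteSketch` and the `BatchSender` interface are hypothetical stand-ins, not SDK types. In real code the sender would probably wrap the Document API's `DynamoDB.batchWriteItem` with a `TableWriteItems` built via `withHashAndRangeKeysToDelete`, and read `getUnprocessedItems()` from the outcome.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: split keys into BatchWriteItem-sized chunks and resend leftovers. */
final class BatchDeleteSketch {
    /** BatchWriteItem accepts at most 25 put/delete requests per call. */
    static final int MAX_BATCH_SIZE = 25;

    /** Hypothetical stand-in for the SDK call; returns the keys that were NOT processed. */
    interface BatchSender {
        List<String> deleteBatch(List<String> sortKeys);
    }

    /** Splits the keys into consecutive chunks of at most MAX_BATCH_SIZE. */
    static List<List<String>> chunk(List<String> keys) {
        List<List<String>> chunks = new ArrayList<>();
        for (int i = 0; i < keys.size(); i += MAX_BATCH_SIZE) {
            int end = Math.min(i + MAX_BATCH_SIZE, keys.size());
            chunks.add(new ArrayList<>(keys.subList(i, end)));
        }
        return chunks;
    }

    /** Sends every chunk, resending unprocessed keys until none remain; returns the number of calls made. */
    static int deleteAll(List<String> sortKeys, BatchSender sender) {
        int calls = 0;
        for (List<String> batch : chunk(sortKeys)) {
            List<String> pending = batch;
            while (!pending.isEmpty()) {
                pending = sender.deleteBatch(pending);
                calls++;
            }
        }
        return calls;
    }
}
```

With 60 sort keys and a sender that never leaves anything unprocessed, `deleteAll` makes 3 calls (25 + 25 + 10 keys) instead of 60 individual DeleteItem requests. A production version would also wait with exponential backoff before resending unprocessed keys and cap the number of retries.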

This question has been asked many times on Stack Overflow, and you can find many code examples. Here is one: How to delete records in Amazon Dynamodb based on a hashkey?
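
If you prefer the client-side parallelism mentioned in point 2, the following is a rough sketch of how it could look with a fixed thread pool. The class `ParallelDeleteSketch` and the `deleteOne` callback are hypothetical names introduced for illustration. The callback is assumed to wrap a single DeleteItem request for one sort key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

/** Sketch: issue independent deletes concurrently instead of one after another. */
final class ParallelDeleteSketch {
    /**
     * Runs deleteOne for every sort key on a fixed pool of `parallelism` threads
     * and blocks until all of them have finished. deleteOne is a hypothetical
     * callback that would wrap a single DeleteItem request.
     */
    static void deleteInParallel(List<String> sortKeys, int parallelism, Consumer<String> deleteOne) {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<CompletableFuture<Void>> futures = new ArrayList<>();
            for (String sortKey : sortKeys) {
                futures.add(CompletableFuture.runAsync(() -> deleteOne.accept(sortKey), pool));
            }
            // join() throws a CompletionException if any delete threw
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        } finally {
            pool.shutdown();
        }
    }
}
```

The pool size bounds how many requests are in flight at once. Too much parallelism can exceed the table's write capacity and cause throttling, so a modest value is a sensible starting point.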

Nadav Har'El