
I am looking for a way to delete an S3 "folder" using the AWS SDK for Java version 2. So far I have only managed to find examples for version 1 of the SDK.

I know that S3 is an object store and that folders do not really exist in it. What I mean here is:

  • List the S3 objects of a given bucket with a given prefix
  • Delete the returned objects using a DeleteObjectsRequest, which can delete up to 1000 objects in a single HTTP call to the AWS API
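Because a "folder" is only a key prefix, the choice of prefix string decides what gets listed and deleted. The sketch below is an in-memory stand-in, not an AWS SDK call: `PrefixMatchSketch`, `keysWithPrefix` and the sample keys are hypothetical names used only to illustrate plain string-prefix matching, which is how S3 prefix filtering behaves.

```java
import java.util.List;
import java.util.stream.Collectors;

class PrefixMatchSketch {
    // Hypothetical stand-in for a prefix listing: keeps the keys that start with the prefix.
    static List<String> keysWithPrefix(List<String> allKeys, String prefix) {
        return allKeys.stream()
                .filter(key -> key.startsWith(prefix))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> keys = List.of(
                "photos/2020/a.jpg",
                "photos/2021/b.jpg",
                "photos-old/c.jpg",
                "docs/readme.txt");
        // With a trailing slash, only keys under the "photos/" folder match.
        System.out.println(keysWithPrefix(keys, "photos/"));
        // Without it, "photos-old/c.jpg" matches as well.
        System.out.println(keysWithPrefix(keys, "photos"));
    }
}
```

In practice this suggests passing the prefix with its trailing delimiter (for example "photos/") when the intent is to delete a single folder.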

Whenever I search for examples, I keep ending up on this page: https://docs.aws.amazon.com/AmazonS3/latest/dev/DeletingMultipleObjectsUsingJava.html which appears to use version 1 of the AWS SDK for Java. On my side, I imported AWS SDK 2 and I cannot instantiate DeleteObjectsRequest directly as shown in that example. I have to use builders, and I cannot find equivalent methods to specify the list of keys to delete.

Comencau

3 Answers


I managed to make it work with the code below.

However, I find this approach quite cumbersome, and I would still like to check with the community whether it is the correct one. In particular, I find it cumbersome to have to convert a collection of S3Object into a collection of ObjectIdentifier, and to chain so many builders. Why doesn't the DeleteObjectsRequest builder simply accept a collection of strings containing the keys of the objects to delete?

// Assumes s3Client is an S3Client field initialized elsewhere.
public static void deleteS3Objects(String bucket, String prefix) {
    ListObjectsV2Request request = ListObjectsV2Request.builder().bucket(bucket).prefix(prefix).build();
    // The paginator yields one response per page of up to 1000 keys.
    ListObjectsV2Iterable list = s3Client.listObjectsV2Paginator(request);
    for (ListObjectsV2Response response : list) {
        List<ObjectIdentifier> objectIdentifiers = response.contents().stream()
                .map(o -> ObjectIdentifier.builder().key(o.key()).build())
                .collect(Collectors.toList());
        DeleteObjectsRequest deleteObjectsRequest = DeleteObjectsRequest.builder()
                .bucket(bucket)
                .delete(Delete.builder().objects(objectIdentifiers).build())
                .build();
        // One DeleteObjects call per page.
        s3Client.deleteObjects(deleteObjectsRequest);
    }
}
Comencau
  • This does not handle cases where the folder already exists, please see my version below – Bob Jan 11 '21 at 20:39
  • @Jasper Citi : The code above will send 1 request for each batch of 1000 objects. To delete, let's say, 1200 objects, it will send 2 requests. Each DeleteObjectsRequest can delete up to 1000 objects in 1 request. – Comencau Aug 01 '22 at 18:28
  • @Comencau Sorry, I realize you are right. I found this doc that confirms it: https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html . – Jasper Citi Aug 03 '22 at 04:24
  • @Jasper Citi : Thank you. I was going to reply but I got lazy. What may also be confusing is that the loop over ListObjectsV2Response is in fact iterating over batches of up to 1000 items. Each batch is then added to a single DeleteObjectsRequest, which deletes those objects in one API call. By the way, Bob's solution below will fail if there are more than 1000 objects, as it puts everything (potentially more than 1000 objects) into one single DeleteObjectsRequest. – Comencau Aug 03 '22 at 17:11
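The request arithmetic discussed in these comments can be sketched without the SDK. This is only an illustration under stated assumptions: `KeyBatcher` and `toBatches` are hypothetical names, and the keys are plain strings rather than ObjectIdentifier instances. It splits a key list into chunks of at most 1000, one chunk per delete request, and produces no chunk at all for an empty list.

```java
import java.util.ArrayList;
import java.util.List;

class KeyBatcher {
    // Limit taken from the DeleteObjects documentation linked above: 1000 keys per request.
    static final int MAX_KEYS_PER_REQUEST = 1000;

    // Splits keys into consecutive batches of at most batchSize elements.
    static List<List<String>> toBatches(List<String> keys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < keys.size(); start += batchSize) {
            int end = Math.min(start + batchSize, keys.size());
            batches.add(new ArrayList<>(keys.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 1200; i++) {
            keys.add("folder/object-" + i);
        }
        List<List<String>> batches = toBatches(keys, MAX_KEYS_PER_REQUEST);
        System.out.println(batches.size());        // prints 2: two delete requests needed
        System.out.println(batches.get(1).size()); // prints 200: size of the second batch
        // An empty key list yields no batch, so no delete request would be built.
        System.out.println(toBatches(List.of(), MAX_KEYS_PER_REQUEST).size()); // prints 0
    }
}
```

Each inner list would then be mapped to ObjectIdentifier instances and sent in its own DeleteObjectsRequest.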

This is an improvement on @Comencau's helpful answer: it handles the case where no objects are found, which otherwise fails with the error: MalformedXML: The XML you provided was not well-formed or did not validate against our published schema

public static void deleteS3Data(String bucket, String prefix) {
    S3Client s3Client = S3Client.builder().region(region).build();
    ListObjectsV2Request request = ListObjectsV2Request.builder().bucket(bucket).prefix(prefix).build();
    ListObjectsV2Iterable list = s3Client.listObjectsV2Paginator(request);

    List<ObjectIdentifier> objectIdentifiers = list
            .stream()
            .flatMap(r -> r.contents().stream())
            .map(o -> ObjectIdentifier.builder().key(o.key()).build())
            .collect(Collectors.toList());

    if (objectIdentifiers.isEmpty()) return;
    DeleteObjectsRequest deleteObjectsRequest = DeleteObjectsRequest
            .builder()
            .bucket(bucket)
            .delete(Delete.builder().objects(objectIdentifiers).build())
            .build();
    s3Client.deleteObjects(deleteObjectsRequest);
}
Bob
  • Good point about checking for zero objects; however, this solution will only work if you are deleting at most 1000 objects. Unfortunately, AWS has a limit of 1000 objects per request, as documented: https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html – Jasper Citi Aug 03 '22 at 04:26
  • Instead of `flatMap` you need a loop and multiple `DeleteObjectsRequest` like in the original answer. – wilmol May 23 '23 at 12:10

This is a combination of the existing answers; it accounts for both the 'no objects' case and the '1000 key limit'.

  void deleteFolder(String bucket, String prefix) {
    ListObjectsV2Request listRequest =
        ListObjectsV2Request.builder()
            .bucket(bucket)
            .prefix(prefix)
            .build();
    ListObjectsV2Iterable paginatedListResponse = s3Client.listObjectsV2Paginator(listRequest);

    for (ListObjectsV2Response listResponse : paginatedListResponse) {
      List<ObjectIdentifier> objects =
          listResponse.contents().stream()
              .map(s3Object -> ObjectIdentifier.builder().key(s3Object.key()).build())
              .toList();
      if (objects.isEmpty()) {
        break;
      }
      DeleteObjectsRequest deleteRequest =
          DeleteObjectsRequest.builder()
              .bucket(bucket)
              .delete(Delete.builder().objects(objects).build())
              .build();
      s3Client.deleteObjects(deleteRequest);
    }
  }
wilmol