
I created a folder in S3 named "test" and uploaded "test_1.jpg" and "test_2.jpg" into it.

How can I use boto to delete the folder "test"?

wade huang

8 Answers


Here is the 2018 (almost 2019) version:

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('mybucket')
bucket.objects.filter(Prefix="myprefix/").delete()
Raz
  • Someone might find it useful to know that bucket.objects.all().delete() empties the entire bucket without deleting it, no matter how many objects there are (i.e. it's not affected by the 1000-item limit). See: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Bucket.objects – fabiog1901 Dec 09 '19 at 15:48
  • Hi Raz, this isn't working for me; I simply get empty square brackets, i.e. [] – Whitey Winn Dec 11 '19 at 18:23
  • Sadly, this doesn't support Suffix :( – Anum Sheraz May 06 '20 at 16:50
  • The great thing is that this solution works even with more than 1000 objects – Mabyn Sep 15 '20 at 16:36
  • In case you need to specify an access key id and secret, the first line is: s3 = boto3.resource('s3', aws_access_key_id=xxx, aws_secret_access_key=yyy) – Douglas Daly Sep 28 '20 at 23:39
  • @Raz If possible, could you update the answer to reflect/print what is being deleted, and to exit the program if the token expires? Right now it works well, but there's no way of knowing what is being deleted or what is happening, as I only get a blinking cursor – DJ_Stuffy_K Dec 19 '20 at 13:59
  • @Soyf - i'm getting the same, and nothing is deleted. what am i doing wrong? – doyouevendata Aug 16 '22 at 08:06
  • Can we use wildcard characters in the filter expression, e.g. ```bucket.objects.filter(Prefix="2021*/").delete()```, or will we have to use AWS Wrangler for it? – Codistan Jun 12 '23 at 15:05

There are no folders in S3. Instead, the keys form a flat namespace. However, a key with slashes in its name is displayed specially in some programs, including the AWS console (see for example Amazon S3 boto - how to create a folder?).
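
To see why "folders" are an illusion, here is a plain-Python sketch (no AWS calls, hypothetical key names) of how a console derives folder names by grouping flat keys on the '/' delimiter:

```python
# S3 stores only flat keys; "folders" emerge from grouping
# key names on the '/' delimiter (hypothetical key names).
keys = ["test/test_1.jpg", "test/test_2.jpg", "readme.txt"]

# Roughly how a console decides what to show as top-level "folders".
folders = {k.split("/", 1)[0] for k in keys if "/" in k}
print(folders)  # {'test'}
```

Deleting every key starting with "test/" therefore makes the "folder" disappear, because nothing else ever existed.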

Instead of deleting "a directory", you can (and have to) list files by prefix and delete them. In essence:

for key in bucket.list(prefix='your/directory/'):
    key.delete()

However, the other answers on this page feature more efficient approaches.


Notice that the prefix is matched as a plain string. If the prefix were your/directory, that is, without the trailing slash appended, the program would also happily delete your/directory-that-you-wanted-to-remove-is-definitely-not-this-one.

For more information, see S3 boto list keys sometimes returns directory key.
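
This pitfall can be demonstrated without touching AWS, since prefix matching is just string matching (hypothetical key names):

```python
# Hypothetical keys: one "directory" you want gone, one you do not.
keys = [
    "your/directory/file1.jpg",
    "your/directory-that-you-want-to-keep/file2.jpg",
]

# Without the trailing slash, BOTH keys match the prefix...
matched_loose = [k for k in keys if k.startswith("your/directory")]
print(matched_loose)   # both keys

# ...with the trailing slash, only the intended "directory" matches.
matched_strict = [k for k in keys if k.startswith("your/directory/")]
print(matched_strict)  # only your/directory/file1.jpg
```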

thumbtackthief
  • How do I delete the directory itself? Will this directory be deleted automatically when all the files in it are deleted? – wade huang Jul 11 '12 at 07:40
  • @wadehuang - could you share your code about deleting folders? – letsc Mar 17 '15 at 21:33
  • How do I delete files in an S3 folder that are 2 days old, in Python? I have this in my S3 - bucket/1/backups/(10 files); I need to remove all files that are two days old – 艾瑪艾瑪艾瑪 Jul 29 '19 at 07:45

It's been a while and boto3 has a few different ways of accomplishing this goal. This assumes you want to delete the test "folder" and all of its objects. Here is one way:

import boto3

s3 = boto3.resource('s3')
objects_to_delete = s3.meta.client.list_objects(Bucket="MyBucket", Prefix="myfolder/test/")

delete_keys = {'Objects': [{'Key': obj['Key']} for obj in objects_to_delete.get('Contents', [])]}

s3.meta.client.delete_objects(Bucket="MyBucket", Delete=delete_keys)

This should make two requests, one to fetch the objects in the folder, the second to delete all objects in said folder.

https://boto3.readthedocs.org/en/latest/reference/services/s3.html#S3.Client.delete_objects
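
Note that delete_objects accepts at most 1000 keys per request, so with larger folders the key list has to be split into batches first. A minimal, AWS-free sketch of that batching (chunks is a hypothetical helper; the key names are made up):

```python
def chunks(seq, size=1000):
    # delete_objects rejects requests with more than 1000 keys,
    # so split the key list into batches of at most `size`.
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

keys = [{'Key': f'myfolder/test/{n}.jpg'} for n in range(2500)]
batch_sizes = [len(batch) for batch in chunks(keys)]
print(batch_sizes)  # [1000, 1000, 500]
# Each batch would then be passed as Delete={'Objects': batch}.
```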

Patrick
  • This is the fastest solution, but keep in mind that `list_objects` can't return more than 1000 keys, so you need to run this code multiple times. – lampslave Apr 19 '16 at 11:04
  • You can use a paginator if you have more than 1k objects - see my answer below. – dmitrybelyakov Apr 16 '17 at 11:33
  • @deepelement, and it only works in `boto3`, not boto – avocado Sep 11 '17 at 02:50
  • This works great, and you can run it from a Python lambda by putting the code above in a lambda_handler function: `import boto3; def lambda_handler(event, context): '''Code from above'''`. Make sure you give your Lambda permission to delete from S3 and extend the timeout. – Nadir Sidi Jul 27 '18 at 19:41

A slight improvement on Patrick's solution. As you might know, both list_objects() and delete_objects() have an object limit of 1000, which is why you have to paginate the listing and delete in chunks. This is pretty universal, and you can pass Prefix to paginator.paginate() to delete subdirectories/paths:

import boto3

client = boto3.client('s3', **credentials)
paginator = client.get_paginator('list_objects_v2')
pages = paginator.paginate(Bucket=bucket_name)

delete_us = dict(Objects=[])
for item in pages.search('Contents'):
    if item is None:  # the prefix matched no objects
        continue
    delete_us['Objects'].append(dict(Key=item['Key']))

    # flush once the AWS limit of 1000 keys per request is reached
    if len(delete_us['Objects']) >= 1000:
        client.delete_objects(Bucket=bucket_name, Delete=delete_us)
        delete_us = dict(Objects=[])

# flush the rest
if delete_us['Objects']:
    client.delete_objects(Bucket=bucket_name, Delete=delete_us)
dmitrybelyakov
    And if you want to limit to a "directory" use the `Prefix` keyword in `paginator.paginate()` See all options: http://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Paginator.ListObjectsV2.paginate – Chad Jun 05 '18 at 18:55
  • with the `Prefix` filter suggested by **@Chad**, I had to add an `if item is not None` check before deletion (since some of my S3 prefixes did not exist / had no objects) – y2k-shubham Sep 24 '19 at 06:38
  • @dmitrybelyakov, when I run the above code I am getting TypeError: 'NoneType' object is not subscriptable on the following line: delete_us['Objects'].append(dict(Key=item['Key'])). Would you know any reason why it would do that? – Aaron Jun 16 '21 at 17:03
  • @Aaron perhaps some things have changed, but try suggestion above by y2k-shubham – dmitrybelyakov Jun 16 '21 at 18:27
  • Works like a charm! P.S. Don't forget to add Prefix as mentioned above. – Rajnish kumar Dec 07 '22 at 10:15

You can use bucket.delete_keys() with a list of keys (with a large number of keys I found this to be an order of magnitude faster than using key.delete).

Something like this:

delete_key_list = []
for key in bucket.list(prefix='your/directory/'):
    delete_key_list.append(key)
    if len(delete_key_list) > 100:
        bucket.delete_keys(delete_key_list)
        delete_key_list = []

if len(delete_key_list) > 0:
    bucket.delete_keys(delete_key_list)
David Fooks

If versioning is enabled on the S3 bucket:

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('mybucket')
bucket.object_versions.filter(Prefix="myprefix/").delete()
Dan-Dev
  • Is there a way to print some output of what is being deleted? I want to delete the versions first and then the current one, e.g. bucket.objects.filter(Prefix="myprefix/").delete(); right now I only see a blinking cursor and I don't know what is happening. – DJ_Stuffy_K Dec 09 '20 at 22:50
  • You would have to do something like `files_to_delete = bucket.object_versions.filter(Prefix="myprefix/")`, then iterate over `files_to_delete` and call print() then delete() on them. – Dan-Dev Dec 10 '20 at 09:51

If one needs to filter objects by some custom criterion, as I did, the following is a blueprint for your logic:

import boto3


def get_s3_objects_batches(s3, **base_kwargs):
    kwargs = dict(MaxKeys=1000, **base_kwargs)
    while True:
        response = s3.list_objects_v2(**kwargs)
        # to yield each and every file: yield from response.get('Contents', [])
        yield response.get('Contents', [])
        if not response.get('IsTruncated'):  # at the end of the list?
            break
        kwargs['ContinuationToken'] = response.get('NextContinuationToken')


def your_filter(obj):
    raise NotImplementedError()


session = boto3.session.Session(profile_name=profile_name)
s3client = session.client('s3')
for batch in get_s3_objects_batches(s3client, Bucket=bucket_name, Prefix=prefix):
    to_delete = [{'Key': obj['Key']} for obj in batch if your_filter(obj)]
    if to_delete:
        s3client.delete_objects(Bucket=bucket_name, Delete={'Objects': to_delete})
Boris

Deleting files inside a folder in S3 using boto3:

import logging

import boto3

logger = logging.getLogger(__name__)


def delete_from_minio():
    """
    Delete files or folders nested inside another folder.
    """
    try:
        logger.info("Deleting from minio")
        aws_access_key_id = 'Your_aws_access_key'
        aws_secret_access_key = 'Your_aws_secret_key'
        host = 'your_aws_endpoint'
        s3 = boto3.resource('s3', aws_access_key_id=aws_access_key_id,
                            aws_secret_access_key=aws_secret_access_key,
                            config=boto3.session.Config(signature_version='your_version'),
                            region_name="your_region",
                            endpoint_url=host,
                            verify=False)
        bucket = s3.Bucket('Your_bucket_name')
        for obj in bucket.objects.filter(Prefix='Directory/Sub_Directory'):
            s3.Object(bucket.name, obj.key).delete()
    except Exception as e:
        print(f"Error occurred while deleting from S3: {str(e)}")

Hope this Helps :)