162

I need to write code in Python that will delete the required file from an Amazon S3 bucket. I am able to connect to the bucket and save files, but how can I delete a file?

22 Answers

172

Using boto3 (currently version 1.4.4), use S3.Object.delete():

import boto3

s3 = boto3.resource('s3')
s3.Object('your-bucket', 'your-key').delete()
  • If the object is not present, will it throw an error? – Akash Tantri Feb 08 '19 at 06:07
  • @AkashTantri I haven't personally tried, but the doc says it _removes the null version (if there is one) [...] If there isn't a null version, Amazon S3 does not remove any objects._ So my guess is that it won't throw an error. If you happen to try it, just do something like `s3.Object('existing-bucket', 'bogus-key').delete()` and see what happens. Also try `s3.Object('bogus-bucket', 'bogus-key').delete()`. – Kohányi Róbert Feb 08 '19 at 17:28
  • Works like a charm; that's the real power of Python. – yunus Feb 21 '19 at 15:15
  • Does the `your-key` here mean the actual file name in `your-bucket` on S3? – Underoos Oct 10 '19 at 10:06
  • @SukumarRdjf Yes :) – Kohányi Róbert Oct 11 '19 at 16:05
  • If the object does not exist, no error is thrown, which is a bummer; it would be nice to get a confirmation, or an error message saying 'object doesn't exist'. – FlyingZebra1 Apr 15 '20 at 06:26
  • Just thought I'd pass this on: when I tried this passing your-key as s3://my-bucket-name/folder/subfolder/filename, the delete() call returned success (HTTPStatusCode 204) but didn't actually delete the file. Then I tried just folder/subfolder/filename and it worked - same HTTPStatusCode, but it really deleted the file. – mojoken Apr 21 '20 at 19:25
  • Yes, the status code remains 204 whether the file is present or not, which is not ideal; when the file doesn't exist it should return an error status code so the user can differentiate and handle it properly. – Jaspreet Jolly Jun 08 '20 at 07:49
  • You can check if the file exists before deleting it: `obj_exists = list(s3.Bucket('bucket').objects.filter(Prefix='key'))` and then `if len(obj_exists) > 0 and obj_exists[0].key == 'key': s3.Object('bucket','key').delete()` – dasilvadaniel Jun 12 '20 at 02:13
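As the comments note, delete() returns HTTP 204 whether or not the key exists. A minimal sketch of checking for the key first with head_object (the bucket and key names, and the delete_if_exists helper, are placeholders of mine):

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client('s3')

def delete_if_exists(bucket, key):
    """Delete key from bucket only if it exists; return True if deleted."""
    try:
        # head_object raises ClientError with code '404' when the key is absent
        s3.head_object(Bucket=bucket, Key=key)
    except ClientError as e:
        if e.response['Error']['Code'] == '404':
            return False
        raise
    s3.delete_object(Bucket=bucket, Key=key)
    return True

There is still a race window between the check and the delete, so for routine cleanup the unconditional delete() is usually fine.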
145

Using the Python boto3 SDK (and assuming credentials are set up for AWS), the following will delete a specified object in a bucket:

import boto3

client = boto3.client('s3')
client.delete_object(Bucket='mybucketname', Key='myfile.whatever')
  • @Rob The boto3 documentation is misleading: it will create a delete marker if the object is versioned, and will delete the object otherwise. – jarmod Jun 06 '18 at 14:43
  • Clean and simple. Could be the accepted answer, and should definitely be merged with @Kohányi Róbert's answer, as both are the best approaches for the task. – PaulB Aug 12 '19 at 17:24
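Expanding on jarmod's comment above: on a versioned bucket, a plain delete only adds a delete marker. A sketch of permanently removing one specific version by passing VersionId (the version ID shown is a placeholder):

import boto3

client = boto3.client('s3')

# On a versioned bucket a plain delete inserts a delete marker;
# passing VersionId removes that specific version for good.
client.delete_object(
    Bucket='mybucketname',
    Key='myfile.whatever',
    VersionId='3HL4kqtJlcpXroDTDmJ',  # placeholder version ID
)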
95

Found one more way to do it, using boto:

from boto.s3.connection import S3Connection, Bucket, Key

conn = S3Connection(AWS_ACCESS_KEY, AWS_SECRET_KEY)
b = Bucket(conn, S3_BUCKET_NAME)
k = Key(b)
k.key = 'images/my-images/' + filename
b.delete_key(k)
  • If you wanted to delete EVERYTHING in a bucket, you could do: `for x in b.list(): b.delete_key(x.key)` – jontsai Jul 31 '12 at 17:02
  • I love how in my file it turns out to be `bucket.list()` – Artur Sapek Apr 10 '13 at 18:43
  • For this code snippet to work as presented, you'll need to import `Bucket` and `Key`, too. As in: `from boto.s3.connection import S3Connection, Bucket, Key` – Nick Chammas Apr 26 '14 at 01:46
  • I get `ImportError: No module named boto.s3.connection` from `from boto.s3.connection import S3Connection, Bucket, Key`; please update the answer to boto3. – Harry Moreno Apr 24 '17 at 20:50
  • Figured it out and wrote up a solution: http://harrymoreno.com/2017/04/24/How-to-fill-and-empty-an-s3-bucket-with-python.html – Harry Moreno Apr 25 '17 at 21:16
29

Welcome to 2020; here is the answer, in Python/Django:

from django.conf import settings 
import boto3   
s3 = boto3.client('s3')
s3.delete_object(Bucket=settings.AWS_STORAGE_BUCKET_NAME, Key=f"media/{item.file.name}")

Took me far too long to find the answer and it was as simple as this.

19

Please try this code:

import boto3   
s3 = boto3.client('s3')
s3.delete_object(Bucket="s3bucketname", Key="s3filepath")
9

Try to look for an updated method, since Boto3 might change from time to time. I used my_bucket.delete_objects():

import boto3
from boto3.session import Session

session = Session(aws_access_key_id='your_key_id',
                  aws_secret_access_key='your_secret_key')

# s3_client = session.client('s3')
s3_resource = session.resource('s3')
my_bucket = s3_resource.Bucket("your_bucket_name")

response = my_bucket.delete_objects(
    Delete={
        'Objects': [
            {
                'Key': "your_file_name_key"   # the_name of_your_file
            }
        ]
    }
)
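
The response from delete_objects reports per-key outcomes rather than raising for missing keys; a short sketch of inspecting it, continuing from the code above:

# delete_objects returns 'Deleted' and 'Errors' lists instead of raising
for deleted in response.get('Deleted', []):
    print('deleted:', deleted['Key'])
for error in response.get('Errors', []):
    print('failed:', error['Key'], error['Code'], error['Message'])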

5

I'm surprised there isn't this easy way: key.delete():

from boto.s3.connection import S3Connection, Bucket, Key

conn = S3Connection(AWS_ACCESS_KEY, AWS_SECRET_KEY)
bucket = Bucket(conn, S3_BUCKET_NAME)
k = Key(bucket=bucket, name=path_to_file)
k.delete()
4

Below is a code snippet you can use to delete a file from a bucket:

import boto3

s3 = boto3.resource("s3",
                    aws_access_key_id='Your-Access-Key',
                    aws_secret_access_key='Your-Secret-Key')
s3.Object('Bucket-Name', 'file-name-as-key').delete()
4

Use the S3FileSystem.rm function in s3fs.

You can delete a single file or several at once:

import s3fs
file_system = s3fs.S3FileSystem()

file_system.rm('s3://my-bucket/foo.txt')  # single file

files = ['s3://my-bucket/bar.txt', 's3://my-bucket/baz.txt']
file_system.rm(files)  # several files
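
Whole prefixes can be removed as well; a sketch assuming the recursive flag that s3fs inherits from the fsspec interface (the prefix is a placeholder):

# remove everything under a "directory" (prefix) recursively
file_system.rm('s3://my-bucket/some-prefix/', recursive=True)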
4

If you want to delete all files from an S3 bucket in the simplest way, with a couple of lines of code, use this:

import boto3

s3 = boto3.resource('s3', aws_access_key_id='XXX', aws_secret_access_key= 'XXX')
bucket = s3.Bucket('your_bucket_name')
bucket.objects.delete()
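
If you only want to remove the keys under a given prefix rather than empty the whole bucket, objects.filter supports the same bulk delete; a sketch with a placeholder prefix:

# bulk-delete only the keys under the 'logs/' prefix
bucket.objects.filter(Prefix='logs/').delete()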
3

Via which interface? Using the REST interface, you just send a delete:

DELETE /ObjectName HTTP/1.1
Host: BucketName.s3.amazonaws.com
Date: date
Content-Length: length
Authorization: signatureValue

Via the SOAP interface:

<DeleteObject xmlns="http://doc.s3.amazonaws.com/2006-03-01">
  <Bucket>quotes</Bucket>
  <Key>Nelson</Key>
  <AWSAccessKeyId>1D9FVRAYCP1VJEXAMPLE=</AWSAccessKeyId>
  <Timestamp>2006-03-01T12:00:00.183Z</Timestamp>
  <Signature>Iuyz3d3P0aTou39dzbqaEXAMPLE=</Signature>
</DeleteObject>

If you're using a Python library like boto, it should expose a "delete" feature, like delete_key().

  • Yes, I am using that Python library, but will that delete the file? Should I do it this way: `k.key = 'images/anon-images/small/'+filename; k.delete_key()`? Is this correct? Please let me know. – Suhail Jun 29 '10 at 13:23
  • @Suhail: I haven't used that library, but from the source I linked, what it's actually doing is a `DELETE` call via the REST interface. So yes, despite the name "delete_key" (which I agree is unclear), it's really deleting the object *referenced* by the key. – T.J. Crowder Jun 29 '10 at 13:40
  • What about removing a lot of files with a common prefix in the name? Does S3 allow some bulk delete for such a case, or is deleting them one by one (which is slow) the only way? – Illarion Kovalchuk Jul 05 '10 at 10:11
  • @Shaman: I'm not an S3 expert, but as far as I *know*, you can only delete a specific file. But you probably want to actually ask that as a question so it gets attention from S3 experts. – T.J. Crowder Jul 05 '10 at 12:17
  • Right after commenting here I've added such a question. It has 2 views yet :) – Illarion Kovalchuk Jul 06 '10 at 09:38
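
S3 has since gained a Multi-Object Delete operation, which answers the bulk-delete question in the comments above; with boto3 it looks like this sketch (bucket and key names are placeholders, at most 1,000 keys per request):

import boto3

client = boto3.client('s3')

# one request removes up to 1,000 keys
response = client.delete_objects(
    Bucket='quotes',
    Delete={
        'Objects': [{'Key': 'Nelson'}, {'Key': 'Milhouse'}],
        'Quiet': True,  # report only failures in the response
    },
)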
3

2021 update: I had a hard time with this, but it was as simple as doing the following.

def delete_object(self, request):
    s3 = boto3.resource('s3',
        aws_access_key_id=AWS_UPLOAD_ACCESS_KEY_ID,
        aws_secret_access_key=AWS_UPLOAD_SECRET_KEY,
    )
    s3.Object('your-bucket', 'your-key').delete()

Make sure to add the credentials to your boto3 resource.
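
Hard-coded keys are easy to leak; a sketch of the same delete using a named profile from ~/.aws/credentials instead (the profile name is a placeholder):

import boto3

# credentials are read from the [my-profile] section of ~/.aws/credentials
session = boto3.session.Session(profile_name='my-profile')
s3 = session.resource('s3')
s3.Object('your-bucket', 'your-key').delete()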

2

The simplest way to do this is:

import boto3

s3 = boto3.resource("s3")
bucket_source = {
    'Bucket': "my-bucket",
    'Key': "file_path_in_bucket"
}
s3.meta.client.delete_object(**bucket_source)
2

Here is how I did this:

"""
This is module which contains all classes related to aws S3
"""
"""
    awshelper.py
    -------

    This module contains the AWS class

"""

try:

    import re
    import os
    import json
    import boto3
    import datetime
    import uuid
    import math
    from boto3.s3.transfer import TransferConfig
    import threading
    import sys

    from tqdm import tqdm
except Exception as e:
    print("Error : {} ".format(e))

DATA = {
    "AWS_ACCESS_KEY": "XXXXXXXXXXXX",
    "AWS_SECRET_KEY": "XXXXXXXXXXXXX",
    "AWS_REGION_NAME": "us-east-1",
    "BUCKET": "XXXXXXXXXXXXXXXXXXXX",
}

for key, value in DATA.items():
    os.environ[key] = str(value)

class Size:
    @staticmethod
    def convert_size(size_bytes):

        if size_bytes == 0:
            return "0B"
        size_name = ("B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")
        i = int(math.floor(math.log(size_bytes, 1024)))
        p = math.pow(1024, i)
        s = round(size_bytes / p, 2)
        return "%s %s" % (s, size_name[i])

class ProgressPercentage(object):
    def __init__(self, filename, filesize):
        self._filename = filename
        self._size = filesize
        self._seen_so_far = 0
        self._lock = threading.Lock()

    def __call__(self, bytes_amount):
        def convertSize(size):
            if (size == 0):
                return '0B'
            size_name = ("B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")
            i = int(math.floor(math.log(size,1024)))
            p = math.pow(1024,i)
            s = round(size/p,2)
            return '%.2f %s' % (s,size_name[i])

        # To simplify, assume this is hooked up to a single filename
        with self._lock:
            self._seen_so_far += bytes_amount
            percentage = (self._seen_so_far / self._size) * 100
            sys.stdout.write(
                "\r%s  %s / %s  (%.2f%%)        " % (
                    self._filename, convertSize(self._seen_so_far), convertSize(self._size),
                    percentage))
            sys.stdout.flush()

class ProgressPercentageUpload(object):

    def __init__(self, filename):
        self._filename = filename
        self._size = float(os.path.getsize(filename))
        self._seen_so_far = 0
        self._lock = threading.Lock()

    def __call__(self, bytes_amount):
        # To simplify, assume this is hooked up to a single filename
        with self._lock:
            self._seen_so_far += bytes_amount
            percentage = (self._seen_so_far / self._size) * 100
            sys.stdout.write(
                "\r%s  %s / %s  (%.2f%%)" % (
                    self._filename, self._seen_so_far, self._size,
                    percentage))
            sys.stdout.flush()

class AWSS3(object):

    """Helper class to which add functionality on top of boto3 """

    def __init__(self, bucket, aws_access_key_id, aws_secret_access_key, region_name):

        self.BucketName = bucket
        self.client = boto3.client(
            "s3",
            aws_access_key_id=aws_access_key_id,
            aws_secret_access_key=aws_secret_access_key,
            region_name=region_name,
        )

    def get_length(self, Key):
        response = self.client.head_object(Bucket=self.BucketName, Key=Key)
        size = response["ContentLength"]
        return {"bytes": size, "size": Size.convert_size(size)}

    def put_files(self, Response=None, Key=None):
        """
        Put the File on S3
        :return: Bool
        """
        try:

            response = self.client.put_object(
                ACL="private", Body=Response, Bucket=self.BucketName, Key=Key
            )
            return "ok"
        except Exception as e:
            print("Error : {} ".format(e))
            return "error"

    def item_exists(self, Key):
        """Given key check if the items exists on AWS S3 """
        try:
            response_new = self.client.get_object(Bucket=self.BucketName, Key=str(Key))
            return True
        except Exception as e:
            return False

    def get_item(self, Key):

        """Gets the Bytes Data from AWS S3 """

        try:
            response_new = self.client.get_object(Bucket=self.BucketName, Key=str(Key))
            return response_new["Body"].read()

        except Exception as e:
            print("Error :{}".format(e))
            return False

    def find_one_update(self, data=None, key=None):

        """
        This checks if Key is on S3 if it is return the data from s3
        else store on s3 and return it
        """

        flag = self.item_exists(Key=key)

        if flag:
            data = self.get_item(Key=key)
            return data

        else:
            self.put_files(Key=key, Response=data)
            return data

    def delete_object(self, Key):

        response = self.client.delete_object(Bucket=self.BucketName, Key=Key,)
        return response

    def get_all_keys(self, Prefix="", max_page_number=100):

        """
        :param Prefix: Prefix string
        :return: Keys List
        """
        try:
            paginator = self.client.get_paginator("list_objects_v2")
            pages = paginator.paginate(Bucket=self.BucketName, Prefix=Prefix)

            tmp = []

            for page_no, page in enumerate(pages):
                if page_no > max_page_number:
                    break
                print("page_no : {}".format(page_no))
                for obj in page["Contents"]:
                    tmp.append(obj["Key"])

            return tmp
        except Exception as e:
            return []

    def print_tree(self):
        keys = self.get_all_keys()
        for key in keys:
            print(key)
        return None

    def find_one_similar_key(self, searchTerm=""):
        keys = self.get_all_keys()
        return [key for key in keys if re.search(searchTerm, key)]

    def __repr__(self):
        return "AWS S3 Helper class "

    def download_file_locally(self, key, filename):
        try:
            response = self.client.download_file(
                Bucket=self.BucketName,
                Filename=filename,
                Key=key,
                Callback=ProgressPercentage(filename,
                                            (self.client.head_object(Bucket=self.BucketName,
                                                                     Key=key))["ContentLength"]),
                Config=TransferConfig(
                    max_concurrency=10,
                    use_threads=True,
                )
            )
            return True
        except Exception as e:
            print("Error Download file : {}".format(e))
            return False

    def upload_files_from_local(self, file_name, key):

        try:

            response = self.client.upload_file(
                Filename=file_name,
                Bucket=self.BucketName ,
                Key = key,
                Callback=ProgressPercentageUpload(file_name),
                Config=TransferConfig(
                    max_concurrency=10,
                    use_threads=True,
                ))
            return True
        except Exception as e:
            print("Error upload : {} ".format(e))
            return False


def batch_objects_delete_threaded(batch_size=50, max_page_size=100):
    helper_qa = AWSS3(
        aws_access_key_id=os.getenv("AWS_ACCESS_KEY"),
        aws_secret_access_key=os.getenv("AWS_SECRET_KEY"),
        region_name=os.getenv("AWS_REGION_NAME"),
        bucket=os.getenv("BUCKET"),
    )

    keys = helper_qa.get_all_keys(Prefix="database=XXXXXXXXXXX/", max_page_number=max_page_size)
    MainThreads = [threading.Thread(target=helper_qa.delete_object, args=(key, )) for key in keys]

    print("Length: keys : {} ".format(len(keys)))
    for thread in tqdm(range(0, len(MainThreads), batch_size)):
        for t in MainThreads[thread: thread + batch_size]:
            t.start()
        for t in MainThreads[thread: thread + batch_size]:
            t.join()

# ==========================================
start = datetime.datetime.now()
batch_objects_delete_threaded()
end = datetime.datetime.now()
print("Execution Time : {} ".format(end-start))
# ==========================================




1

For now I have resolved the issue by using the Linux utility s3cmd. I used it like this in Python:

delFile = 's3cmd -c /home/project/.s3cfg del s3://images/anon-images/small/' + filename
os.system(delFile)
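
A slightly safer variant, assuming the same .s3cfg path, passes the arguments as a list to subprocess.run so the filename is never parsed by a shell:

import subprocess

# an argument list avoids shell-quoting problems with odd filenames
subprocess.run(
    ['s3cmd', '-c', '/home/project/.s3cfg', 'del',
     's3://images/anon-images/small/' + filename],
    check=True,
)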
  • It's not exactly pythonic to invoke a subshell to communicate with S3 (a library or a direct HTTP transaction would be more elegant), but it still works. I don't think it deserves a downvote. +1. – Randall Cook Apr 18 '13 at 18:34
1

It worked for me; try it.

import boto
import sys
from boto.s3.key import Key
import boto.s3.connection

AWS_ACCESS_KEY_ID = '<access_key>'
AWS_SECRET_ACCESS_KEY = '<secret_access_key>'
Bucketname = 'bucket_name' 

conn = boto.s3.connect_to_region('us-east-2',
        aws_access_key_id=AWS_ACCESS_KEY_ID,
        aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
        is_secure=True,
        calling_format=boto.s3.connection.OrdinaryCallingFormat(),
        )
bucket = conn.get_bucket(Bucketname)

k = Key(bucket)

k.key = 'filename to delete'
bucket.delete_key(k)   
1

If you are trying to delete a file from your own localhost console, you can try running this Python script, assuming that you have already assigned your access ID and secret key on the system:

import boto3

# my custom session
aws_m = boto3.session.Session(profile_name="your-profile-name-on-local-host")
client = aws_m.client('s3')

# list bucket objects before deleting
response = client.list_objects(
    Bucket='your-bucket-name'
)
for x in response.get("Contents", []):
    print(x.get("Key", None))

# delete bucket objects
response = client.delete_object(
    Bucket='your-bucket-name',
    Key='mydocs.txt'
)

# list bucket objects after deleting
response = client.list_objects(
    Bucket='your-bucket-name'
)
for x in response.get("Contents", []):
    print(x.get("Key", None))
1

cloudpathlib wraps boto3 in a pathlib interface so it becomes as easy to do tasks like this as working with local files.

First, be sure you are authenticated properly, with a ~/.aws/credentials file or environment variables set. See more options in the cloudpathlib docs.

Then, use the unlink method like you would in pathlib. For directories, cloudpathlib also provides an rmtree method.

from cloudpathlib import CloudPath

# create a few files to work with
cl1 = CloudPath("s3://test-bucket/so/test_dir/f1.txt")
cl2 = CloudPath("s3://test-bucket/so/test_dir/f2.txt")
cl3 = CloudPath("s3://test-bucket/so/test_dir/f3.txt")

# write content to these files
cl1.write_text("hello file 1")
cl2.write_text("hello file 2")
cl3.write_text("hello file 3")

# show these file exist on S3
list(CloudPath("s3://test-bucket/so/test_dir/").iterdir())
#> [ S3Path('s3://test-bucket/so/test_dir/f1.txt'),
#>   S3Path('s3://test-bucket/so/test_dir/f2.txt'),
#>   S3Path('s3://test-bucket/so/test_dir/f3.txt')]

# remove a single file with `unlink`
cl1.unlink()

list(CloudPath("s3://test-bucket/so/test_dir/").iterdir())
#> [ S3Path('s3://test-bucket/so/test_dir/f2.txt'),
#>   S3Path('s3://test-bucket/so/test_dir/f3.txt')]

# remove a directory with `rmtree`
CloudPath("s3://test-bucket/so/test_dir/").rmtree()

# no more files
list(CloudPath("s3://test-bucket/so/").iterdir())
#> []
1

Delete files from a folder in S3:

import boto3

client = boto3.client('s3')
response = client.list_objects(
    Bucket='bucket_name',
    Prefix='folder_name/'
)
obj_list = []
for data in response.get('Contents', []):
    print('res', data.get('Key'))
    obj_list.append({'Key': data.get('Key')})
if obj_list:
    response = client.delete_objects(
        Bucket='bucket_name',
        Delete={'Objects': obj_list}
    )
    print('response', response)
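
Note that list_objects returns at most 1,000 keys per call, and delete_objects accepts at most 1,000 keys per request; for larger folders, a paginator sketch with the same placeholder names:

import boto3

client = boto3.client('s3')
paginator = client.get_paginator('list_objects_v2')

# delete page by page to stay within the 1,000-key limits
for page in paginator.paginate(Bucket='bucket_name', Prefix='folder_name/'):
    batch = [{'Key': obj['Key']} for obj in page.get('Contents', [])]
    if batch:
        client.delete_objects(Bucket='bucket_name', Delete={'Objects': batch})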
0

You can do it using the AWS CLI (https://aws.amazon.com/cli/) and some Unix commands.

This AWS CLI command should work:

aws s3 rm s3://<your_bucket_name> --exclude "*" --include "<your_regex>"

If you want to include sub-folders, add the flag --recursive.

Or with Unix commands:

aws s3 ls s3://<your_bucket_name>/ | awk '{print $4}' | xargs -I% <your_os_shell> -c 'aws s3 rm s3://<your_bucket_name>/%'

Explanation:

  1. List all files in the bucket --pipe-->
  2. Get the 4th parameter (it's the file name) --pipe--> // you can replace it with a Linux command to match your pattern
  3. Run the delete script with the AWS CLI.
0

The following worked for me (based on an example for a Django model, but you can pretty much use the code of the delete method on its own).

from boto3.session import Session
from django.conf import settings
from django.db import models
from django.utils import timezone
from taggit.managers import TaggableManager  # assuming django-taggit for TaggableManager

class Video(models.Model):
    title=models.CharField(max_length=500)
    description=models.TextField(default="")
    creation_date=models.DateTimeField(default=timezone.now)
    videofile=models.FileField(upload_to='videos/', null=True, verbose_name="")
    tags = TaggableManager()

    actions = ['delete']

    def __str__(self):
        return self.title + ": " + str(self.videofile)

    def delete(self, *args, **kwargs):
        session = Session(settings.AWS_ACCESS_KEY_ID, settings.AWS_SECRET_ACCESS_KEY)
        s3_resource = session.resource('s3')
        s3_bucket = s3_resource.Bucket(settings.AWS_STORAGE_BUCKET_NAME)

        file_path = "media/" + str(self.videofile)
        response = s3_bucket.delete_objects(
            Delete={
                'Objects': [
                    {
                        'Key': file_path
                    }
                ]
            })
        super(Video, self).delete(*args, **kwargs)
0

S3 connection with credentials:

import boto3

s3 = boto3.resource('s3', endpoint_url='',
    aws_access_key_id='',
    aws_secret_access_key='')

Delete an existing key:

s3.Object('BUCKET_NAME', 'FILE_NAME_AS_KEY').delete()