I have one bucket in AWS that has files regularly uploaded to it. There is a policy that this bucket cannot have lifecycle rules attached.

I'm looking for a Lambda function that will remove objects older than 2 weeks. I know timedelta can be used for comparing dates, but I can't figure out how to use it to check whether an object is over 2 weeks old (I'm new to Python).

So far I have:

import boto3
import datetime

s3 = boto3.resource('s3')

now = datetime.datetime.now()
now_format = int(now.strftime("%d%m%Y"))
print(f'it is now {now_format}')

# Get bucket object
my_bucket = s3.Bucket('cost-reports')
all_objects = my_bucket.objects.all()

for each_object in all_objects:
    obj_int = int(each_object.last_modified.strftime('%d%m%Y'))

    print("The object {} was last modified on the {}".format(
        each_object.key, obj_int))

So this just compares the strftime output, but will this actually work as well? Or do I have to use timedelta, and how would that look?

John Rotenstein
scrow
  • Just as a general observation: You're implementing a potentially very costly workaround instead of using a very cheap and simple feature, maybe it's more prudent to challenge the policy that no lifecycle rules can be attached. – Maurice Jun 21 '21 at 09:02
  • Your function is missing a lambda handler. – Marcin Jun 21 '21 at 09:14
  • hi @Marcin, was just testing this locally first, hence no handler but thanks for checking :) – scrow Jun 22 '21 at 08:10
  • Hi @Maurice, I have taken your advice and asked for a change. Looks like our account will be given a dedicated bucket for these reports that can have a lifecycle rule. Thanks for suggesting that :) – scrow Jun 22 '21 at 08:10

2 Answers

1

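Your each_object.last_modified is a datetime object, just like now. The difference is that it is timezone-aware, so now has to be timezone-aware as well before the two can be subtracted.

So to calculate the number of days since the last modification, it should be as simple as: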

now = datetime.datetime.now().astimezone()
last_modified_days_ago = (now - each_object.last_modified).days
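
As a rough sketch of the same idea outside of S3, the snippet below uses fixed sample timestamps in place of real objects. The helper names age_in_days and ddmmyyyy_as_int are made up for illustration, and the dates are arbitrary. It also shows why the %d%m%Y integers from the question do not sort in date order, which is the reason to subtract datetimes instead.

```python
from datetime import datetime, timezone


def age_in_days(last_modified, now=None):
    """Return the whole days elapsed since a timezone-aware timestamp."""
    if now is None:
        # Aware "now", so it can be subtracted from S3's aware timestamps.
        now = datetime.now(timezone.utc)
    return (now - last_modified).days


def ddmmyyyy_as_int(moment):
    """The integer form used in the question, kept only to show the pitfall."""
    return int(moment.strftime("%d%m%Y"))


# Fixed sample values so the result does not depend on today's date.
now = datetime(2021, 6, 21, 9, 0, tzinfo=timezone.utc)
old_upload = datetime(2021, 6, 1, 9, 0, tzinfo=timezone.utc)
recent_upload = datetime(2021, 6, 15, 9, 0, tzinfo=timezone.utc)

print(age_in_days(old_upload, now))       # 20
print(age_in_days(recent_upload, now))    # 6
print(age_in_days(old_upload, now) > 14)  # True, so this one is over 2 weeks old

# 1 July 2021 becomes 1072021 and 21 June 2021 becomes 21062021,
# so the later date ends up as the smaller integer.
later = ddmmyyyy_as_int(datetime(2021, 7, 1))
earlier = ddmmyyyy_as_int(datetime(2021, 6, 21))
print(later < earlier)                    # True
```

Passing a naive datetime (one without tzinfo) as now would raise the "can't subtract offset-naive and offset-aware datetimes" TypeError, which is why the default uses timezone.utc.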
Marcin
  • thank you Marcin, so when I try that inside the for loop, I get back the error: "TypeError: can't subtract offset-naive and offset-aware datetimes". Do you happen to know what this is? – scrow Jun 21 '21 at 09:42
  • @scrow I updated the answer based on https://stackoverflow.com/a/64860559/248823 – Marcin Jun 21 '21 at 10:21
  • fantastic, thank you very much @Marcin, that has worked – scrow Jun 22 '21 at 08:07
0

You can use:

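from datetime import datetime, timedelta
from dateutil.tz import tzutc

...

# Anything last modified before this moment is more than 14 days old
cutoff = datetime.now(tzutc()) - timedelta(days=14)

for obj in bucket.objects.all():
    # "<" selects objects older than the cutoff; ">" would select the recent ones
    if obj.last_modified < cutoff:
        pass  # do something here, e.g. obj.delete()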

Code copied from: Enhance Python script to download Amazon S3 files created in last 24 hours
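
Putting the pieces together, a Lambda function for the question might look roughly like the sketch below. Treat it as an untested outline rather than a finished implementation: the bucket name is the one from the question, the helper names expired_keys and batches are made up for illustration, and the returned dictionary is just an arbitrary summary. It uses timezone.utc from the standard library instead of dateutil, and deletes through the bucket's delete_objects call, which takes up to 1000 keys per request.

```python
from datetime import datetime, timedelta, timezone

BUCKET_NAME = "cost-reports"   # bucket name taken from the question
MAX_AGE = timedelta(days=14)
BATCH_SIZE = 1000              # delete_objects accepts at most 1000 keys per request


def expired_keys(objects, cutoff):
    """Return the keys of objects last modified strictly before the cutoff."""
    return [obj.key for obj in objects if obj.last_modified < cutoff]


def batches(items, size):
    """Yield consecutive slices of items, each with at most size elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def lambda_handler(event, context):
    # Imported here so the helpers above can be tried out without AWS access.
    import boto3

    bucket = boto3.resource("s3").Bucket(BUCKET_NAME)
    cutoff = datetime.now(timezone.utc) - MAX_AGE

    keys = expired_keys(bucket.objects.all(), cutoff)
    for batch in batches(keys, BATCH_SIZE):
        bucket.delete_objects(Delete={"Objects": [{"Key": key} for key in batch]})

    # Number of keys sent for deletion; per-key errors are not inspected here.
    return {"delete_requested": len(keys)}
```

The function's role would need s3:ListBucket on the bucket and s3:DeleteObject on its objects, and it could be run on a schedule. As noted in the comments on the question, a lifecycle rule is the cheaper option wherever one is allowed.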

John Rotenstein