
I have a very basic Python script in AWS Lambda. After a few tries I got results, and now I am trying to scan the data to determine whether any of the "LastModified" dates are more than 4 hours old (based on the current date and time).

Is there any simple way to do that?

import boto3
import os
from datetime import datetime

def lambda_handler(event, context):
   s3 = boto3.client('s3')
   bucket = 'mybucket'
   resp = s3.list_objects_v2(Bucket=bucket, Prefix='JSON/')
   print(resp['Contents'])

Here is a sample of the response (a list of dicts):

[{'Key': 'JSON/File1.json', 'LastModified': datetime.datetime(2019, 5, 28, 18, 11, 42, tzinfo=tzlocal()), 'ETag': '"d41d8cd98f00b204e9800998ecf8427e"', 'Size': 0, 'StorageClass': 'STANDARD'}, {'Key': 'JSON/File2.json', 'LastModified': datetime.datetime(2020, 8, 6, 12, 55, 9, tzinfo=tzlocal()), 'ETag': '"e8534a11ac08968619c05e28641a09b8"', 'Size': 7600141, 'StorageClass': 'STANDARD'}, {'Key': 'JSON/File3.json', 'LastModified': datetime.datetime(2020, 8, 6, 12, 56, 9, tzinfo=tzlocal()), 'ETag': '"bac4bfc4daa1f4a4982b9ec0c5f11c62"', 'Size': 38430159, 'StorageClass': 'STANDARD'}]
MisterNox
user2232552
  • I have never worked with AWS but if you give me an example of the "lastmodified" variable and its type (e.g. string, datetime) I can give you an example of checking if it was modified within the last 4 hours – MisterNox Aug 06 '20 at 13:12
  • It looks like this when I print the contents of the result: 'LastModified': datetime.datetime(2019, 5, 28, 18, 11, 42, tzinfo=tzlocal()). However, if I just show the object with str(obj['LastModified']) it looks like this: 2020-08-06 12:55:35+00:00. It is definitely a datetime type. Ideally I want to get only the results that are over 4 hours old, with the "key" and how old they are, so that I can use them to send an SNS message. If there are no results over 4 hours old, then do nothing. Thanks for your help – user2232552 Aug 06 '20 at 13:33
  • I was actually able to get the difference between the current date and time and the LastModified date, but I don't know what to do with the results: time = datetime.now(timezone.utc), then str(time - obj['LastModified']) – user2232552 Aug 06 '20 at 13:36
  • If you want to filter all entries that are older than 4 hours, it is important for me to see the obj dict structure. I need to see the keys and how they are ordered to give you a working example. Please add the dict, or at least 2 or 3 elements, to your question, like `obj = {...}` or, if it is a list of dicts, `obj = [{...}, ... , {...}]` – MisterNox Aug 06 '20 at 13:45
  • OK, I added the results. Let me know if that helps – user2232552 Aug 06 '20 at 13:49
  • Added my solution – MisterNox Aug 06 '20 at 14:09

1 Answer


This should work for the list of dicts you showed me. At first I had problems with tzlocal(): Python cannot compare a naive datetime with a timezone-aware one, so I had to set the tzinfo of my datetime.now() object to tzlocal() as well. Then it worked. Hope it helps you:

import datetime
from dateutil.tz import tzlocal

data = [
    {'Key': 'JSON/File1.json', 'LastModified': datetime.datetime(2019, 5, 28, 18, 11, 42, tzinfo=tzlocal()), 'ETag': '"d41d8cd98f00b204e9800998ecf8427e"', 'Size': 0, 'StorageClass': 'STANDARD'},
    {'Key': 'JSON/File2.json', 'LastModified': datetime.datetime(2020, 8, 6, 12, 55, 9, tzinfo=tzlocal()), 'ETag': '"e8534a11ac08968619c05e28641a09b8"', 'Size': 7600141, 'StorageClass': 'STANDARD'},
    {'Key': 'JSON/File3.json', 'LastModified': datetime.datetime(2020, 8, 6, 12, 56, 9, tzinfo=tzlocal()), 'ETag': '"bac4bfc4daa1f4a4982b9ec0c5f11c62"', 'Size': 38430159, 'StorageClass': 'STANDARD'}
]

filtered = list(filter(lambda x: x["LastModified"] < (datetime.datetime.now().replace(tzinfo=tzlocal()) - datetime.timedelta(hours=4)), data))

print(filtered)
#[{'Key': 'JSON/File1.json', 'LastModified': datetime.datetime(2019, 5, 28, 18, 11, 42, tzinfo=tzlocal()), 'ETag': '"d41d8cd98f00b204e9800998ecf8427e"', 'Size': 0, 'StorageClass': 'STANDARD'}]
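
The comparison can also be made against an aware UTC timestamp, which avoids the replace(tzinfo=...) step and the dateutil import. The following is only a minimal sketch, not part of the answer above: the function name `find_stale` and its `now` parameter are hypothetical, and it assumes every `LastModified` value is timezone-aware, as in the sample response. It returns each key together with its age, which is what the question asks for in order to build an SNS message.

```python
from datetime import datetime, timedelta, timezone


def find_stale(objects, max_age=timedelta(hours=4), now=None):
    """Return (key, age) pairs for objects whose LastModified is older than max_age.

    `find_stale` and `now` are illustrative names; `now` exists only so the
    function can be exercised with a fixed clock.
    """
    if now is None:
        now = datetime.now(timezone.utc)  # aware "current time" in UTC
    stale = []
    for obj in objects:
        age = now - obj["LastModified"]  # aware minus aware gives a timedelta
        if age > max_age:
            stale.append((obj["Key"], age))
    return stale


# Sample data shaped like the response in the question (UTC is assumed here).
data = [
    {"Key": "JSON/File1.json",
     "LastModified": datetime(2019, 5, 28, 18, 11, 42, tzinfo=timezone.utc)},
    {"Key": "JSON/File2.json",
     "LastModified": datetime(2020, 8, 6, 12, 55, 9, tzinfo=timezone.utc)},
]

# A fixed clock keeps the example reproducible; drop `now=` in real use.
fixed_now = datetime(2020, 8, 6, 14, 0, 0, tzinfo=timezone.utc)
for key, age in find_stale(data, now=fixed_now):
    print(f"{key} is {age} old")
```

With the fixed clock only JSON/File1.json is reported, because JSON/File2.json was modified about an hour before `fixed_now`. An empty result means there is nothing to send.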
MisterNox
  • For some reason, when I try to use my actual response data instead of the pre-defined list, I get this error: "type object 'datetime.datetime' has no attribute 'datetime'" – user2232552 Aug 06 '20 at 15:40
  • That's because you import `from datetime import datetime`. If you want to keep it that way, change the import to `from datetime import datetime, timedelta` and then delete one `datetime` in my filter function to get this lambda: `lambda x: x["LastModified"] < (datetime.now().replace(tzinfo=tzlocal()) - timedelta(hours=4))` – MisterNox Aug 06 '20 at 16:07
  • I made these adjustments and now I am getting this error: "filter expected 2 arguments, got 1". This is how the full filter line looks: filtered = list(filter(lambda x: x["LastModified"] < (datetime.now().replace(tzinfo=tzlocal()) - timedelta(hours=4),data))) – user2232552 Aug 06 '20 at 16:17
  • You made a mistake with the brackets. Try this one instead: `filtered = list(filter(lambda x: x["LastModified"] < (datetime.now().replace(tzinfo=tzlocal()) - timedelta(hours=4)), data))` – MisterNox Aug 06 '20 at 16:19
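
The two errors in this thread (the import style and the misplaced bracket) can both be sidestepped by importing the names directly and using a list comprehension instead of filter with a lambda. The sketch below is one possible shape, under stated assumptions: `stale_objects` is a hypothetical helper, not a boto3 function, and the `resp` dict is a hand-built stand-in for the result of `s3.list_objects_v2(Bucket=bucket, Prefix='JSON/')`. It reads the list with `resp.get("Contents", [])` because the response has no `Contents` key when nothing matches the prefix.

```python
from datetime import datetime, timedelta, timezone


def stale_objects(resp, hours=4, now=None):
    """Keep the entries of a list_objects_v2-style response older than `hours`.

    `stale_objects` is a hypothetical helper name used for illustration.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=hours)
    # "Contents" is missing when no key matches the prefix, so default to [].
    return [obj for obj in resp.get("Contents", []) if obj["LastModified"] < cutoff]


# Stand-in for: resp = s3.list_objects_v2(Bucket=bucket, Prefix='JSON/')
resp = {
    "Contents": [
        {"Key": "JSON/File1.json",
         "LastModified": datetime(2019, 5, 28, 18, 11, 42, tzinfo=timezone.utc)},
        {"Key": "JSON/File3.json",
         "LastModified": datetime(2020, 8, 6, 12, 56, 9, tzinfo=timezone.utc)},
    ]
}

# Fixed clock for a reproducible example; omit `now=` inside the handler.
now = datetime(2020, 8, 6, 14, 0, tzinfo=timezone.utc)
old = stale_objects(resp, now=now)
print([obj["Key"] for obj in old])
```

Computing the cutoff once, outside the comprehension, also means every object is compared against the same instant rather than a fresh datetime.now() per element.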