
I am a beginner with Python and have written a Python script that takes a snapshot of a specified volume and then retains only the requested number of snapshots for that volume.

#Built with Python 3.3.2
import boto.ec2
from boto.ec2.connection import EC2Connection
from boto.ec2.regioninfo import RegionInfo
from boto.ec2.snapshot import Snapshot
from datetime import datetime
from functools import cmp_to_key
import sys

aws_access_key = str(input("AWS Access Key: "))
aws_secret_key = str(input("AWS Secret Key: "))
regionname = str(input("AWS Region Name: "))
regionendpoint = str(input("AWS Region Endpoint: "))
region = RegionInfo(name=regionname, endpoint=regionendpoint)
conn = EC2Connection(aws_access_key_id = aws_access_key, aws_secret_access_key = aws_secret_key, region = region)
print (conn)

volumes = conn.get_all_volumes()
print ("%s" % repr(volumes))

vol_id = str(input("Enter Volume ID to snapshot: "))
keep = int(input("Enter number of snapshots to keep:  "))
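# Use the volume whose ID the user entered (volumes[0] ignored vol_id);
# raises StopIteration if no volume has that ID
volume = next(v for v in volumes if v.id == vol_id)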
description = str(input("Enter volume snapshot description: "))


if volume.create_snapshot(description):
    print ('Snapshot created with description: %s' % description)

snapshots = volume.snapshots()
print (snapshots)

def date_compare(snap1, snap2):
    if snap1.start_time < snap2.start_time:
        return -1
    elif snap1.start_time == snap2.start_time:
        return 0
    return 1

snapshots.sort(key=cmp_to_key(date_compare))
delta = len(snapshots) - keep
for i in range(delta):
    print ('Deleting snapshot %s' % snapshots[i].description)
    snapshots[i].delete()

Now, rather than specifying the number of snapshots to keep, I want to specify a cutoff date and time and delete every snapshot older than it. Based on the script above, I already have the list of snapshots sorted by date. I would like to prompt the user for the date and time from which snapshots should be deleted, e.g. 2015-3-4 14:00:00, so that anything older than this is deleted. I'm hoping someone can get me started here.

Thanks!!

stickells

2 Answers

1

First, you can prompt the user for the date and time before which snapshots should be deleted.

import datetime
user_time = str(input("Enter the cutoff date and time (format: 2015-3-4 14:00:00): "))
real_user_time = datetime.datetime.strptime(user_time, '%Y-%m-%d %H:%M:%S')
print(real_user_time)  # the user's input is now a datetime object instead of a string

Second, delete anything older than that.

SOLUTION ONE:

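for snap in snapshots:
    # start_time looks like 2015-03-04T06:35:18.000Z; drop the trailing ".000Z" before parsing
    start_time = datetime.datetime.strptime(snap.start_time[:-5], '%Y-%m-%dT%H:%M:%S')
    if start_time < real_user_time:  # older than the cutoff
        snap.delete()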

SOLUTION TWO:

Since snapshots is sorted from oldest to newest, you can delete from the start of the list and stop at the first snapshot that is not older than real_user_time.

for snap in snapshots:
    # start_time is a string, so convert it to a datetime object first, as above
    start_time = datetime.datetime.strptime(snap.start_time[:-5], '%Y-%m-%dT%H:%M:%S')
    if start_time >= real_user_time:
        break  # every remaining snapshot is newer than the cutoff
    snap.delete()
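
A self-contained sketch of deleting snapshots older than a cutoff, runnable without AWS credentials. `FakeSnapshot`, `parse_start_time`, and `delete_older_than` are hypothetical names used only for illustration. The sketch assumes that `start_time` is a UTC string such as `2015-03-04T06:35:18.000Z`, as reported in the comments, and that the cutoff is also given in UTC.

```python
from datetime import datetime

# Assumed timestamp format, e.g. 2015-03-04T06:35:18.000Z
AWS_TIME_FORMAT = '%Y-%m-%dT%H:%M:%S.%fZ'


def parse_start_time(text):
    """Turn an AWS-style timestamp string into a naive datetime (UTC)."""
    return datetime.strptime(text, AWS_TIME_FORMAT)


class FakeSnapshot(object):
    """Hypothetical stand-in for a boto snapshot, so the sketch runs offline."""

    def __init__(self, start_time):
        self.start_time = start_time
        self.deleted = False

    def delete(self):
        self.deleted = True


def delete_older_than(snapshots, cutoff):
    """Call delete() on every snapshot started strictly before cutoff."""
    deleted = []
    for snap in snapshots:
        if parse_start_time(snap.start_time) < cutoff:
            snap.delete()
            deleted.append(snap)
    return deleted


snapshots = [
    FakeSnapshot('2015-03-04T12:00:00.000Z'),
    FakeSnapshot('2015-03-04T15:00:00.000Z'),
    FakeSnapshot('2015-03-04T15:15:00.000Z'),
]
cutoff = datetime.strptime('2015-3-4 15:10:00', '%Y-%m-%d %H:%M:%S')
removed = delete_older_than(snapshots, cutoff)
print([s.start_time for s in removed])
```

With a cutoff of 15:10:00, the 12:00 and 15:00 snapshots are deleted and the 15:15 one is kept. Returning the deleted snapshots makes it easy to log or double-check what was removed.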

Hope it helps. :)

Stephen Lin
  • Additionally----since the list is sorted, use binary search for positioning `real_user_time` and strip off all later snapshots. – Barun Sharma Mar 04 '15 at 06:57
  • @BarunSharma Yes, that's reasonable. Check my update. – Stephen Lin Mar 04 '15 at 07:08
  • OK, super close but stuck again. As you commented, I do need to format `snap.start_time` as a datetime object, since I get the following error when running the script: `if snap.start_time > real_user_time: TypeError: unorderable types: str() > datetime.datetime()`. I'm not quite sure how to format snap.start_time as a datetime object. I tried a few things but none seem to work; pretty sure I am doing something really simple wrong. Thanks! – stickells Mar 04 '15 at 07:49
  • @stickells Use print snapshots[i].start_time to show the format of start time string. Tell me. – Stephen Lin Mar 04 '15 at 07:51
  • Apologies, but I am still not quite there. `print snapshots[i].start_time` gives the output `2015-03-04T06:35:18.000Z`, and the script fails with `if snapshots[i].start_time > real_user_time: TypeError: unorderable types: str() > datetime.datetime()`. Thanks for the help!! – stickells Mar 04 '15 at 08:22
  • OK, so this now works, but the main problem is that it deletes the latest snapshots and not the oldest. E.g. there are 5 snapshots with start_time from 12:00 through to 15:00, and the script runs and creates a new snapshot with start_time of 15:15. I want to delete the older snapshots, so I enter 15:10:00 as the time, but it then deletes the newest and keeps the old snapshots. I tried changing `if start_time > real_user_time:` to `if start_time < real_user_time:`, but that caused all of the snapshots to be deleted. Sure I am missing something again here, sorry for the basic questions, still learning!! – stickells Mar 04 '15 at 09:40
  • @stickells I think code in SOLUTION ONE is easy for you. Check my update. – Stephen Lin Mar 04 '15 at 11:55
  • @stickells: what does `print(repr(snapshots[i].start_time))` show? – jfs Mar 04 '15 at 12:41
0

Be careful: make sure to normalize the start time values (e.g., convert them to UTC). It doesn't make sense to compare a time in the user's local time zone with a time in whatever time zone the server uses. The local time zone may also have different UTC offsets at different times. See Find if 24 hrs have passed between datetimes - Python.
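
To make the pitfall concrete, here is a minimal sketch. It assumes, purely for illustration, a user who is 8 hours ahead of UTC and a server that reports start times in UTC; the variable names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption for illustration: the user's zone is a fixed 8 hours ahead of UTC.
user_zone = timezone(timedelta(hours=8))

# The cutoff the user typed, interpreted in the user's own zone.
local_cutoff = datetime(2015, 3, 4, 14, 0, 0, tzinfo=user_zone)

# The same instant expressed in UTC.
utc_cutoff = local_cutoff.astimezone(timezone.utc)
print(utc_cutoff.isoformat())

# A snapshot start time reported in UTC (the trailing Z).
snap_start = datetime.strptime(
    '2015-03-04T06:35:18.000Z', '%Y-%m-%dT%H:%M:%S.%fZ'
).replace(tzinfo=timezone.utc)

# Correct comparison: both values are timezone-aware.
aware_result = snap_start < utc_cutoff
# Misleading comparison: the zone information is thrown away on both sides.
naive_result = snap_start.replace(tzinfo=None) < local_cutoff.replace(tzinfo=None)
print(aware_result, naive_result)
```

The aware comparison says the snapshot is newer than the cutoff, while the naive one wrongly treats it as older, which is the kind of mismatch that can cause more snapshots to be deleted than intended.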

If all dates are in UTC then you could sort the snapshots as:

from operator import attrgetter

snapshots.sort(key=attrgetter('start_time'))

If snapshots is sorted, you could "delete anything older than a specific date & time" using the bisect module:

from bisect import bisect

class Seq(object):
    def __init__(self, seq):
        self.seq = seq
    def __len__(self):
        return len(self.seq)
    def __getitem__(self, i):
        return self.seq[i].start_time

del snapshots[:bisect(Seq(snapshots), given_time)]

This removes all snapshots with start_time <= given_time from the list.
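
A runnable sketch of the bisect approach on made-up data. `Snap` and `StartTimes` are hypothetical stand-ins (the latter plays the role of the wrapper class above), and the start times are assumed to be UTC strings in one fixed format, so that they sort correctly as plain strings.

```python
from bisect import bisect
from collections import namedtuple

# Hypothetical stand-in for a snapshot object; only start_time matters here.
Snap = namedtuple('Snap', 'start_time')


class StartTimes(object):
    """Read-only view exposing each snapshot's start_time to bisect."""

    def __init__(self, seq):
        self.seq = seq

    def __len__(self):
        return len(self.seq)

    def __getitem__(self, i):
        return self.seq[i].start_time


# Already sorted from oldest to newest.
snapshots = [Snap(t) for t in (
    '2015-03-04T12:00:00.000Z',
    '2015-03-04T13:00:00.000Z',
    '2015-03-04T15:00:00.000Z',
    '2015-03-04T15:15:00.000Z',
)]
given_time = '2015-03-04T15:00:00.000Z'

# Index of the first snapshot with start_time > given_time.
split = bisect(StartTimes(snapshots), given_time)
old, kept = snapshots[:split], snapshots[split:]
print(len(old), len(kept))
```

Slicing into `old` and `kept` instead of using `del` keeps the old snapshot objects available. Removing items from the Python list does not delete anything on AWS, so `.delete()` would still have to be called on each element of `old`.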


You could also remove older snapshots without sorting:

snapshots[:] = [s for s in snapshots if s.start_time > given_time]

If you want to call the .delete() method explicitly without changing the snapshots list:

for s in snapshots:
    if s.start_time <= given_time:
        s.delete()

If s.start_time is a string in the 2015-03-04T06:35:18.000Z format, then given_time should be in that format too (the Z means that the time is in UTC). If the user works in a different time zone, convert the time before the comparison (str -> datetime -> datetime in UTC -> str). If given_time is already a string in the correct format, you can compare the strings directly without converting them to datetime first.
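
The str -> datetime -> datetime in UTC -> str chain could look roughly like this. `to_aws_utc_string` is a hypothetical helper, and the fixed UTC offset is an assumption made for illustration; a real script would need proper time zone handling, including daylight saving time.

```python
from datetime import datetime, timedelta, timezone


def to_aws_utc_string(local_text, utc_offset_hours):
    """Convert 'YYYY-M-D HH:MM:SS' in a fixed-offset zone to an AWS-style UTC string."""
    # str -> naive datetime
    local = datetime.strptime(local_text, '%Y-%m-%d %H:%M:%S')
    # naive datetime -> aware datetime in the user's (assumed) zone
    aware = local.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    # aware datetime -> datetime in UTC
    utc = aware.astimezone(timezone.utc)
    # datetime in UTC -> str in the same shape as start_time
    return utc.strftime('%Y-%m-%dT%H:%M:%S') + '.000Z'


# Assumed example: a user 5 hours behind UTC enters 14:00 local time.
given_time = to_aws_utc_string('2015-3-4 14:00:00', -5)
print(given_time)

# Fixed-width, zero-padded timestamps compare correctly as plain strings.
is_old = '2015-03-04T06:35:18.000Z' <= given_time
print(is_old)
```

The string comparison is only safe because every field is zero-padded and appears in most-significant-first order, so lexicographic order matches chronological order.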

jfs