I am trying to archive old files based on their creation date. My data starts on 12-17-2010, so I am using that as the base date and incrementing from there. Here is my code:

import os, time, tarfile
from datetime import datetime, date, timedelta
import datetime

path = "/home/appins/.scripts/test/"
count = 0
set_date = '2010-12-17'
date = datetime.datetime.strptime(set_date, '%Y-%m-%d')

while (count < 2):
    date += datetime.timedelta(days=1)
    tar_file = "nas_archive_"+date.strftime('%m-%d-%y')+".tgz"
    log_file = "archive_log_"+date.strftime('%m-%d-%y')
    fcount = 0
    f = open(log_file,'ab+')
    #print date.strftime('%m-%d-%y')
    for root, subFolders, files in os.walk(path):
        for file in files:
            file = os.path.join(root,file)
            file = os.path.join(path, file)
            filecreation = os.path.getctime(file)
            print datetime.fromtimestamp(filecreation)," File Creation Date"
            print date.strftime('%m-%d-%y')," Base Date"
            if filecreation == date:
                tar.add(file)
                f.write(file + '\n')
                print file," is of matching date"
                fcount = fcount + 1
    f.close()
    count += 1

The `filecreation` variable holds a float value. How can I compare it with my base date?

user2501825
  • For the record, `ctime` is not file creation date... – twalberg Aug 07 '13 at 16:36
  • `count < 2` ... this will only compare two days. – tdelaney Aug 07 '13 at 16:56
  • Yes. I am starting with a small set first. If the code starts working I will increase it further. – user2501825 Aug 07 '13 at 17:07
  • Only do os.walk() once and create a list of (date, filename) pairs. Then you can sort that list and run through the entries without turning your hard disk to dust. (Hyperbole there, it's cached, but still...) – tdelaney Aug 07 '13 at 17:07
  • And since I'm being naggy, consider using `'%Y-%m-%d'` for your file names so that they are easy to sort if you want to view them later. – tdelaney Aug 07 '13 at 17:10
  • Since the application writes data in %m-%d-%y format, I have to stick to it. I will evaluate the os.walk() option. – user2501825 Aug 07 '13 at 21:00
  • @tdelaney I have used os.walk and created a list of tuples containing each filename and creation date. I am planning to put the dates in a separate list and compare it with the original list to create date-based archives. Is there an easier way you can recommend, or will this work? – user2501825 Aug 09 '13 at 14:02
  • @user2501825 - On Linux, ctime is the metadata change time, not the creation time. A chmod will change it. But generally, it should work. Since you want to reference by filename, consider a dict with the filename as key and a tuple/list/class as value. That makes the file lookup much faster than scanning the list. Some people collect size, mtime and even a hash of the file. – tdelaney Aug 09 '13 at 15:16
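
The single-walk approach suggested in the comments above might look roughly like the sketch below. It is only an illustration: the helper names `group_files_by_day` and `archive_by_day` and the `out_dir` parameter are made up here, and it assumes `out_dir` lies outside the tree being walked. It still relies on `os.path.getctime`, so on Linux the files are grouped by metadata change time, not true creation time.

```python
import os
import tarfile
from collections import defaultdict
from datetime import date


def group_files_by_day(root_dir):
    """Walk root_dir once and map each calendar day to the files stamped on it."""
    files_by_day = defaultdict(list)
    for root, _dirs, names in os.walk(root_dir):
        for name in names:
            full_path = os.path.join(root, name)
            # Truncate the float timestamp to a date so files from the same day group together.
            day = date.fromtimestamp(os.path.getctime(full_path))
            files_by_day[day].append(full_path)
    return files_by_day


def archive_by_day(root_dir, out_dir):
    """Write one nas_archive_<mm-dd-yy>.tgz per day and return the archive paths."""
    created = []
    # The walk finishes before any archive is written, so new archives are never re-scanned.
    for day, paths in sorted(group_files_by_day(root_dir).items()):
        tar_name = "nas_archive_" + day.strftime("%m-%d-%y") + ".tgz"
        tar_path = os.path.join(out_dir, tar_name)
        with tarfile.open(tar_path, "w:gz") as tar:
            for path in sorted(paths):
                tar.add(path)
        created.append(tar_path)
    return created
```

Grouping into a dict keyed by date means the directory tree is read once, however many days are archived, instead of once per day as in the loop from the question.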

1 Answer

timestamp = time.mktime(date.timetuple())

`timestamp` will then hold a value comparable to those returned by `getctime`. Regarding the comment under the question: on Windows `getctime` returns the creation time; on Unix it returns the time of the last metadata change (http://docs.python.org/3.1/library/os.path.html).

EDIT (regarding questions in comment):

1) mktime is present in Python 2.x: http://docs.python.org/2/library/time.html#time.mktime

2) See "Get file creation time with Python on linux".

EDIT2:

Obviously this is stupid, and one should proceed as suggested by tdelaney below:

date.fromtimestamp(filecreation)

and compare dates, not timestamps. I wasn't looking at what the algorithm was actually doing :)
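
A minimal sketch of that date-based comparison, assuming the base date from the question. The names `base_day` and `same_day` are illustrative and not part of the original code:

```python
import os
from datetime import date, datetime

# Base day from the question, reduced from a datetime to a plain date.
base_day = datetime.strptime("2010-12-17", "%Y-%m-%d").date()


def same_day(timestamp, day):
    """Return True if a float timestamp (e.g. from os.path.getctime) falls on `day`."""
    # date.fromtimestamp drops the time of day, so any moment within the day matches.
    return date.fromtimestamp(timestamp) == day


def file_is_from_day(path, day):
    """Compare a file's ctime against a calendar day."""
    return same_day(os.path.getctime(path), day)
```

Comparing the raw float against a timestamp would only match at the exact same instant, which is why the float is reduced to a date first.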

BartoszKP
  • Is it applicable to Python 2.6 also? On Unix, which one shall I use to get the creation date? – user2501825 Aug 07 '13 at 16:44
  • I think you want to do the opposite - convert ctime into a date object. If you compare timestamp to ctime, you'll only get a match if it's the same microsecond (unlikely). If you convert ctime to date, you'll match for anything in the same day. – tdelaney Aug 07 '13 at 16:54
  • Yes, that is what I am looking for. Is there a way I can achieve that? The filecreation variable contains a float value, so if I can convert that to a date then I can compare it with my base date. – user2501825 Aug 07 '13 at 17:02
  • First, don't use 'date' as a variable... it masks the date class from datetime. Then, just do `my_date = date.fromtimestamp(filecreation)`. Or, since you are using formatted strings anyway, convert it to a string `my_date = date.fromtimestamp(filecreation).strftime('%m-%d-%y')`. – tdelaney Aug 07 '13 at 17:06