0

I'm trying to count how many files are very young, young, old and very old in a directory passed on the command line. I'm struggling to count the files in the directory and then to increment the right counter depending on the age of each file.

Here is what I have so far:

import sys
import os, os.path
import time

x = 7
file_count = 0
DIR = sys.argv[1]
age = 0
age_sum = 0
nb_vy = 0
nb_y = 0
nb_o = 0
nb_vo = 0
now = time.time()

for root, subFolders, files in os.walk(DIR):
    for file in (files):
        try:
            file_count += 1
            # here I want to do some if to add 1 in the counter nb_vy/nb_y/nb_o/nb_vo but I don't know how  
            age=now - os.stat(os.path.join(root,file)).st_mtime
            age_sum+=age
            if  now-timedelta(hours=24) <= age <= now :
                nb_vy += 1
            elif now-timedelta(days=7) <= age :
                nb_y += 1
            elif now-timedelta(days=30) <= age :
                nb_o += 1
            else:
                nb_vo += 1
        except Exception:
            pass


print("Filecount = %s" % file_count)


print("--------------------------------------------------------------------\n")
print("Scanned:  "+  str(file_count) +" files\n")
print("Average age:  "+  str(age/age_sum) + "\n")
print("--------------------------------------------------------------------\n")
print("Very young files (<=  1  day) |  "+  str(nb_vy/file_count) +"% (" + str(nb_vy) + ")\n")
print("Young files      (<=  1 week) |  "+  str(nb_y/file_count) +"% (" + str(nb_v) + ")\n")
print("Old files        (<= 30 days) |  "+  str(nb_o/file_count) +"% (" + str(nb_o) + ")\n")
print("Very old files   (>  30  days |  "+  str(nb_vo/file_count) +"% (" + str(nb_vo) + ")\n")
print("--------------------------------------------------------------------\n")

How can I write the if cascade so that it increments the right counter?

J.erome
  • I feel like the part that is causing trouble is `age = now - os.stat(file).st_mtime`, but I don't know why. I fixed it with `age=now - os.stat(os.path.join(root,file)).st_mtime`, but it's not very clean – J.erome Dec 18 '19 at 10:09
  • `os.path.join(root,file)` is correct - otherwise you only get the filename, and `os.stat()` then searches for the file in [the current working directory](https://en.wikipedia.org/wiki/Working_directory). – bruno desthuilliers Dec 18 '19 at 10:24
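
To illustrate the point made in the comments, here is a small sketch showing that `os.walk` yields bare file names, which only resolve reliably once joined with `root`. The temporary directory layout and the file name are made up for the demonstration.

```python
import os
import tempfile

# Build a throwaway tree: <tmp>/sub/example.txt (names are illustrative only).
base = tempfile.mkdtemp()
os.makedirs(os.path.join(base, "sub"))
with open(os.path.join(base, "sub", "example.txt"), "w") as fh:
    fh.write("hello")

found = []
for root, sub_folders, files in os.walk(base):
    for name in files:
        full_path = os.path.join(root, name)
        # `name` alone is resolved against the current working directory,
        # so os.stat(name) would usually fail or hit the wrong file.
        found.append((name, full_path, os.path.exists(full_path)))

print(found)
```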

2 Answers

4

You had the sides of the comparison swapped, and the unneeded `now -` was still there. Once those are fixed and the timedelta is converted to a duration in seconds for the comparison:

        # needs `from datetime import timedelta` at the top of the script
        if age <= timedelta(hours=24).total_seconds():
            nb_vy += 1
        elif age <= timedelta(days=7).total_seconds():
            nb_y += 1
        elif age <= timedelta(days=30).total_seconds():
            nb_o += 1
        else:
            nb_vo += 1

You should be using `age < max_age_for_group` as the condition. `age` is already `now - mtime`, and it is in seconds.
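
The cascade above can also be sketched as a small standalone function, which makes it easy to check on known values. This is only an illustration: `THRESHOLDS`, `classify_age` and `summarize` are hypothetical names, and the sample ages are invented.

```python
from collections import Counter
from datetime import timedelta

# Upper bounds in seconds, checked in order; the labels are illustrative.
THRESHOLDS = [
    ("very_young", timedelta(hours=24).total_seconds()),
    ("young", timedelta(days=7).total_seconds()),
    ("old", timedelta(days=30).total_seconds()),
]

def classify_age(age_seconds):
    """Return the label of the first bucket whose limit is not exceeded."""
    for label, limit in THRESHOLDS:
        if age_seconds <= limit:
            return label
    return "very_old"

def summarize(ages):
    """Count ages per bucket and compute percentages and the average age."""
    counts = Counter(classify_age(a) for a in ages)
    total = len(ages)
    percentages = {k: 100.0 * v / total for k, v in counts.items()} if total else {}
    average = sum(ages) / total if total else 0.0
    return counts, percentages, average

# Example ages in seconds: 1 hour, 3 days, 10 days, 90 days.
sample = [3600, 3 * 86400, 10 * 86400, 90 * 86400]
counts, percentages, average = summarize(sample)
print(counts, percentages, average)
```

Here the average is the sum of the ages divided by the number of files, and each percentage is multiplied by 100. The report in the question prints `age/age_sum` and plain ratios, which is presumably not what was intended.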

Also, `except Exception: pass` will harm debugging. If you must have it, at least use:

except Exception:
    logging.exception('')

This swallows the exception but still logs it, traceback included. You can later turn that output off by raising the level on the root logger.
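
A minimal sketch of that pattern, assuming a hypothetical helper `safe_mtime` and a made-up path that does not exist. It is relevant here because the script as posted in the question never imports `timedelta`, so the `if` line would raise a `NameError` that `except Exception: pass` hides; logging the exception would make that visible.

```python
import logging
import os

def safe_mtime(path):
    """Return the file's modification time, or None if os.stat fails.

    Hypothetical helper: the failure is logged with its traceback
    instead of being silently discarded.
    """
    try:
        return os.stat(path).st_mtime
    except Exception:
        logging.exception("could not stat %r", path)
        return None

# A path that should not exist, used only to trigger the error path.
missing = os.path.join("no_such_dir_for_demo", "missing.txt")
result = safe_mtime(missing)
print(result)

# logging.exception logs at ERROR level; raising the root logger's level
# above ERROR suppresses those messages again.
logging.getLogger().setLevel(logging.CRITICAL)
silenced = safe_mtime(missing)
```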

Dan D.
  • I'm struggling to find the `max_age_for_group`. I'm not used to working with time objects. I tried to follow [link](https://stackoverflow.com/questions/1345827/how-do-i-find-the-time-difference-between-two-datetime-objects-in-python) but I can't make it work – J.erome Dec 18 '19 at 10:24
  • Could you check my edit, please? It seems I did something wrong when comparing timestamps @Dan D. – J.erome Dec 18 '19 at 10:39
  • I fixed it the way you did, but still none of the counters increment – J.erome Dec 18 '19 at 10:57
1

You correctly calculate the age of the file as a time difference using:

now = time.time()
mt = os.stat(os.path.join(root,file)).st_mtime
age = now - mt

Both `now` and `mt` are stored as floats and represent the time in seconds elapsed since January 1st 1970. So their difference `age` is a time difference in seconds!

In order to categorize into the intervals specified, you need to convert this time difference to days, e.g. with:

import datetime
td = datetime.timedelta(seconds=age)
print(td.days)  # prints the whole days in the time delta (integer)

I think this is exactly what you need for your comparison: if the time difference is, e.g., 72000 seconds (= 20 hours), `td.days` evaluates to 0.
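
As a rough sketch of how the day count could drive the categories (the helper names `age_in_days` and `classify_by_days` are hypothetical, not part of the question's script):

```python
import datetime

def age_in_days(age_seconds):
    """Whole days contained in a duration given in seconds."""
    return datetime.timedelta(seconds=age_seconds).days

def classify_by_days(age_seconds):
    """Map an age in seconds to one of the four categories."""
    days = age_in_days(age_seconds)
    if days < 1:
        return "very young"
    elif days < 7:
        return "young"
    elif days < 30:
        return "old"
    return "very old"

print(age_in_days(72000))       # 72000 seconds is 20 hours
print(classify_by_days(72000))
```

Note that `.days` only counts whole days, so with this variant a file that is exactly 24 hours old counts as "young", whereas a comparison in seconds using `<=` would still treat it as "very young".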

AnsFourtyTwo
  • I tried to use `timedelta` in my `if elif elif else`. Can you spot what I did wrong in there? – J.erome Dec 18 '19 at 10:50
  • @Dan D. pointed out what is wrong with your comparisons. You must make sure to compare the same units of time, so convert to either seconds or days. – AnsFourtyTwo Dec 18 '19 at 10:55