**Problem** I'm trying to open (in Python) files whose names contain a date stamp more than 3 days old. Example: 2016_08_18_23_10_00 - JPN - MLB - Mickeymouse v Burgerface.ply. So far I can create a date variable, but I do not know how to search for it in a filename. I presume I need to convert it to a string first?

from datetime import datetime, timedelta
import os
import re
path = "C:\Users\michael.lawton\Desktop\Housekeeper"

## create variable d where current date time is subtracted by 3 days ##

days_to_subtract = 3
d = datetime.today() - timedelta(days=days_to_subtract)

print d

## open file in dir where date in filename = d or older ##

for filename in os.listdir(path):
    if re.match(d, filename):
        with open(os.path.join(path, filename), 'r') as f:
            print line,

Any help will be much appreciated

user6705306
  • Side-note: Use raw strings for Windows paths (and regular expressions for that matter). It didn't bite you this time (you got lucky), but it will, eventually, when you have a path where a file or directory in it has a name starting with, for example `b`, `f`, `n`, etc. (getting you embedded backspace, form feed or newline, respectively). Just put an `r` in front of the literal (and don't end it with a backslash), e.g. `path = r"C:\Users\michael.lawton\Desktop\Housekeeper"` and this can't happen by accident. – ShadowRanger Sep 21 '16 at 10:53
  • Thanks for that, duly noted! – user6705306 Sep 21 '16 at 13:31
  • if you want to open files that are 3 days older and not a minute younger then you have to take into account the local timezone. See [Find if 24 hrs have passed between datetimes - Python](http://stackoverflow.com/q/26313520/4279) – jfs Sep 21 '16 at 17:05
  • @ShadowRanger: a nitpick: there are no "raw strings" in memory; there are only "raw string literals" in the source code. Otherwise, it is a good recommendation to use `r''` for Windows paths (to avoid escaping the backslashes). – jfs Sep 21 '16 at 17:05

2 Answers

You can use strptime for this. It will convert your string (assuming it is correctly formatted) into a datetime object which you can use to compare if your file is older than 3 days based on the filename:

from datetime import datetime, timedelta

...

# anything stamped before this moment counts as "older than 3 days"
cutoff = datetime.now() - timedelta(days=days_to_subtract)
lines = []
for filename in os.listdir(path):
  # the timestamp is everything before the first space in the filename
  date_filename = datetime.strptime(filename.split(" ")[0], '%Y_%m_%d_%H_%M_%S')
  if date_filename < cutoff:
    with open(os.path.join(path, filename), 'r') as f:
      lines.extend(f.readlines()) # put all lines into one list

If the filename is 2016_08_18_23_10_00 - JPN - MLB - Mickeymouse v Burgerface.ply, the datetime part is extracted with filename.split(" ")[0]. We can then check whether it is older than three days using datetime.timedelta.
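As a minimal sketch of the same idea, the check can be wrapped in a small function that also skips names without a parseable timestamp (strptime raises ValueError for those). The function name `is_older_than`, the constant `STAMP_FORMAT`, and the `now` parameter are illustrative assumptions, not part of the original answer; `now` is only there so the comparison can be tried against a fixed date.

```python
from datetime import datetime, timedelta

# Assumed layout of the timestamp prefix, taken from the example filename.
STAMP_FORMAT = '%Y_%m_%d_%H_%M_%S'

def is_older_than(filename, days, now=None):
    """Return True if the timestamp prefix of filename is more than `days` days old."""
    now = now or datetime.now()
    stamp = filename.split(" ")[0]  # everything before the first space
    try:
        file_date = datetime.strptime(stamp, STAMP_FORMAT)
    except ValueError:
        return False  # the name does not start with a timestamp in the expected format
    return file_date < now - timedelta(days=days)

# Fixed reference date so the result does not depend on when this is run.
now = datetime(2016, 9, 21, 12, 0, 0)
name = "2016_08_18_23_10_00 - JPN - MLB - Mickeymouse v Burgerface.ply"
print(is_older_than(name, 3, now))         # True
print(is_older_than("notes.txt", 3, now))  # False
```

In the loop over os.listdir(path), a call such as `is_older_than(filename, days_to_subtract)` could then replace the inline strptime comparison.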

Linus
  • Thanks. What is the integer division for at the end? `// open file` – user6705306 Sep 21 '16 at 09:28
  • Sorry, that is supposed to be a comment. I'm used to coding in C, where // means a comment :) – Linus Sep 21 '16 at 09:31
  • Can't read the file, need buffer, however I don't understand buffer, could you explain? TypeError: coercing to Unicode: need string or buffer, datetime.datetime found >>> – user6705306 Sep 21 '16 at 10:28
  • `for filename in os.listdir(path): date_filename = datetime.datetime.strptime(filename.split(" ")[0], '%Y_%m_%d_%H_%M_%S') if date_filename < datetime.datetime.now()-datetime.timedelta(days=days_to_subtract): #this works# file = open(date_filename,'r') print file.read()` – user6705306 Sep 21 '16 at 10:30
  • You are trying to read from the `date_filename` variable (which is a datetime object); you really want to read from the `filename` variable (see my edited answer). – Linus Sep 21 '16 at 10:48
  • Ah, thank you, I see it now, silly mistake. I have one issue left which I don't understand at all - ValueError: mode string must begin with one of 'r', 'w', 'a' or 'U'. `str(filename) file = open(path,filename) print file.read()` – user6705306 Sep 21 '16 at 11:36
  • Got it, I used os.path.join(path,filename)) Thank you for all your help! – user6705306 Sep 21 '16 at 12:30
  • @user6705306 glad to help. Please mark my answer as the best answer if it solved your problem. That way others can know your problem was solved. – Linus Sep 21 '16 at 12:37
  • Cheers and marked. I've run into one more problem. I don't understand this fully yet but need to solve it quickly: how do I put all files within 'path' whose naming convention meets the `datetime.datetime.now()-datetime.timedelta(days=days_to_subtract)` condition into the variable `filename`? – user6705306 Sep 21 '16 at 12:58
  • @user6705306 Do you want to put all the matching files into an array? – Linus Sep 21 '16 at 13:06
  • @user6705306 see my edited answer. It adds all the lines to a single array using [extend](https://www.tutorialspoint.com/python/list_extend.htm). If you need any more help, I advise putting up a new question appropriate for that. – Linus Sep 21 '16 at 16:58

To open all files in the given directory that contain a timestamp in their name older than 3 days:

#!/usr/bin/env python2
import os
import time

DAY = 86400 # POSIX day in seconds
three_days_ago = time.time() - 3 * DAY
for filename in os.listdir(dirpath):
    time_string = filename.partition(" ")[0]
    try:
        timestamp = time.mktime(time.strptime(time_string, '%Y_%m_%d_%H_%M_%S'))
    except Exception: # can't get timestamp
        continue
    if timestamp < three_days_ago: # old enough to open
        with open(os.path.join(dirpath, filename)) as file: # assume it is a file
            for line in file:
                print line,

The code assumes that the timestamps are in the local timezone. It may take DST transitions into account on platforms where C mktime() has access to the tz database (if it doesn't matter whether the file is 72 or 73 hours old in your case then just ignore this paragraph).

Consider using file metadata such as "the last modification time of a file" instead of extracting the timestamp from its name: timestamp = os.path.getmtime(path).
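A rough sketch of the metadata-based variant follows, assuming the last modification time is an acceptable stand-in for the timestamp in the name (the two can differ). The helper name `files_modified_before` and its `now` parameter are hypothetical and introduced only for illustration.

```python
import os
import time

DAY = 86400  # POSIX day in seconds

def files_modified_before(dirpath, days, now=None):
    """Return names of regular files in dirpath last modified more than `days` days ago."""
    now = time.time() if now is None else now
    cutoff = now - days * DAY
    old = []
    for filename in sorted(os.listdir(dirpath)):
        full_path = os.path.join(dirpath, filename)
        # skip subdirectories; compare the modification time with the cutoff
        if os.path.isfile(full_path) and os.path.getmtime(full_path) < cutoff:
            old.append(filename)
    return old
```

Calling `files_modified_before(dirpath, 3)` would give the list of names to open with os.path.join(dirpath, filename), in the same way as the loop above.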

jfs
  • Thank you both. I have gone with the first answer because the date stamps and the modified dates are different. Really appreciate the help. – user6705306 Sep 22 '16 at 09:54