I am trying to return a list of filepaths whose creation dates fall within a given range. I am relatively new to Python and almost exclusively use it within ArcMap, so I'm a little confused. I also don't have access, on my work computer, to some of the modules that look like they would help with this.

I have gotten as far as listing all filepaths in the folder, and sorting them by date created, with help from this page.


import os, stat, time

path = r"\\my\ImageData"
 
filepaths = [os.path.join(path, file) for file in os.listdir(path)]

print(filepaths)
 
files_sorted_by_date = []

file_statuses = [(os.stat(filepath), filepath) for filepath in filepaths]

files = ((status[stat.ST_CTIME], filepath) for status, filepath in file_statuses if stat.S_ISREG(status[stat.ST_MODE]))


for creation_time, filepath in sorted(files):
    creation_date = time.ctime(creation_time)
    files_sorted_by_date.append(creation_date + " " + filepath)

print(files_sorted_by_date)

What steps should I take to list only the filepaths whose "date created" falls within a date range I provide?

Also, my file paths are listed with double the \ characters (4 at the beginning and 2 between every folder), so they can't be pasted directly into Windows Explorer to find my files. They will eventually act as hyperlinks, so they need to be correct. I could do a find and replace to change \\ to \, but I wonder if I am doing something wrong from the start to cause this.
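The doubled backslashes are most likely only a display effect. Printing a whole list shows the repr() of each string, in which every backslash is written as \\, while the string itself still contains single backslashes. A small sketch of the difference, using a made-up path for illustration:

```python
# Hypothetical UNC path, for illustration only
filepath = r"\\my\ImageData\photo.jpg"

# Converting (or printing) a list uses repr() on each element,
# so every backslash is shown escaped as two characters
as_list = str([filepath])

# Converting (or printing) the string itself shows the real characters
as_text = str(filepath)

print(as_list)
print(as_text)

# The string really holds 4 backslashes; the list display shows 8
print(as_text.count("\\"))
print(as_list.count("\\"))
```

If that is what is happening here, printing each path on its own (for example with "\n".join(...)) should give text that can be pasted into Windows Explorer without any find and replace.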

Edit:

I am attempting to use os.walk() to search all files within subfolders in my directory.

import os, stat, time
from datetime import datetime

path = r"\\my\imagefolder"

filepaths = []

for subdir, dirs, files in os.walk(path):
    for file in files:
        filepath = os.path.join(subdir, file)
        filepaths.append(filepath)
        
print(filepaths)

#above is my attempt to search within subfolders, below is code I used from @Deo's comment

#datetime(year, month, day, hour, minute, second, microsecond)
range_start = datetime.date(datetime(2020, 3, 19))   #19th March 2020 to..
range_end = datetime.date(datetime(2021, 4, 19))    #19th April 2021

#get path only if its a file
filepaths = [os.path.join(path, file) for file in os.listdir(path) if os.path.isfile(file)]

#filter again if creation time is between the above range
filepaths = [paths for paths in filepaths if os.path.getctime(paths) > range_start and os.path.getctime(paths) < range_end]

print("\n".join(sorted(filepaths)))

print(filepaths)

The first print statement, after my for loop using os.walk(), returns every file path in every subfolder of path. The second-to-last print at the end of the code prints nothing, and the last one prints an empty list. I think the two ways I am handling the filepaths list are incompatible, and at some point the list is emptied.
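One likely cause of the empty list: the "get path only if its a file" line rebuilds filepaths from os.listdir(path), which discards the os.walk() results, and os.path.isfile(file) receives only the bare file name. A bare name is resolved against the current working directory rather than against path, so the check usually fails. A minimal sketch using a temporary folder (the folder and file names are made up for illustration):

```python
import os
import tempfile

# Create a throwaway folder containing one file (hypothetical names)
folder = tempfile.mkdtemp()
name = "example_image_for_isfile_demo.txt"
with open(os.path.join(folder, name), "w") as handle:
    handle.write("demo")

# Bare name: looked up relative to the current working directory
bare_matches = [f for f in os.listdir(folder) if os.path.isfile(f)]

# Full path: looked up inside the folder that was actually listed
full_matches = [f for f in os.listdir(folder)
                if os.path.isfile(os.path.join(folder, f))]

print(bare_matches)
print(full_matches)
```

The paths collected by os.walk() come from its files list, so they would not need the isfile() check at all.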

If I remove the "get path only if its a file" line it returns the error TypeError: can't compare datetime.date to float.
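That error arises because os.path.getctime() returns a float (seconds since the epoch), while range_start and range_end are datetime.date objects. One way to make the two comparable, which should also work on Python 2.7, is to convert the float with datetime.fromtimestamp() and take its date(). A sketch, where in_range is a hypothetical helper and the sample timestamp is built from a made-up date:

```python
import time
from datetime import date, datetime

range_start = date(2020, 3, 19)   # 19th March 2020
range_end = date(2021, 4, 19)     # 19th April 2021

def in_range(ctime_seconds):
    """Return True if a getctime()-style float falls inside the date range."""
    created = datetime.fromtimestamp(ctime_seconds).date()
    return range_start <= created <= range_end

# Stand-in for os.path.getctime(filepath): noon on 1st July 2020, local time
sample = time.mktime(datetime(2020, 7, 1, 12, 0, 0).timetuple())
print(in_range(sample))
```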

I have confirmed that files do exist in this date range.

ElizaC

1 Answer

import os
from datetime import datetime

#raw string, so the UNC prefix keeps both leading backslashes
path = r"\\my\ImageData"

#datetime(year, month, day, hour, minute, second, microsecond)
#note: datetime.timestamp() requires Python 3.3 or later
range_start = datetime.timestamp(datetime(2020, 7, 1, 23, 55, 59, 0))   #1st Jul 2020 11:55:59PM to..
range_end = datetime.timestamp(datetime(2021, 8, 10, 23, 55, 59, 0))    #10th Aug 2021 11:55:59PM

#build the full path of every entry in the folder
filepaths = [os.path.join(path, file) for file in os.listdir(path)]

#get path only if it's a file; isfile() needs the full path, not the bare file name
filepaths = [paths for paths in filepaths if os.path.isfile(paths)]

#filter again if creation time is between the above range
filepaths = [paths for paths in filepaths if range_start < os.path.getctime(paths) < range_end]

print("\n".join(sorted(filepaths)))
Dharman
Deo
  • I got the error AttributeError: type object 'datetime.datetime' has no attribute 'timestamp' . Looked it up and apparently timestamp was added in Python 3.3. Unfortunately I'm using 2.7 – ElizaC Jul 28 '21 at 12:48
  • You don't necessarily need to convert to timestamp, you can convert it to any format that you can compare the range and creation dates – Deo Jul 28 '21 at 12:54
  • Okay, I see that I can just use date and it should work. How would I edit this script to make sure it searches within the subfolders in the path? This returned an empty list and I think it is because it's not searching in subfolders – ElizaC Jul 28 '21 at 13:23
  • You'll have to recursively search the subfolders. See this https://stackoverflow.com/questions/5817209/browse-files-and-subfolders-in-python – Deo Jul 28 '21 at 14:34
  • I'm trying to use os.walk() but it's breaking down somewhere. Edited my original post to explain my progress – ElizaC Jul 28 '21 at 15:45
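Putting the points from this thread together, here is a minimal sketch of a recursive search that filters by creation date. It avoids datetime.timestamp(), so it should run on Python 2.7 as well as 3. The function name files_created_between and the temporary demo folder are made up for illustration. Note that os.path.getctime() reports the creation time on Windows, but on Unix-like systems it reports the last metadata change instead.

```python
import os
import tempfile
from datetime import date, datetime, timedelta

def files_created_between(root, start, end):
    """Return sorted paths under root whose creation date is in [start, end]."""
    matches = []
    for subdir, dirs, files in os.walk(root):
        for name in files:
            filepath = os.path.join(subdir, name)
            # getctime() returns seconds since the epoch as a float;
            # convert it to a date so it can be compared with start and end
            created = datetime.fromtimestamp(os.path.getctime(filepath)).date()
            if start <= created <= end:
                matches.append(filepath)
    return matches if not matches else sorted(matches)

# Demo on a throwaway folder with one file inside a subfolder
demo_root = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_root, "sub"))
with open(os.path.join(demo_root, "sub", "a.txt"), "w") as handle:
    handle.write("demo")

today = date.today()
found = files_created_between(demo_root,
                              today - timedelta(days=1),
                              today + timedelta(days=1))

# Printing one path per line avoids the doubled backslashes of a list display
print("\n".join(found))
```

For the folder in the question, the call would presumably look like files_created_between(r"\\my\imagefolder", date(2020, 3, 19), date(2021, 4, 19)).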