I have a folder with CSV files that are generated at one-minute intervals. I want to filter the files that arrived before a particular time (for example, 12:15 PM). My code is below:

    import os
    import pandas as pd
    
    search_dir = r"C:\Users\123\Documents\Folder"
    os.chdir(search_dir)
    files = filter(os.path.isfile, os.listdir(search_dir))
    files = [os.path.join(search_dir, f) for f in files] # add path to each file
    files = files.sort(key=lambda x: os.path.getmtime(x), reverse=True)

Here I have a list of files sorted by last-modified time. How can I filter the files that arrived before a particular time?

Pravat

1 Answer

Have you already checked the answer to "python filter files by modified time"? Your requirement should need only a slight modification of it.

    import os
    import pandas as pd
    from datetime import datetime
    from pathlib import Path

    search_dir = r"C:\Users\123\Documents\Folder"
    os.chdir(search_dir)
    files = filter(os.path.isfile, os.listdir(search_dir))
    files = [os.path.join(search_dir, f) for f in files] # add path to each file

Up to here, your code remains the same. I am not sure why you need to sort the files by time if you are going to filter them afterwards. However, assuming this is a necessary step, I have changed the last line: list.sort() sorts the list in place and returns None, so assigning its result back to files leaves files set to None. Instead I use sorted() together with the pathlib library to sort the files as you wanted. So replace the last line with the following line.

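    # keep regular files only (iterdir() also yields sub-directories), oldest first
    files_sorted = sorted((p for p in Path(search_dir).iterdir() if p.is_file()), key=os.path.getmtime)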
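
To illustrate why the original last line ended up as None, here is a minimal sketch that uses plain integers instead of file paths (the values are made up for illustration):

```python
values = [3, 1, 2]

# list.sort() reorders the list in place and returns None
result = values.sort()

# sorted() leaves its input untouched and returns a new sorted list
descending = sorted([3, 1, 2], reverse=True)
```

So either call files.sort(...) without assigning its result, or use sorted(...) and assign what it returns.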

You haven't specified whether your filter time is provided by the user or taken from a file's timestamp. If it comes from a file, read that file's modification time. For example, here I take the time of the first file in the sorted list.

    particular_time = os.path.getmtime(files_sorted[0])
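
If the cutoff comes from the user instead (for example the 12:15 PM mentioned in the question), one possible sketch is to build a timestamp for that clock time on today's date, so that it can be compared directly with what os.path.getmtime() returns. The helper name cutoff_timestamp and its parameters are made up for illustration, and the time is interpreted in the machine's local time zone.

```python
from datetime import datetime, time

def cutoff_timestamp(hour, minute, day=None):
    """Hypothetical helper: POSIX timestamp for hour:minute (local time) on `day`.

    `day` is a datetime.date and defaults to today.
    """
    day = day or datetime.now().date()
    # a naive datetime is treated as local time by .timestamp()
    return datetime.combine(day, time(hour, minute)).timestamp()

# 12:15 PM today, as seconds since the epoch (same unit as os.path.getmtime)
particular_time = cutoff_timestamp(12, 15)
```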

Next, assuming that you want to remove all the files whose modification time is earlier than the particular time (again, you didn't say clearly what you want here), do the following:

    # iterate over a copy: removing items from a list while looping over it skips elements
    for f in list(files_sorted):
        tLog = os.path.getmtime(f)
        print("checking ", f, datetime.fromtimestamp(tLog))

        if particular_time > tLog:
            print("filter out the files", f)
            files_sorted.remove(f)
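
The same filtering can also be sketched as a list comprehension, which builds a new list instead of removing items from the existing one. The version below keeps the files modified before the cutoff, which is my reading of the question; the function name modified_before is hypothetical. Flip the comparison to >= to drop those files instead, as the loop above does.

```python
import os

def modified_before(paths, cutoff):
    """Hypothetical helper: keep the paths last modified earlier than `cutoff`.

    `cutoff` is a POSIX timestamp, the same unit os.path.getmtime() returns.
    """
    return [p for p in paths if os.path.getmtime(p) < cutoff]
```

With the names used above, this would be called as modified_before(files_sorted, particular_time).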
Shaan
  • Yes, I have used that, but I am getting an error: TypeError: '>' not supported between instances of 'str' and 'float'. Any help using a lambda? – Pravat Mar 24 '21 at 11:24
  • Hi, I didn't check your code properly. Remove the last line from your code and update the modifications in mine. I'll update accordingly. – Shaan Mar 24 '21 at 12:08
  • Is there any way we can do this using a list comprehension? – Pravat Mar 24 '21 at 15:14
  • The edited answer above should work fine. You haven't been very clear about what exactly you want, so I have implemented the code under certain assumptions. As for list comprehensions, I don't think they are anything special beyond putting a loop on one line. You are free to try that out. Hopefully this meets your requirement. Please mark the question as answered if this does the job, or feel free to create another, more specific question. – Shaan Mar 24 '21 at 17:34