I have a folder where my Python program stores the text files it generates (data in CSV format). I want to read the 3 most recently modified files whose names start with LogFile_Date into a pandas DataFrame. I am using Windows and Python 3.
Viewed 1,232 times
0
- Where should the last modified time be stored? As a column with the time in each row? And please show what you have tried. – ksbg Aug 20 '18 at 09:05
- No need to store the modified time. I just want to read the data of those files into a pandas DataFrame. I don't know how to select the files with the latest modified time. – Pravat Aug 20 '18 at 09:21
- Pravat, do you want the modified time because it is included in the file names? For example, is each file named LogFile_ followed by its modified time? – Inder Aug 20 '18 at 09:32
2 Answers
2
This question helped: "How do you get a directory listing sorted by creation date in python?" I think this is what you want:
import os
import pandas as pd

search_dir = r"C:\mydir"
# Full paths of the regular files in search_dir
files = [os.path.join(search_dir, f) for f in os.listdir(search_dir)]
files = [f for f in files if os.path.isfile(f)]
# Newest first, by last-modified time
files.sort(key=os.path.getmtime, reverse=True)
# Read the three most recently modified files into a list of DataFrames
dfs = [pd.read_csv(f, delimiter=',') for f in files[:3]]

Mateo Rod
- Sir, I am getting the error: 'NoneType' object has no attribute 'append'. Also, I want to read only those files whose names start with "LogFile_". – Pravat Aug 20 '18 at 10:04
- Sorry, I had a mistake in my code. If you only have "LogFile_" files in your directory, I think this will work. – Mateo Rod Aug 20 '18 at 10:32
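If the directory also holds other files, the prefix check can be made explicit instead of relying on the folder contents. A minimal sketch, assuming the files only need to match on the start of their name; `newest_logfiles` and its parameters are hypothetical names introduced for illustration, not part of the code above:

```python
from pathlib import Path


def newest_logfiles(directory, prefix="LogFile_", count=3):
    """Return up to `count` files in `directory` whose names start with
    `prefix`, ordered from most to least recently modified.

    Hypothetical helper: the name and defaults are illustrative assumptions.
    """
    # Keep regular files only, and only those with the wanted prefix
    candidates = [
        p for p in Path(directory).iterdir()
        if p.is_file() and p.name.startswith(prefix)
    ]
    # Newest first, by last-modified time
    candidates.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return candidates[:count]
```

Each returned path can then be passed to `pd.read_csv`, for example `dfs = [pd.read_csv(p) for p in newest_logfiles(search_dir)]`.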
0
import os
import pandas as pd

search_dir = r"C:\Users\123\Documents\Folder"
# Full paths of the regular files in search_dir
files = [os.path.join(search_dir, f) for f in os.listdir(search_dir)]
files = [f for f in files if os.path.isfile(f)]
# Newest first, by last-modified time
files.sort(key=os.path.getmtime, reverse=True)
# Read the first three columns of the two newest files.
# pd.concat replaces DataFrame.append, which was removed in pandas 2.0.
frames = [pd.read_csv(f, delimiter=',', header=None, usecols=[0, 1, 2],
                      names=['colA', 'colB', 'colC']) for f in files[:2]]
dfs = pd.concat(frames, ignore_index=True)
print(dfs)

Pravat