0

I have a folder containing text files (data in CSV format) generated by my Python program. I want to read the 3 most recently modified files whose names start with LogFile_Date into a pandas DataFrame. I am using Windows and Python 3.

Pravat
  • Where should the last modified time be stored? As a column with the time in each row? And please show what you have tried. – ksbg Aug 20 '18 at 09:05
  • No need to store the modified time. I just want to read the data from those files into a pandas DataFrame. I don't know how to pick the files with the latest modified time. – Pravat Aug 20 '18 at 09:21
  • Pravat, do you need the modified time because it is included in the file names? That is, is each file named like LogFile_Modified time? – Inder Aug 20 '18 at 09:32

2 Answers

2

Based on "How do you get a directory listing sorted by creation date in python?", I think this is what you want:

import os
import pandas as pd

search_dir = r"C:\mydir"
# Build the full path of every entry, then keep regular files only
files = [os.path.join(search_dir, f) for f in os.listdir(search_dir)]
files = [f for f in files if os.path.isfile(f)]
# Sort by last-modified time, newest first
files.sort(key=os.path.getmtime, reverse=True)
# Read the three most recently modified files (fewer if the folder holds fewer)
dfs = [pd.read_csv(f, delimiter=',') for f in files[:3]]
Mateo Rod
  • Sir, I am getting this error: 'NoneType' object has no attribute 'append'. Also, I want to read only those files whose names start with "LogFile_". – Pravat Aug 20 '18 at 10:04
  • Sorry, I had a mistake in my code. If your directory contains only "LogFile_" files, I think this will work. – Mateo Rod Aug 20 '18 at 10:32
0
import os
import pandas as pd

search_dir = r"C:\Users\123\Documents\Folder"
# Full paths of the regular files in the folder
files = [os.path.join(search_dir, f) for f in os.listdir(search_dir)]
files = [f for f in files if os.path.isfile(f)]
# Sort by last-modified time, newest first
files.sort(key=os.path.getmtime, reverse=True)
# DataFrame.append was removed in pandas 2.0, so collect the frames and concatenate once
frames = [pd.read_csv(f, delimiter=',', header=None, usecols=[0, 1, 2],
                      names=['colA', 'colB', 'colC'])
          for f in files[:2]]
dfs = pd.concat(frames, ignore_index=True)  # also gives a fresh 0..n-1 index
print(dfs)
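
The code above picks the newest files in the folder regardless of their names. If the folder may also hold files that do not start with "LogFile_", the selection step could be sketched roughly as follows. The helper name newest_files and its parameters are hypothetical, chosen only for illustration, and the sketch covers the file selection only, not the reading:

```python
from pathlib import Path


def newest_files(folder, prefix="LogFile_", count=3):
    """Return up to `count` paths in `folder` whose names start with `prefix`,
    ordered from most to least recently modified.

    Hypothetical helper: the name and default values are assumptions.
    """
    candidates = [
        p for p in Path(folder).iterdir()
        if p.is_file() and p.name.startswith(prefix)
    ]
    # st_mtime is the last-modified timestamp; sort newest first.
    candidates.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    # Slicing never raises, even when fewer than `count` files match.
    return candidates[:count]
```

Each returned path can be passed straight to pd.read_csv, and the resulting frames combined with pd.concat(..., ignore_index=True) as in the code above.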
Pravat