I have a few xlsx files created today. How do I get pandas to read the latest one and convert to csv (while keeping the file name)?
2 Answers
1
Use:
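# https://stackoverflow.com/a/39327156
import glob
import os
import pandas as pd
# Consider only the Excel files and pick the most recently changed one
list_of_files = glob.glob('/path/to/folder/*.xlsx')
latest_file = max(list_of_files, key=os.path.getctime)
df = pd.read_excel(latest_file)
# splitext drops only the final extension, so 'report.v2.xlsx' becomes 'report.v2.csv'
df.to_csv(os.path.splitext(os.path.basename(latest_file))[0] + '.csv', index=False)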
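The same steps can also be wrapped in small functions. This is only a sketch using `pathlib`; the helper names `latest_xlsx`, `csv_path_for` and `convert_latest` are made up for illustration. It assumes you want the CSV written next to the workbook rather than into the current working directory, and it sorts by `st_mtime` (swap in `st_ctime` to match the snippet above).

```python
from pathlib import Path


def latest_xlsx(folder):
    """Return the most recently modified .xlsx file in folder, or None if there is none."""
    candidates = list(Path(folder).glob("*.xlsx"))
    # max() raises ValueError on an empty sequence unless a default is given
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


def csv_path_for(xlsx_path):
    """Same folder and base name, with only the final extension swapped for .csv."""
    return Path(xlsx_path).with_suffix(".csv")


def convert_latest(folder):
    """Convert the newest workbook in folder to CSV and return the CSV path."""
    import pandas as pd  # imported here so the two helpers above work without pandas

    source = latest_xlsx(folder)
    if source is None:
        raise FileNotFoundError(f"no .xlsx files in {folder}")
    target = csv_path_for(source)
    pd.read_excel(source).to_csv(target, index=False)
    return target
```

Calling `convert_latest('/path/to/folder')` should then leave a `.csv` with the same base name beside the newest workbook.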

jezrael
0
Although an answer has already been proposed, I would like to add a note on getmtime versus getctime. On Unix, ctime changes when the file's ownership or permissions change, as well as when the data in the file changes. mtime changes only when the data in the file changes.
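To see the difference in practice, here is a small sketch. The file name `report.csv` and the one-hour offset are just illustrative assumptions. It backdates the modification time with `os.utime`; that call is itself a metadata change, so on Unix the ctime moves to "now" while the mtime stays where it was put. (On Windows, `os.path.getctime` returns the creation time instead, which here is also "now".)

```python
import os
import tempfile
import time

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "report.csv")  # hypothetical file name
    with open(path, "w") as handle:
        handle.write("10,20,30\n")

    # Pretend the data was last written an hour ago (sets atime and mtime).
    an_hour_ago = time.time() - 3600
    os.utime(path, (an_hour_ago, an_hour_ago))

    mtime = os.path.getmtime(path)  # reflects the backdated data change
    ctime = os.path.getctime(path)  # reflects the metadata change made just now

    print("mtime is older than ctime:", mtime < ctime)
```

So a file whose contents are old can still look "newest" to `getctime` if something touched its metadata recently.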
>>> import pandas as pd
>>> import os
>>> import glob
Result:
>>> latest_file = max(glob.glob('/home/karn/*.csv'), key=os.path.getmtime)
>>> latest_file
'/home/karn/test_new_2.csv'
>>> df = pd.read_csv(latest_file)
>>> df
   10  20  30
0  40  50  60
So, if you want to pick the file whose data changed most recently, I would use getmtime over getctime.
Note that glob.glob has a limitation: a pattern such as * does not match files whose names start with a dot (.).
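If dotfiles matter to you, one way around that limitation is to list the directory yourself instead of globbing. A minimal sketch, assuming a made-up helper name `newest_by_mtime` and a simple suffix check in place of a glob pattern:

```python
import os


def newest_by_mtime(folder, suffix=".csv"):
    """Return the path of the most recently modified file ending in suffix, or None."""
    # os.scandir lists every entry, including names that start with a dot
    with os.scandir(folder) as entries:
        paths = [e.path for e in entries if e.is_file() and e.name.endswith(suffix)]
    # default=None avoids a ValueError when nothing matches
    return max(paths, key=os.path.getmtime, default=None)
```

For example, `newest_by_mtime('/home/karn')` would also consider a file like `.hidden.csv`, which `glob.glob('/home/karn/*.csv')` skips.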

Karn Kumar