I have a few xlsx files created today. How do I get pandas to read the latest one and convert to csv (while keeping the file name)?

milanbalazs

2 Answers

Use:

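# Based on https://stackoverflow.com/a/39327156
import glob
import os
import pandas as pd

# Match only the Excel files, then take the one with the newest ctime
list_of_files = glob.glob('/path/to/folder/*.xlsx')
latest_file = max(list_of_files, key=os.path.getctime)
df = pd.read_excel(latest_file)

# Keep the file name, replacing only the final extension with .csv
csv_name = os.path.splitext(os.path.basename(latest_file))[0] + '.csv'
df.to_csv(csv_name, index=False)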
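
For the file-name step, here is a minimal sketch of one way to derive the CSV name so that only the final extension is swapped, which matters when a name contains extra dots. The helper name csv_name_for and the sample paths are hypothetical, chosen for illustration only.

```python
from pathlib import Path


def csv_name_for(source_path):
    """Return the source file's name with its final extension replaced by .csv."""
    # Path.name drops the directory; with_suffix swaps only the last extension.
    return Path(source_path).with_suffix(".csv").name


# Hypothetical file names, for illustration only.
print(csv_name_for("/path/to/folder/report.xlsx"))          # report.csv
print(csv_name_for("/path/to/folder/report.2024.v2.xlsx"))  # report.2024.v2.csv
```

The returned string can be passed directly to df.to_csv(...), which writes the file into the current working directory.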
jezrael

An answer has already been posted, but I would like to add a note on the difference between getmtime and getctime.

On Unix, ctime changes when the file's ownership or permissions change, as well as when the data in the file changes. mtime changes only when the data in the file changes. On Windows, os.path.getctime returns the file's creation time instead.
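
The difference can be observed with a small experiment. The sketch below is an illustration only, not part of the original answer: it creates a temporary file, changes its permissions without touching the contents, and returns both timestamps from before and after. The function name timestamps_around_chmod is hypothetical, and whether ctime visibly moves depends on the platform and the filesystem's timestamp resolution.

```python
import os
import stat
import tempfile
import time


def timestamps_around_chmod():
    """Create a temp file, change its permissions, and report both timestamps."""
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        handle.write(b"10,20,30\n")
        path = handle.name
    try:
        before = os.stat(path)
        time.sleep(0.05)  # give the clock a chance to advance
        os.chmod(path, stat.S_IRUSR)  # metadata change only; contents untouched
        after = os.stat(path)
        return (before.st_mtime, after.st_mtime, before.st_ctime, after.st_ctime)
    finally:
        # Restore write permission so the file can be removed on every platform.
        os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
        os.remove(path)


m_before, m_after, c_before, c_after = timestamps_around_chmod()
print("mtime changed:", m_before != m_after)
print("ctime changed:", c_before != c_after)
```

On a typical Unix system the first line should report no change, because a permission change leaves the file's data untouched.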

>>> import pandas as pd
>>> import os
>>> import glob

Result:

>>> latest_file = max(glob.glob('/home/karn/*.csv'), key=os.path.getmtime)
>>> latest_file
'/home/karn/test_new_2.csv'
>>> df = pd.read_csv(latest_file)
>>> df
   10  20  30
0  40  50  60

So, if you want to pick the file whose contents changed most recently, use getmtime rather than getctime.

Note that glob.glob does not match files whose names start with a dot (.) unless the pattern itself starts with a dot.
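
If hidden files need to be considered as well, one possible workaround is to list the directory directly instead of globbing. This is a sketch under that assumption; the helper name latest_file_including_hidden and its suffix parameter are hypothetical.

```python
import os


def latest_file_including_hidden(folder, suffix=".csv"):
    """Return the most recently modified file in folder with the given suffix.

    os.scandir lists every entry, so names starting with a dot are included.
    """
    candidates = [
        entry.path
        for entry in os.scandir(folder)
        if entry.is_file() and entry.name.endswith(suffix)
    ]
    if not candidates:
        raise FileNotFoundError(f"no {suffix} files in {folder}")
    # getmtime picks the file whose contents changed last.
    return max(candidates, key=os.path.getmtime)
```

Calling latest_file_including_hidden('/home/karn') would then also consider a file such as .test.csv. On Python 3.11 and later, glob.glob also accepts an include_hidden argument that may serve the same purpose.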

Karn Kumar