0

I have multiple .xlsx files placed in a folder. How can I read the latest file having today's date (modified date) in PYTHON? and further store the file name in a variable.

  • Does this not solve your problem? https://stackoverflow.com/questions/237079/how-to-get-file-creation-modification-date-times-in-python – Andrew-Harelson Oct 01 '21 at 06:55
  • Not really. This is my code where I am hard coding the excel name manually: import pyodbc as odbc import pandas as pd import yaml as yl df = pd.read_excel(r'd:\Automate\2021927893014613.xlsx',sheet_name=None) mdf = pd.concat(df, axis=0, ignore_index=True) – Mariam Mehdi Oct 01 '21 at 07:04

3 Answers3

1
import os

# list all .xlsx files in absolute directory
files = (os.path.abspath(file) for file in os.listdir('/path/to/PYTHON') if file.endswith('.xlsx'))

# get their last updated time
files_and_updated_time = ((file, os.path.getmtime(file)) for file in files)

# sort out the lastest updated xlsx
last_updated_xlsx = sorted(files_and_updated_time, key=lambda x: x[1], reverse=True)

# check if this said xlsx exists
# if so, store its absolute path in `result`
if last_updated_xlsx:
    result = last_updated_xlsx[0][0]
else:
    result = None
Brandon
  • 708
  • 6
  • 13
  • what is this parameter that you are trying to pass in os.listdir? os.listdir('/path/to/PYTHON') – Mariam Mehdi Oct 01 '21 at 07:11
  • I mean, it is your "PYTHON" directory, but in absolute path. @MariamMehdi – Brandon Oct 01 '21 at 07:16
  • Oh sorry, I misunderstood. Just put in your folder's absolute path :D – Brandon Oct 01 '21 at 07:17
  • Getting this error: FileNotFoundError: [WinError 2] The system cannot find the file specified: 'C:\\Users\\mariam\\2021927893014613.xlsx' – Mariam Mehdi Oct 01 '21 at 07:26
  • Why is it reading file in C folder when I have mentioned my folder's path: files = (os.path.abspath(file) for file in os.listdir('D:\Automate') if file.endswith('.xlsx')) # get their last updated time files_and_updated_time = ((file, os.path.getmtime(file)) for file in files) # sort out the lastest updated xlsx last_updated_xlsx = sorted(files_and_updated_time, key=lambda x: x[1], reverse=True) if last_updated_xlsx: result = last_updated_xlsx[0][0] else: result = None df = pd.read_excel(result,sheet_name=None) – Mariam Mehdi Oct 01 '21 at 07:27
  • Could you show me what you get when running `[os.path.abspath(file) for file in os.listdir('/path/to/PYTHON') if file.endswith('.xlsx')]` ? I'm not sure if different os did something different. – Brandon Oct 01 '21 at 07:41
  • The equivalent one liner should be `result = sorted([(os.path.abspath(file), os.path.getmtime(file)) for file in os.listdir('/path/to/directory') if file.endswith('.xlsx')], key=lambda x: x[1], reverse=True)[0][0]`, without checking existence of that file. – Brandon Oct 01 '21 at 07:45
1
from pathlib import Path

# Save all .xlsx files paths and modification time into paths
paths = [(p.stat().st_mtime, p) for p in Path("path/to/folder").iterdir() if p.suffix == ".xlsx"]

# Sort them by the modification time
paths = sorted(paths, key=lambda x: x[0], reverse=True)

# Get the last modified file
last = paths[0][1]

Note that last is of type Path. If you want it as a string you can change the last line to

last = str(paths[0][1])
Thomas Q
  • 850
  • 4
  • 10
  • Hey, how can I get the file name only that is being stored in variable 'last'? Last variable contains complete file path, what if i only want to extract file name and store it in a variable? – Mariam Mehdi Oct 01 '21 at 13:10
  • then you need to change the last line to `last = paths[0][1].name` – Thomas Q Oct 01 '21 at 23:03
0

Not sure what OS are you on, but I have the solution for Linux/Raspberry Pi. Maybe only a little modification is needed for this solution to be implemented on Windows.

Importing libraries:

  • To get today's date, use the datetime library.
  • To find a file in a specified directory or folder, use the os library.

Now, for the coding part:

import os
from datetime import datetime

# This gets today's date in string with the format of 2021-12-31.
today = datetime.today().strftime('%Y-%m-%d')

# Assume your filename is only today's date and the extension at the end.
path = '/home/pi/' + today + '.xlsx'

# If there's a same filename found in this path.
if os.path.exists(path):
    print("File found")

    # Gets the filename from the path and store it in the 'filename' variable.
    filename = os.path.basename('/home/pi/' + path + '.xlsx')
    
else:
    print("File not found")
Shawn Khoo
  • 21
  • 5