-1

I need to process only the latest version of a file when a folder contains multiple versions of it. Files that exist in a single version should all be processed.

Example:

File1: ABC2021Q1.01.txt
File2: ABC2021Q1.02.txt
File3: ABC2021Q1.03.txt
File4: CDE2021Q2.01.txt
File5: CDE2021Q3.02.txt

Files to be processed:


File3: ABC2021Q1.03.txt
File4: CDE2021Q2.01.txt
File5: CDE2021Q3.02.txt
Sandysql

2 Answers

1

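Use glob.glob to list the folder, then take the entry with the largest os.path.getctime (creation time on Windows, last metadata change on Unix):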

import glob
import os

files = glob.glob('/path/to/folder/*')
latest_file = max(files, key=os.path.getctime)
print(latest_file)
Fatin Ishrak Rafi
  • It should probably be noted that this only works if the newest version is _always_ the most recently updated file. Otherwise, OP may need something that sorts the files by both modification time and file name (probably reverse-ordered) – rossdrucker9 Aug 31 '21 at 02:43
  • Will this pick the latest of all the files in the folder, or the latest among the files that share the same name? – Sandysql Aug 31 '21 at 03:06
  • All I need is to group the files that have the same name apart from the version, and pick the file with the highest version in each group. – Sandysql Aug 31 '21 at 03:10
  • It will pick the latest of all files in the folder. If there are multiple files with the same name, it will pick whichever version was most recently updated or modified. @Sandysql – Fatin Ishrak Rafi Aug 31 '21 at 14:40
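The grouping asked for in the comments above (same name apart from the version, keep the highest version) can be done on the file names alone, without looking at timestamps. Below is a minimal sketch, assuming every name ends in a two-digit version followed by .txt as in the question. The function name latest_versions and the pattern are illustrative and not taken from any answer here.

```python
import re

# Assumed layout: <base>.<two-digit version>.txt, e.g. ABC2021Q1.03.txt
VERSIONED = re.compile(r'(?P<base>.+)\.(?P<version>\d{2})\.txt')


def latest_versions(names):
    """Return the highest-version file name for each base name."""
    best = {}  # base name -> (version number, file name)
    for name in names:
        match = VERSIONED.fullmatch(name)
        if match is None:
            continue  # skip names that do not follow the assumed layout
        base = match.group('base')
        version = int(match.group('version'))
        if base not in best or version > best[base][0]:
            best[base] = (version, name)
    return sorted(name for _, name in best.values())


names = [
    'ABC2021Q1.01.txt', 'ABC2021Q1.02.txt', 'ABC2021Q1.03.txt',
    'CDE2021Q2.01.txt', 'CDE2021Q3.02.txt',
]
print(latest_versions(names))
# ['ABC2021Q1.03.txt', 'CDE2021Q2.01.txt', 'CDE2021Q3.02.txt']
```

For a real folder, pass os.listdir(path) as names. Comparing the version as an integer rather than relying on timestamps means the result does not depend on when each file was copied or touched.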
0

data:

File1: ABC2021Q1.01.txt 
File2: ABC2021Q1.02.txt 
File3: ABC2020Q1.03.txt 
File4: CDE2021Q2.01.txt 
File5: CDE2020Q3.02.txt 

code:

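# encoding=utf-8
import os
import re


def filter_file(path):
    # Maps the letter prefix (e.g. 'ABC') to the largest 'YYYYQn.vv' string seen.
    result_dict = {}
    for file in os.listdir(path):
        # The dot before 'txt' is escaped so that it only matches a literal dot.
        result = re.fullmatch(r'(.*)(\d{4}Q\d\.\d{2})\.txt', file)
        if result:
            name, date = result.groups()
            # Plain string comparison: year first, then quarter, then version.
            result_dict[name] = max(result_dict.get(name, ''), date)
    for name, date in result_dict.items():
        print('{}{}.txt'.format(name, date))


if __name__ == '__main__':
    filter_file(r'C:\Users\dqgu')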

result:

ABC2021Q1.02.txt
CDE2021Q2.01.txt
DongQing
  • I have updated the actual file format, can you look into it? – Sandysql Sep 01 '21 at 01:10
  • Hi, my files have the following format: File1: ABC2021Q1.01.txt, File2: ABC2021Q1.02.txt, File3: ABC2020Q1.03.txt, File4: CDE2021Q2.01.txt, File5: CDE2020Q3.02.txt. Output: File3: ABC2020Q1.03.txt, File4: CDE2021Q2.01.txt, File5: CDE2020Q3.02.txt – Sandysql Sep 01 '21 at 21:06
  • This does not give my desired output. It always selects the maximum over the whole list of files. – Sandysql Sep 03 '21 at 11:18
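As the last comment points out, the code in this answer keys its dictionary on the letter prefix alone, so it keeps a single file per prefix. The output requested in the comments is reproduced if files are grouped by prefix and quarter and the highest version wins regardless of year. That rule is inferred from the sample output and has not been confirmed by the poster, so treat the sketch below, including the name pick_latest, as an assumption.

```python
import re
from itertools import groupby

# Assumed layout: <prefix><year><quarter>.<version>.txt, e.g. ABC2020Q1.03.txt
PATTERN = re.compile(
    r'(?P<prefix>.*?)(?P<year>\d{4})(?P<quarter>Q\d)'
    r'\.(?P<version>\d{2})\.txt'
)


def pick_latest(names):
    """Keep the highest version per (prefix, quarter); the year is ignored."""
    parsed = []
    for name in names:
        m = PATTERN.fullmatch(name)
        if m:
            parsed.append((m['prefix'], m['quarter'], int(m['version']), name))
    parsed.sort()  # groupby only merges adjacent items with equal keys
    result = []
    for _, group in groupby(parsed, key=lambda item: item[:2]):
        # item[2] is the version number, item[3] the original file name
        result.append(max(group, key=lambda item: item[2])[3])
    return result


names = [
    'ABC2021Q1.01.txt', 'ABC2021Q1.02.txt', 'ABC2020Q1.03.txt',
    'CDE2021Q2.01.txt', 'CDE2020Q3.02.txt',
]
print(pick_latest(names))
# ['ABC2020Q1.03.txt', 'CDE2021Q2.01.txt', 'CDE2020Q3.02.txt']
```

If the year should be part of the group after all, adding m['year'] to the tuple before the version would keep ABC2021Q1.02.txt as well, giving four files for this data.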