
I am trying to rename files using the values of a dictionary whose keys are keywords. Each old file name is a long string that contains one of the keywords somewhere inside it, so the name never matches a key exactly. I want to find the key contained in each file name and rename the file to the corresponding value, so the value becomes the file's new name. The dictionary looks like the table below:

Dictionary name: nameKeyWords

| Key (keyword) | Value (name) |
|---|---|
| abb | 1 |
| ave | 2 |
| asp | 3 |

Below is the code I wrote, and it does work. However, it seems very inefficient because I use three nested for loops: one over the directories, one over every file_name in file_names, and one over the keywords (keys) in the dictionary. Is there a way to make the code more efficient? Thanks!

for (dir_path, dir_names, file_names) in walk(dir_path):
    for file_name in file_names:
        for keyWords in nameKeyWords:
            if keyWords in file_name:
                old_name = os.path.join(dir_path,file_name)
                new_name = os.path.join(dir_path,nameKeyWords.get(keyWords)+'.csv')
                os.rename(old_name, new_name)
            else:
                print(file_name)
Driftr95

1 Answer


I don't know of any way to get all the file names without nested for loops, but you should break right after os.rename(old_name, new_name). There is no point in renaming the same file more than once, and if a second key also matched, the next os.rename would raise FileNotFoundError, since no file named file_name is left in that directory after the first rename. Also, using for...else (instead of if...else inside the for keyWords loop) keeps the same file_name from being printed multiple times.

for (dir_path, dir_names, file_names) in walk(dir_path):
    for file_name in file_names:
        for keyWords in nameKeyWords:
            if keyWords in file_name:
                old_name = os.path.join(dir_path,file_name)
                new_name = os.path.join(dir_path,nameKeyWords.get(keyWords)+'.csv')
                os.rename(old_name, new_name)
                break ## [ no need to keep checking ]
        else: print(file_name) ## [ only prints if for never breaks ]
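
To see the break and for...else behaviour in isolation, here is a minimal sketch that does not touch the filesystem. The function name `classify` and the sample file names are made up for illustration; the dictionary is the one from the question's table, with the values assumed to be strings.

```python
# Sample dictionary from the question (values assumed to be strings)
name_key_words = {"abb": "1", "ave": "2", "asp": "3"}

def classify(file_names, key_words):
    """Return (planned renames, unmatched names) without renaming anything."""
    renamed, unmatched = {}, []
    for file_name in file_names:
        for key in key_words:
            if key in file_name:
                renamed[file_name] = key_words[key] + ".csv"
                break  # stop at the first matching key
        else:
            # runs only when the inner loop finished without a break
            unmatched.append(file_name)
    return renamed, unmatched

renamed, unmatched = classify(
    ["report_abb_2023.csv", "x_asp_ave.csv", "notes.txt"], name_key_words
)
print(renamed)
print(unmatched)
```

Note that "x_asp_ave.csv" contains two keys; with break, the first one in dictionary order ("ave") decides the new name, and "notes.txt" is reported exactly once.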


If you just want to decrease the level of nesting in your innermost loop, you can separate the loops:

fNameGenerator = (
    (dPath, fName) for dPath, dnames, fNames
    in os.walk(dir_path) for fName in fNames
)

for dPath, file_name in fNameGenerator:
    for keyWords in nameKeyWords:
        if keyWords in file_name:
            new_name = os.path.join(dPath, f'{nameKeyWords.get(keyWords)}.csv')
            os.rename(os.path.join(dPath, file_name), new_name)
            break
    else: print(file_name)

You could also compute the new names within fNameGenerator:

# nameKeyWords = {....}
def getNewFn(oldFn:str):
    for k in nameKeyWords:
        if k in oldFn: return f"{nameKeyWords[k]}.csv"

fNameGenerator = (
    (dPath, fName, getNewFn(fName)) for dPath, dnames, fNames 
    in os.walk(dir_path) for fName in fNames
)

for dPath, file_name, new_name in fNameGenerator:
    if new_name is None: print(file_name)
    else: os.rename(*[os.path.join(dPath, fn) for fn in [file_name, new_name]]) 
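
If you want to try this approach safely, the following sketch runs it end to end inside a temporary directory. Everything here is illustrative: the file names are invented, `get_new_fn` is a snake_case stand-in for getNewFn, and the dictionary is the one from the question's table with string values assumed.

```python
import os
import tempfile

# Sample dictionary from the question (values assumed to be strings)
name_key_words = {"abb": "1", "ave": "2", "asp": "3"}

def get_new_fn(old_fn):
    """Return the new file name for the first matching key, else None."""
    for key, value in name_key_words.items():
        if key in old_fn:
            return f"{value}.csv"
    return None

with tempfile.TemporaryDirectory() as root:
    # Create a few empty files with made-up names
    for name in ("long_abb_name.csv", "other_asp_name.csv", "notes.txt"):
        open(os.path.join(root, name), "w").close()

    unmatched = []
    for d_path, _d_names, f_names in os.walk(root):
        for f_name in f_names:
            new_name = get_new_fn(f_name)
            if new_name is None:
                unmatched.append(f_name)
            else:
                os.rename(os.path.join(d_path, f_name),
                          os.path.join(d_path, new_name))

    result = sorted(os.listdir(root))

print(result)
print(unmatched)
```

One thing to watch for in real data: if two files in the same directory contain the same key, both map to the same new name, and the second rename will either replace the first file or raise an error, depending on the platform.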


Please note that none of these reduce the complexity, and the original nested loops might actually be the fastest alternative [although none of them seems to take significantly less time than the rest].
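
One alternative that is not among the snippets above, offered only as a sketch: combine all keys into a single compiled regular expression so that each file name is searched once instead of once per key. The names `pattern` and `get_new_name` are hypothetical, and the dictionary is again the question's table with string values assumed. Whether this is actually faster depends on the number of keys and files, so it is worth measuring on your own data.

```python
import re

# Sample dictionary from the question (values assumed to be strings)
name_key_words = {"abb": "1", "ave": "2", "asp": "3"}

# re.escape protects keys that happen to contain regex metacharacters
pattern = re.compile("|".join(re.escape(key) for key in name_key_words))

def get_new_name(old_name):
    """Return '<value>.csv' for the leftmost key found in old_name, else None."""
    match = pattern.search(old_name)
    if match is None:
        return None
    return f"{name_key_words[match.group(0)]}.csv"

print(get_new_name("report_abb_2023.csv"))
print(get_new_name("x_asp_ave.csv"))
print(get_new_name("notes.txt"))
```

Be aware that the result can differ from the loop versions when a file name contains more than one key: the loops pick the first key in dictionary order, while the regex picks the key that appears leftmost in the file name (so "x_asp_ave.csv" maps to "3.csv" here rather than "2.csv").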

Driftr95
  • Thanks, this is exactly what I am looking for. I am a self-taught Python beginner; can you explain how 'fNameGenerator' works? I am still very confused about it even after doing some research on generator functions. – Victor Pan Feb 09 '23 at 10:28
  • @VictorPan It's a generator expression for a [flattened list that used to be nested](https://stackoverflow.com/a/25674934/6146136). If you use square brackets (like `[(...,fName,...) for ...,fNames in os.walk(dir_path) for fName in fNames]`) it becomes [list comprehension](https://www.w3schools.com/python/python_lists_comprehension.asp). Using the generator is more memory-efficient (but not necessarily faster) so I don't think it'll make much of a difference unless you have LOTS of files. Read [this blog](https://codete.com/blog/python-basics-generator-expressions-and-comprehensions) for more – Driftr95 Feb 09 '23 at 12:50
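
As a small illustration of the comment above, this sketch shows the same flattening written three ways: nested loops, a generator expression, and a list comprehension. `fake_walk` is a made-up stand-in for os.walk(dir_path), so the example runs without any files on disk.

```python
# Stand-in for os.walk(dir_path): each item is (dir_path, dir_names, file_names)
fake_walk = [
    ("root", ["sub"], ["a.csv", "b.csv"]),
    ("root/sub", [], ["c.csv"]),
]

# 1) Plain nested loops
pairs_loop = []
for d_path, d_names, f_names in fake_walk:
    for f_name in f_names:
        pairs_loop.append((d_path, f_name))

# 2) Generator expression: the same two for clauses in the same order,
#    but items are produced one at a time, only when requested
pairs_gen = (
    (d_path, f_name)
    for d_path, d_names, f_names in fake_walk
    for f_name in f_names
)

# 3) List comprehension: same clauses, but the whole list is built at once
pairs_list = [
    (d_path, f_name)
    for d_path, d_names, f_names in fake_walk
    for f_name in f_names
]

first = next(pairs_gen)   # pulls only the first pair from the generator
rest = list(pairs_gen)    # consumes whatever is left

print(first)
print(rest)
print(pairs_loop == pairs_list == [first] + rest)
```

A generator can only be consumed once, which is fine for a single renaming pass; use the list form if you need to iterate over the pairs again.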