I have been using the os.walk() method in Python to make a list of the paths to all the folders and subfolders where a specific file can be found.

I was tired of using a bunch of loops and elifs, and packed it all into a (quite messy) list comprehension that does exactly what I want:

import os    
directory = "C:\\Users\\User\\Documents"
file_name = "example_file.txt"    
list_of_paths = [path for path in (os_tuple[0] for os_tuple in os.walk(directory) if file_name in (item.lower() for item in os_tuple[2]))]
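For readability, the same logic can be unrolled into a plain loop with the walk tuple unpacked into named variables. This is a sketch of an equivalent version (same `directory` and `file_name` as above), not a change in behavior:

```python
import os

directory = "C:\\Users\\User\\Documents"
file_name = "example_file.txt"

list_of_paths = []
for dirpath, dirnames, filenames in os.walk(directory):
    # Lower-case each file name so the match is case-insensitive,
    # just like the generator expression in the comprehension
    if file_name in (name.lower() for name in filenames):
        list_of_paths.append(dirpath)
```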

I have two questions. The first, and most important, is: Is there a more efficient way to do this? I often expect to find several hundred files in just as many folders, and if it's on a server it can take several minutes.

The second question is: How can I make it more readable? Having two generator expressions inside a list comprehension feels pretty messy.

Update: I was told to use glob, so naturally I had to try it. It seems to work just as well as my list comprehension with os.walk(). My next step will therefore be to test the two versions on a couple of different files and folders.

import glob
directory = "C:\\Users\\User\\Documents"
file_name = "example_file.txt"
list_of_paths = [path.lower().replace(("\\" + file_name), "") for path in (glob.glob(directory + "/**/*" + file_name, recursive=True))]
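As a side note on the cleanup step: the trailing separator plus file name can also be stripped with `os.path.dirname`, which avoids lower-casing the whole path as a side effect. A sketch with the same variables as above:

```python
import glob
import os

directory = "C:\\Users\\User\\Documents"
file_name = "example_file.txt"

# os.path.dirname drops the final path component (the file name)
# instead of removing it via string replacement
list_of_paths = [os.path.dirname(path)
                 for path in glob.glob(directory + "/**/*" + file_name,
                                       recursive=True)]
```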

Any additional comments are very welcome.

Update 2: After testing both methods, the results I'm getting suggest that the os.walk() method is about twice as fast as the glob.glob() method. The test was performed on about 400 folders with a total of 326 copies of the file I was looking for.
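For reference, one way such a comparison can be scripted with the standard `timeit` module (the directory and file name below are placeholders, and the statements mirror the two comprehensions from this post):

```python
import timeit

# Shared setup: imports plus the placeholder search parameters
setup = ("import os, glob; "
         "directory = 'C:/Users/User/Documents'; "
         "file_name = 'example_file.txt'")

walk_stmt = (
    "[p for p in (t[0] for t in os.walk(directory) "
    "if file_name in (i.lower() for i in t[2]))]"
)
glob_stmt = (
    "[p.lower().replace('\\\\' + file_name, '') "
    "for p in glob.glob(directory + '/**/*' + file_name, recursive=True)]"
)

# number=10 repeats each search ten times and reports the total seconds
print("os.walk :", timeit.timeit(walk_stmt, setup=setup, number=10))
print("glob    :", timeit.timeit(glob_stmt, setup=setup, number=10))
```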
