For a whitelisting scenario like this, I'd suggest using glob.iglob to get the directories by pattern. It's a generator, so you get each result as soon as it's found. (Note: at time of writing, it's still implemented with os.listdir under the hood, not os.scandir, so it's only half a generator: each directory is scanned eagerly, but the next directory is only scanned once all values from the current one have been yielded.) For example, in this case:
try:
    from future_builtins import filter  # Py2 only: generator-based filter
except ImportError:
    pass  # Py3: the built-in filter is already lazy
import os.path
import glob
from operator import methodcaller
try:
    from os import scandir  # Built-in on 3.5 and above
except ImportError:
    from scandir import scandir  # PyPI package on 3.4 and below

# If on 3.4+, use glob.escape for safety; before then, if path might contain glob
# special characters and you don't want them processed you need to escape manually
globpat = os.path.join(glob.escape(path), '*', 'TARGET')

# Find paths matching the pattern, filtering out non-directories as we go:
for targetdir in filter(os.path.isdir, glob.iglob(globpat)):
    # targetdir is the qualified name of a single directory matching the pattern,
    # so if you want to process the files in that directory, you can follow up with:
    for fileentry in filter(methodcaller('is_file'), scandir(targetdir)):
        # fileentry is a DirEntry with attributes for .name, .path, etc.
        pass
See the docs on os.scandir for more advanced usage, or you can just make the inner loop a call to os.walk to preserve most of your original code as is.
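As a rough illustration of the os.walk variant of the inner loop, here is a minimal sketch. The function name iter_target_files and its target parameter are hypothetical (not from the question), and it assumes Python 3.4+ so that glob.escape is available:

```python
import glob
import os


def iter_target_files(path, target='TARGET'):
    """Yield the path of every file found under path/*/<target>/, recursively."""
    # Escape the root so any glob metacharacters in it are matched literally
    globpat = os.path.join(glob.escape(path), '*', target)
    # Keep only matches that are real directories
    for targetdir in filter(os.path.isdir, glob.iglob(globpat)):
        # Only the whitelisted directories are ever walked
        for dirpath, dirs, files in os.walk(targetdir):
            for name in files:
                yield os.path.join(dirpath, name)
```

Because it's a generator, callers can start processing files from the first matching directory before the later ones have been scanned.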
If you really must use os.walk, you can just be more targeted in how you prune dirs. Since you specified that all TARGET directories should be only one level down, this is actually pretty easy. os.walk walks top down by default, which means the first set of results will be for the root directory (which you don't want to prune solely to TARGET entries). So you can do:
import fnmatch
import os

for i, (dirpath, dirs, files) in enumerate(os.walk(path)):
    if i == 0:
        # Top level dir, prune non-Project dirs
        dirs[:] = fnmatch.filter(dirs, 'Project *')
    elif os.path.samefile(os.path.dirname(dirpath), path):
        # Second level dir, prune non-TARGET dirs
        dirs[:] = fnmatch.filter(dirs, 'TARGET')
    else:
        # Do whatever handling you'd normally do for files and directories
        # located under path/Project */TARGET/
        pass
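If it helps to keep the pruning separate from the per-directory handling, the same loop can be wrapped in a generator. This is only a sketch: walk_targets and its project_pat and target parameters are hypothetical names, with defaults taken from the patterns used above:

```python
import fnmatch
import os


def walk_targets(path, project_pat='Project *', target='TARGET'):
    """Yield os.walk tuples only for directories at or below path/<project>/<target>/."""
    for i, (dirpath, dirs, files) in enumerate(os.walk(path)):
        if i == 0:
            # Root: descend only into directories matching the project pattern
            dirs[:] = fnmatch.filter(dirs, project_pat)
        elif os.path.samefile(os.path.dirname(dirpath), path):
            # Project level: descend only into the target directory
            dirs[:] = fnmatch.filter(dirs, target)
        else:
            # Inside a target directory (or deeper): hand it to the caller
            yield dirpath, dirs, files
```

The slice assignment (dirs[:] = ...) matters here: os.walk only honours pruning when the list it handed out is modified in place, so rebinding dirs to a new list would have no effect.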