Supposing that our rule for exclusion is "the path matches any of the EXCLUSIONS, per the logic of fnmatch.fnmatch", we can write a function to encapsulate that:
import fnmatch

def should_exclude(path):
    return any(fnmatch.fnmatch(path, exclude) for exclude in EXCLUSIONS)
(We could generalize that by accepting the exclusions as the first parameter instead of relying on the global, and then binding it using functools.partial.)
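As a minimal sketch of that generalization: the helper name matches_any and the sample patterns below are made up for illustration, and binding the patterns with functools.partial is just one way to do it.

```python
import fnmatch
from functools import partial

# Hypothetical patterns, for illustration only.
EXCLUSIONS = ["*.pyc", ".git", "build*"]

def matches_any(exclusions, path):
    """Return True if path matches any of the given fnmatch patterns."""
    return any(fnmatch.fnmatch(path, pattern) for pattern in exclusions)

# Bind the patterns once; the result takes only a path,
# so it can be used wherever a one-argument predicate is expected.
should_exclude = partial(matches_any, EXCLUSIONS)
```

Taking the patterns as the first parameter is what lets partial bind them positionally, leaving the path as the only remaining argument.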
The way to make os.walk stay out of pruned directories is to walk top-down (the default) and modify the yielded list of sub-directories in-place. Filtering a list while iterating over it is tricky, because removing an item shifts the remaining ones and some get skipped; the most elegant way I can think of is to build the filtered version with a list comprehension, and then assign it back into place with a slice:
def filter_inplace(source, exclude_rule):
    source[:] = [x for x in source if not exclude_rule(x)]
(Note the generalization here; we are expected to pass the filtering predicate, should_exclude, as an argument.)
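To see why the slice assignment matters, here is a small comparison. The function filter_by_rebinding and the rule is_hidden are hypothetical names used only for this demonstration.

```python
def filter_by_rebinding(source, exclude_rule):
    # Rebinds the local name only; the caller's list object is untouched.
    source = [x for x in source if not exclude_rule(x)]

def filter_inplace(source, exclude_rule):
    # Slice assignment replaces the contents of the same list object.
    source[:] = [x for x in source if not exclude_rule(x)]

def is_hidden(name):
    # Hypothetical rule: treat dot-prefixed names as excluded.
    return name.startswith(".")

dirs = [".git", "src", ".cache", "docs"]

filter_by_rebinding(dirs, is_hidden)
after_rebinding = list(dirs)  # snapshot: nothing was removed

filter_inplace(dirs, is_hidden)
after_inplace = list(dirs)    # snapshot: hidden names are gone
```

os.walk keeps a reference to the very list it yielded, so only the in-place version changes which directories it goes on to visit.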
Now we should be able to use os.walk as documented:
import os

def filtered_walk(root):
    for subroot, dirs, files in os.walk(root):
        yield subroot, files  # the current result
        filter_inplace(dirs, should_exclude)  # set up filtered recursion
This can be varied in multiple ways depending on your exact requirements. For example, you could iterate over the files and os.path.join them to the subroot, yielding each result separately. It's worth playing around a bit and debugging, to make sure you understand exactly what subroot, dirs and files look like at each step of the iteration, and to verify that the filtering gives the results you expect.
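One such variation, sketched under some assumptions: the function name filtered_files, the sample patterns, and the throwaway directory tree are all invented for the example. Note that dirs holds bare directory names rather than full paths, so the patterns here are matched against names.

```python
import fnmatch
import os
import tempfile

EXCLUSIONS = [".git", "__pycache__"]  # hypothetical patterns

def should_exclude(path):
    return any(fnmatch.fnmatch(path, exclude) for exclude in EXCLUSIONS)

def filtered_files(root):
    """Yield the full path of every file outside the excluded directories."""
    for subroot, dirs, files in os.walk(root):
        # Prune in place before os.walk descends any further.
        dirs[:] = [d for d in dirs if not should_exclude(d)]
        for name in files:
            yield os.path.join(subroot, name)

# Build a small temporary tree to check the behaviour.
with tempfile.TemporaryDirectory() as root:
    for rel in ("a.txt", "src/b.txt", ".git/config", "src/__pycache__/b.pyc"):
        full = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(full), exist_ok=True)
        open(full, "w").close()
    found = sorted(os.path.relpath(p, root) for p in filtered_files(root))
```

Only directories are pruned in this sketch; a file whose own name matches a pattern would still be yielded unless you filter files as well.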