
I have a file system that I want to check and update using Python. My solution was os.walk, but it becomes problematic given my needs and the way my file system is organised. This is how the directories are laid out:

Root
dir1
    subdir
        1
        2
        3...
    file1
    file2
dir2
    subdir
        1
        2
        3...
    file1
    file2
...

The main directories have different names, hence "dir1" and "dir2", but the directories inside them share the same name and contain a lot of different files and directories. These subdirectories are the ones I want to exclude from os.walk, as walking them adds unnecessary computation.

Is there a way to exclude directories from os.walk based on the directory's name instead of its path, or will I need to do something else?

Pablo
Jack Bird
1 Answer


os.walk allows you to modify, in place, the list of directories it gives you. If you take some out, it won't descend into those directories.

import os

for dirpath, dirnames, filenames in os.walk("/root/path"):
    # Removing a name from dirnames in place stops os.walk from descending into it.
    if "subdir" in dirnames:
        dirnames.remove("subdir")
    # process the files here

(Note that this doesn't work if you use the bottom-up style of scanning, i.e. `topdown=False`. The top-down style is the default.)

See the documentation
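To try this out without touching a real file system, here is a minimal sketch that builds a throwaway tree shaped like the one in the question and walks it while skipping directories by name. The helper name `walk_excluding` and the file name `ignored` are made up for illustration; only `os.walk`, `os.makedirs` and `tempfile.TemporaryDirectory` come from the standard library.

```python
import os
import tempfile

def walk_excluding(root, excluded_names):
    """Yield (dirpath, filenames) under root, skipping directories named in excluded_names."""
    for dirpath, dirnames, filenames in os.walk(root):  # top-down is the default
        # Slice assignment edits the same list object that os.walk consults next.
        dirnames[:] = [d for d in dirnames if d not in excluded_names]
        yield dirpath, filenames

# Build a small temporary tree resembling the layout in the question.
with tempfile.TemporaryDirectory() as root:
    for main in ("dir1", "dir2"):
        os.makedirs(os.path.join(root, main, "subdir", "1"))
        for name in ("file1", "file2"):
            open(os.path.join(root, main, name), "w").close()
        # Hypothetical file inside the excluded directory; it should never be seen.
        open(os.path.join(root, main, "subdir", "1", "ignored"), "w").close()

    visited = sorted(
        os.path.relpath(path, root) for path, _ in walk_excluding(root, {"subdir"})
    )
    print(visited)  # prints: ['.', 'dir1', 'dir2']
```

Passing a set of names rather than a single string makes it easy to exclude several directory names at once; the match is on the bare name, so every "subdir" is skipped regardless of which main directory it sits in.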

kindall
  • Thank you for your naming convention inside the `for` loop, as this makes sense. I created a similar filter based on path: https://stackoverflow.com/a/51871627/1896134 – JayRizzo Aug 16 '18 at 07:34
  • If you want to skip all the sub-folders you can use `clear()` (e.g. `if len(filenames) > 0: dirnames.clear()`). Note that assigning a new value, e.g. `dirnames = []`, will not work: it only rebinds the loop variable to a new list, while `os.walk` keeps using the original one. – Nir Apr 04 '23 at 18:01
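
The difference between clearing the list and reassigning the name, as described in the last comment, can be checked with a small sketch. The tree and the counter names `in_place` and `rebound` are invented for this illustration; the counters record how many directories each walk visits.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # A chain of nested directories: root/dir1/subdir/1
    os.makedirs(os.path.join(root, "dir1", "subdir", "1"))

    in_place = 0
    for dirpath, dirnames, filenames in os.walk(root):
        in_place += 1
        dirnames.clear()  # empties the list os.walk holds, so nothing below is visited

    rebound = 0
    for dirpath, dirnames, filenames in os.walk(root):
        rebound += 1
        dirnames = []  # rebinds the local name only; os.walk's list is unchanged

    print(in_place, rebound)  # prints: 1 4
```

Slice assignment (`dirnames[:] = []`) behaves like `clear()` here, since it also mutates the original list rather than creating a new one.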