This is the original block of code and its result:

Code:

if os.path.isdir(top):
    for root, dirs, files in os.walk(top, topdown = True):
        for dirname in dirs:
            print 'Dirname = ', os.path.join(root, dirname)

Results:

Dirname = ../output/.svn
Dirname = ../output/a random folder
Dirname = ../output/a random folder - copy
Dirname = ../output/.svn\pristine
Dirname = ../output/.svn\temp
Dirname = ../output/.svn\pristine\04
Dirname = ../output/.svn\pristine\59
Dirname = ../output/a random folder\another one inside
Dirname = ../output/a random folder\another one inside - Copy
Dirname = ../output/a random folder\another one inside - Copy (2)

Now I want to ignore all hidden folders and subfolders. This is the modified code and its result:

Code:

if os.path.isdir(top):
    for root, dirs, files in os.walk(top, topdown = True):
        for dirname in dirs:
            print 'Dirname = ', os.path.join(root, dirname)
            if dirname.startswith('.'):
                dirs.remove(dirname)

Result:

Dirname = ../output/.svn
Dirname = ../output/a random folder - copy
Dirname = ../output/a random folder\another one inside
Dirname = ../output/a random folder\another one inside - Copy
Dirname = ../output/a random folder\another one inside - Copy (2)

What I don't understand is: why is ../output/a random folder not listed anymore?

cinny

1 Answer

You should not modify an iterable while you're iterating over it. In this case, you're modifying dirs inside of a for loop that iterates over dirs.
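
To make the skipped entry visible, here is a minimal sketch that reproduces the effect without touching the filesystem. The list below is only a stand-in for the `dirs` that `os.walk` would yield for `../output`, and `visited` is a name introduced for illustration.

```python
# Stand-in for the `dirs` list that os.walk yields for ../output.
dirs = ['.svn', 'a random folder', 'a random folder - copy']

visited = []  # illustrative name: records what the loop actually sees
for dirname in dirs:
    visited.append(dirname)
    if dirname.startswith('.'):
        # Removing the item at index 0 shifts every later item one slot
        # to the left, while the loop's internal index still moves on to 1.
        dirs.remove(dirname)

print(visited)  # ['.svn', 'a random folder - copy']
print(dirs)     # ['a random folder', 'a random folder - copy']
```

'a random folder' is never visited by the loop, so it is never printed, but it is still in `dirs`, which is why `os.walk` goes on to list its subfolders.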

Try this instead:

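if os.path.isdir(top):
    for root, dirs, files in os.walk(top, topdown=True):
        # Collect hidden directories first; don't touch dirs while looping over it.
        dirs_to_ignore = []
        for dirname in dirs:
            print 'Dirname = ', os.path.join(root, dirname)
            if dirname.startswith('.'):
                dirs_to_ignore.append(dirname)
        # Prune dirs in place so os.walk does not descend into the hidden directories.
        for dirname in dirs_to_ignore:
            dirs.remove(dirname)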

See also: Modifying list while iterating

Steven T. Snyder
  • Your diagnosis is correct, but I think you've missed the point of the original code. It really does mean to remove items from `dirs` so that `os.walk` won't recurse into those directories. A more correct fix would be to copy `dirs` and iterate over the copy while removing items from the original. – Weeble Feb 29 '12 at 18:40
  • I changed my code example to reflect the intent of the original code. I did it by keeping a temporary list of directories to ignore, and then pruning them from `dirs`. Weeble's suggestion of iterating over a copy and removing from the original should also work. – Steven T. Snyder Feb 29 '12 at 18:52
  • Making a `dirs_to_keep` list followed by `dirs[:] = dirs_to_keep` would be shorter and more efficient, though performance really does not matter here. – Jochen Ritzel Feb 29 '12 at 19:07
  • Jochen, can you give an example? I don't really get your comment. – cinny Feb 29 '12 at 20:19
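
The `dirs[:] = dirs_to_keep` idea from the comment above could look roughly like the sketch below. It is one reading of that suggestion, not code posted by any commenter: `visible_dirs` is a hypothetical helper name, the temporary directory tree is made up for the demonstration, and `print` is called with parentheses so the snippet also runs on Python 3.

```python
import os
import shutil
import tempfile

def visible_dirs(top):
    """Return the paths of all non-hidden directories under `top`."""
    found = []
    for root, dirs, files in os.walk(top, topdown=True):
        # Slice assignment replaces the contents of the same list object
        # that os.walk holds, so the pruned directories are never entered.
        # A plain `dirs = [...]` would only rebind the local name.
        dirs[:] = [d for d in dirs if not d.startswith('.')]
        for dirname in dirs:
            found.append(os.path.join(root, dirname))
    return found

# Build a small throwaway tree that mirrors part of the question's layout.
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, '.svn', 'pristine'))
os.makedirs(os.path.join(top, 'a random folder', 'another one inside'))

result = sorted(visible_dirs(top))
for path in result:
    print('Dirname = ' + path)

shutil.rmtree(top)  # remove the throwaway tree again
```

Because the list is filtered before the inner loop starts, nothing is removed while iterating, and neither `.svn` nor anything beneath it shows up in the output.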