0

I have a directory containing many subdirectories. Here is my code:

def organiseData():
    for root, dirs, files in os.walk(directory_name):
        for dirName in sorted(dirs):
            print(root + '/' + dirName)
            if not os.path.exists(root + '/' + dirName + '/xml'):
                os.mkdir(root + '/' + dirName + '/xml')

It creates the directory inside each subdirectory okay, but then in the first subdirectory in the list it keeps making xml dirs inside xml dirs recursively until:

Sequence_3525/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml/xml
Traceback (most recent call last):
File "organiseData.py", line 74, in <module>
organiseData()
File "organiseData.py", line 8, in organiseData
for root, dirs, files in os.walk(directory_name):

RecursionError: maximum recursion depth exceeded
Nermona
  • 137
  • 1
  • 1
  • 5

2 Answers2

3

The dirs value yielded from os.walk represents the list of directories that are traversed in the following iterations (see this for an example of why that's useful).

Each time you create a new xml folder in a directory from dirs, os.walk later enters your new directory and creates more xml folders since the ones you created are empty.

Try making the check at the root level instead. For example:

def organize_data(directory_name):
    for root, subdirs, filenames in os.walk(directory_name):
        for sub in sorted(subdirs):
            print(os.path.join(root, sub))
        xml_dir = os.path.join(root, 'xml')
        if not os.path.exists(xml_dir):
            os.mkdir(xml_dir)

Another way to get around this is to prune dirs by removing all directories that you've already created xml folders in. Like this:

def organize_data(directory_name):
    for root, dirs, filenamess in os.walk(directory_name):
        for dir_name in sorted(dirs):
            print(os.path.join(root, dir_name))
            xml_dir = os.path.join(root, dir_name, 'xml')
            if not os.path.exists(xml_dir):
                os.mkdir(xml_dir)
                dirs.remove(dir_name) # Prevents infinite loop
kingkupps
  • 3,284
  • 2
  • 16
  • 28
2

Directories are traversed from top down by default. Since you create directories in subdirectories, you keep finding more directories. Search bottom-up instead with:

os.walk(...,topdown=False)

Since the search will start in the deepest directories, adding a deeper xml directory won't affect the iteration.

Quote from the documentation:

If optional argument topdown is True or not specified, the triple for a directory is generated before the triples for any of its subdirectories (directories are generated top-down). If topdown is False, the triple for a directory is generated after the triples for all of its subdirectories (directories are generated bottom-up). No matter the value of topdown, the list of subdirectories is retrieved before the tuples for the directory and its subdirectories are generated.

When topdown is True, the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search, impose a specific order of visiting, or even to inform walk() about directories the caller creates or renames before it resumes walk() again. Modifying dirnames when topdown is False has no effect on the behavior of the walk, because in bottom-up mode the directories in dirnames are generated before dirpath itself is generated.

Also FYI, os.path.join() is a cleaner way to join directories together, e.g. os.path.join(root,dirname,'xml').

Mark Tolonen
  • 166,664
  • 26
  • 169
  • 251