I'm not sure "logical" is the right word here. When I run os.walk I append paths to a list, and I would like the list ordered so that it makes sense when read from top to bottom.

For example, suppose the path I'm looping through is C:\test, which contains a single file along with folders (each with their own subfolders and files). This is what I'd want the list output to resemble.

C:\test
C:\test\test1.txt
C:\test\subfolder1
C:\test\subfolder1\file1.txt
C:\test\subfolder1\file2.txt
C:\test\subfolder2
C:\test\subfolder2\file3.txt

However, my output is the following.

C:\test\subfolder1
C:\test\subfolder2
C:\test\test1.txt
C:\test\subfolder1\file1.txt
C:\test\subfolder1\file2.txt
C:\test\subfolder2\file3.txt

The first problem is that C:\test doesn't appear. I could just append C:\test to the list, but I want C:\test\test1.txt to appear directly below it, and sorting the list in ascending order would just stick that file at the end.

When using os.walk, is there a way for me to append to my list in such a way that everything ends up in order?

Code:

import os
 
tree = []
for root, dirs, files in os.walk(r'C:\test', topdown = True):

    for d in dirs:
        tree.append(os.path.join(root, d))

    for f in files:
        tree.append(os.path.join(root, f))

for x in tree:
    print(x)

Edit: By logical order I mean the top folder first, followed by the files and subfolders in that folder, then the files and subfolders inside each of those subfolders, and so on.

e.g.

C:\test
    C:\test\test1.txt
    C:\test\subfolder1
       C:\test\subfolder1\file1.txt
       C:\test\subfolder1\file2.txt
    C:\test\subfolder2
       C:\test\subfolder2\file3.txt
  • What would "make sense" when reading an inherently 2D data structure in one dimension (top to bottom)? Do you understand that "makes sense" is pretty subjective? Some people might prefer breadth-first, others depth-first. Some might prefer to read alphabetically and ignore the structure entirely. Or directories first then files. Or by date created, or modified... Please explain what *exactly* you want here because right now it's not clear what you mean by "in order". – ddejohn Jun 23 '22 at 19:03
  • The intention of the output snippet was to show what I mean by "in order" and what "makes sense". However, I've updated to add more details on what I mean here. –  Jun 23 '22 at 19:16
  • https://stackoverflow.com/questions/16953842/using-os-walk-to-recursively-traverse-directories-in-python – ddejohn Jun 23 '22 at 19:19

2 Answers

The order you want is the order in which os.walk visits the folders when it walks top-down (the default): each folder is yielded before its subfolders. You just have to append root to your list instead of the entries in dirs.

import os
 
tree = []
for root, _, files in os.walk('test'):
    tree.append(root)
    for f in files:
        tree.append(os.path.join(root, f))

for x in tree:
    print(x)

Output

test
test/test1.txt
test/subfolder1
test/subfolder1/file1.txt
test/subfolder1/file2.txt
test/subfolder2
test/subfolder2/file3.txt
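
One caveat: os.walk does not promise any particular order for the names inside dirs and files, so the listing above can vary between systems. When walking top-down, dirs may be modified in place to control the order of descent. Below is a minimal sketch of that idea, assuming alphabetical order is what you want at each level; `build_tree` is a hypothetical helper name, not something from the question or from the os module.

```python
import os


def build_tree(top):
    """Return directories and files under `top` in depth-first order.

    Each directory is followed by its files, then by its subdirectories.
    `build_tree` is a hypothetical helper name used for illustration.
    """
    tree = []
    for root, dirs, files in os.walk(top, topdown=True):
        # Sorting dirs in place controls the order in which os.walk descends.
        dirs.sort()
        tree.append(root)
        # Files of this directory come right after the directory itself.
        for name in sorted(files):
            tree.append(os.path.join(root, name))
    return tree
```

Printing each entry of `build_tree('test')` should give the same nesting as the output above, with the ordering at each level no longer depending on the file system.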
nonDucor

This code should solve your problem

Explanation:

  1. Loop through your tree variable and create a list of tuples where first element of the tuple is the dir/file path and second element is the count of \ in the dir/file path

  2. Sort the list of tuples by the second element of the tuple

  3. Create a list of the first elements (the paths) of the sorted tuples

import os
 
tree = []
for root, dirs, files in os.walk('C:\\test', topdown = True):

    for d in dirs:
        tree.append(os.path.join(root, d))

    for f in files:
        tree.append(os.path.join(root, f))

tup = []

def Sort_Tuple(tup):
    """function code sourced from https://www.geeksforgeeks.org/python-program-to-sort-a-list-of-tuples-by-second-item/"""
    # getting length of list of tuples
    lst = len(tup)
    for i in range(0, lst):
         
        for j in range(0, lst-i-1):
            if (tup[j][1] > tup[j + 1][1]):
                temp = tup[j]
                tup[j]= tup[j + 1]
                tup[j + 1]= temp
    return tup

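for x in tree:
    # depth of the path = number of backslashes it contains
    i = x.count('\\')
    tup.append((x, i))

# avoid naming the result "sorted", which would shadow the built-in
sorted_tup = Sort_Tuple(tup)

# keep only the paths, dropping the backslash counts
answer = [path for path, _ in sorted_tup]

print(answer)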

This should work, though it is not optimized in any way.
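
For comparison, the same depth-based ordering can be sketched with the built-in sorted and a key function instead of a hand-written sort. This is only an illustration: `sort_by_depth` is a hypothetical name, and the separator defaults to os.sep on the assumption that the paths came from os.path.join on the current platform.

```python
import os


def sort_by_depth(paths, sep=os.sep):
    """Return `paths` ordered by how many separators each path contains.

    `sort_by_depth` is a hypothetical helper name used for illustration.
    """
    # sorted() is stable, so paths at the same depth keep their original order.
    return sorted(paths, key=lambda p: p.count(sep))
```

Note that sorting by depth groups entries level by level (everything directly under the top folder first, then everything one level deeper), so a folder is not immediately followed by its own files.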

Tangobones