Can os.walk() be used on a list of strings? If not, what would be the equivalent?

Question

Instead of using a real file structure, is it possible to give it a premade list of strings to have it create a nested list of paths/files for you?

Example List (formatted so it's readable):

files = [
    'user/hey.jpg',
    'user/folder1/1.txt',
    'user/folder1/folder2/random.txt',
    'user/folder1/blah.txt',
    'user/folder3/folder4/folder5/1.txt',
    'user/folder3/folder4/folder5/3.txt',
    'user/folder3/folder4/folder5/2.txt',
    'user/1.jpg'
    ]

Here is the output I'm hoping for:

['user'['1.jpg','hey.jpg','folder1'['1.txt','blah.txt','folder2'['random.txt']],'folder3'['folder4'['folder5'['1.txt','2.txt','3.txt']]]]]

I asked a similar question here: How would I nest file path strings into a list based upon matching folder paths in Python

I got a comment saying os.walk() would be ideal for what I wanted, but everywhere I look, it seems that people are using it to crawl the file system, as opposed to passing it a list of already created file paths.

BartoszKP · Accepted Answer · 2017-02-04T11:13:47.510

2

Just create a tree. Quick draft to show an example:

import pprint

files = [
    'user/hey.jpg',
    'user/folder1/1.txt',
    'user/folder1/folder2/random.txt',
    'user/folder1/blah.txt',
    'user/folder3/folder4/folder5/1.txt',
    'user/folder3/folder4/folder5/3.txt',
    'user/folder3/folder4/folder5/2.txt',
    'user/1.jpg'
    ]

def append_to_tree(node, c):
    if not c:
        return

    if c[0] not in node:
        node[c[0]] = {}

    append_to_tree(node[c[0]], c[1:])

root = {}
for path in files:
    append_to_tree(root, path.split('/'))

pprint.pprint(root)

Output:

{'user': {'1.jpg': {},
          'folder1': {'1.txt': {},
                      'blah.txt': {},
                      'folder2': {'random.txt': {}}},
          'folder3': {'folder4': {'folder5': {'1.txt': {},
                                              '2.txt': {},
                                              '3.txt': {}}}},
          'hey.jpg': {}}}

If it's all right to have dicts instead of lists, easy to change anyway.

edited Feb 04 '17 at 11:13

answered Feb 03 '17 at 19:17

BartoszKP

34,786
15
102
130

@Bakuriu I know I can use a code block. The problem is that program's output is not code, so quote block should be used. – BartoszKP Feb 03 '17 at 19:25
Code blocks have to be used for all text that has to be interpreted literally. For example traceback and error messages and program input/output, since the exact contents are essential. Moreover, if we follow your logic program output is *certainly* not a **quote** of somebody else, as intended for the quote blocks . – Bakuriu Feb 03 '17 at 20:18
@Bakuriu I've failed to find any guidelines for this. IMHO code should be for something that can be compiled, and quote is for representing any other piece of text that's not your own words (so not only somebody else's but also documentation and program's output). I admit however that, leaving definitions aside, your edit did improve readability. – BartoszKP Feb 04 '17 at 11:12

Can os.walk() be used on a list of strings? If not, what would be the equivalent?

1 Answers1

Linked