
Suppose a directory structure like this:

├── parent_1
│   ├── child_1
│   │   ├── sub_child_1
│   │   │   └── file_1.py
│   │   └── file_2.py
│   └── file_3.py
├── parent_2
│   └── child_2
│       └── file_4.py
└── file_5.py

I want to get two lists:

parents = ["parent_1", "parent_2"]
children = ["child_1", "child_2"]

Note that files and sub_child_1 are not included.

Using suggestions such as this, I can write:

import os

parents = []
children = []
for root, dirs, files in os.walk(path, topdown=True):
    # depth of root below the starting path (path itself is depth 0)
    depth = root[len(path):].count(os.path.sep)
    if depth == 0:
        parents.extend(dirs)   # first-level directory names
    elif depth == 1:
        children.extend(dirs)  # second-level directory names

However, this is a bit wordy and I was wondering if there is a cleaner way of doing this.

Update 1

I also tried a listdir-based approach:

from os import listdir
from os.path import isdir, join

parents = [f for f in listdir(root) if isdir(join(root, f))]
children = []
for p in parents:
    children.extend(f for f in listdir(join(root, p)) if isdir(join(root, p, f)))
Dr. Strangelove
    Did you also try a `listdir`-based approach, since you have a fixed number of levels you want to look at? – Karl Knechtel Aug 06 '21 at 04:15
  • Thanks for the suggestion, yes, I tried that approach, please see the updated question. The second loop was not so exciting to me though. – Dr. Strangelove Aug 06 '21 at 04:30
    @KarlKnechtel The problem with `os.listdir` is inefficiency, since for each record it returns you would have to check if it is a directory by making a separate system call of `os.path.isdir`, which is dramatically slow when dealing with large directories. `os.scandir` would be a better choice for this approach. – blhsing Aug 06 '21 at 04:38
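As the last comment suggests, `os.scandir` avoids the extra `stat` call per entry. A minimal sketch of that approach (the helper name `two_levels` is just for illustration):

```python
import os

def two_levels(path):
    """Collect first-level and second-level directory names using os.scandir."""
    parents, children = [], []
    with os.scandir(path) as it:
        for entry in it:
            if entry.is_dir():
                parents.append(entry.name)
                # entry.path is the already-joined path, so no os.path.join needed
                with os.scandir(entry.path) as sub:
                    children.extend(e.name for e in sub if e.is_dir())
    return parents, children
```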

1 Answer


You can clear the `dirs` list yielded by `os.walk` to prevent it from traversing deeper once it has reached your desired depth:

parents = []
children = []
for root, dirs, _ in os.walk(path, topdown=True):
    if root == path:
        continue
    parents.append(os.path.basename(root))  # keep just the directory name
    children.extend(dirs)
    dirs.clear()  # prevent os.walk from descending any further
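A quick check of this pruning approach against the sample tree from the question (rebuilt in a temporary directory purely for illustration):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as path:
    # Recreate the sample layout from the question
    os.makedirs(os.path.join(path, "parent_1", "child_1", "sub_child_1"))
    os.makedirs(os.path.join(path, "parent_2", "child_2"))

    parents, children = [], []
    for root, dirs, _ in os.walk(path, topdown=True):
        if root == path:
            continue
        parents.append(os.path.basename(root))  # name only, to match the question
        children.extend(dirs)
        dirs.clear()  # stop os.walk from descending into child_* directories

    print(sorted(parents))   # ['parent_1', 'parent_2']
    print(sorted(children))  # ['child_1', 'child_2']
```

Because `dirs` is cleared at the first level below `path`, `os.walk` never visits `sub_child_1` at all, which is what keeps it out of the result.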
blhsing