I have a list of directory names:

dirnames = ["foo/bar", "foo/bar/mydir", "bar/mydir", "bar"]

I would like to (recursively) create all directories. What suffices is to recursively create the deepest directories only. In this case, it is sufficient to:

os.makedirs("foo/bar/mydir")
os.makedirs("bar/mydir")

The question: How do I reduce dirnames so that it contains only the deepest directories?

Tom de Geus
  • what have you tried so far? – deadshot Feb 15 '21 at 16:13
  • A simple for loop has O(n) complexity to create all directories. Shortening to only the deepest directories will actually cost you time (in the big-O sense) if the algorithm to do so is worse than O(n). – Shane Bishop Feb 15 '21 at 16:15
  • `sorted([(len(i.split('/')), i) for i in dirnames])[-2:]` will give top 2 deepest dirnames – Epsi95 Feb 15 '21 at 16:15
  • @deadshot I did not think the question would have improved with my failed attempts. I started writing things that were way too complicated, and, it turns out in the wrong direction, see the nice answer below. In hindsight, the question maybe should have been shortened by leaving out the context. – Tom de Geus Feb 15 '21 at 16:59
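
The simple loop mentioned in the comments could look roughly like the sketch below. It assumes `exist_ok=True` is acceptable, so that a parent directory already created by a deeper path does not raise an error. The helper name `create_all` and the temporary root directory are illustrative only and not part of the question.

```python
import os
import tempfile


def create_all(root, dirnames):
    """Create every directory in dirnames below root (hypothetical helper)."""
    for name in dirnames:
        # exist_ok=True makes the call a no-op if the directory already exists,
        # e.g. "foo/bar" after "foo/bar/mydir" has been created.
        os.makedirs(os.path.join(root, name), exist_ok=True)


dirnames = ["foo/bar", "foo/bar/mydir", "bar/mydir", "bar"]

# A temporary root is used here only so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as root:
    create_all(root, dirnames)
    print(all(os.path.isdir(os.path.join(root, d)) for d in dirnames))
```

With this approach the order of the list does not matter, and no reduction to the deepest directories is needed.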

2 Answers

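from itertools import groupby
import heapq

dirnames = ["foo/bar", "foo/bar/mydir", "bar/mydir", "bar", "foo/bar/dir2"]
# Group the sorted paths by their top-level directory ("foo" or "bar").
d = [sorted(g) for k, g in groupby(sorted(dirnames), key=lambda x: x.split('/')[0])]
# Take the lexicographically largest path of each group.
[max(elem) for elem in d]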

Or, if you need the two largest entries of each group:

[heapq.nlargest(2, elem) for elem in d]

This uses groupby on the top-level directory, i.e. foo or bar. With max we only get foo/bar/mydir for the foo group, because strings compare lexicographically and it sorts after foo/bar/dir2. To avoid this we can use heapq.nlargest, but that is overkill. An additional if clause can be used to check whether the inner list has a length greater than 2.
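
As a sketch of a different technique (not the groupby approach above), one could sort the paths by their components and keep only those that are not a parent of the path that follows. The function name `deepest_dirs` is hypothetical, and the sketch assumes "/" as the separator and that the paths are already normalized (no "." or ".." segments).

```python
def deepest_dirs(dirnames):
    """Return the paths that are not a parent of any other path in dirnames."""
    # Sorting component tuples (rather than raw strings) guarantees that the
    # descendants of a path come directly after it.
    parts = sorted({tuple(p.rstrip("/").split("/")) for p in dirnames})
    leaves = []
    # Pair each path with its successor; the last path is paired with ().
    for cur, nxt in zip(parts, parts[1:] + [()]):
        # If the successor does not start with cur, cur has no descendants.
        if nxt[:len(cur)] != cur:
            leaves.append("/".join(cur))
    return leaves


dirnames = ["foo/bar", "foo/bar/mydir", "bar/mydir", "bar", "foo/bar/dir2"]
print(deepest_dirs(dirnames))
# ['bar/mydir', 'foo/bar/dir2', 'foo/bar/mydir']
```

Comparing component tuples instead of raw strings avoids treating foo as a leaf when a sibling such as foo-bar sorts between foo and foo/bar.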

Ajay

While others have posted very useful iterative solutions, here is a recursive way to select only the deepest directories:

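from collections import defaultdict

def to_tree(d):
    # Build a nested dict from lists of path components; leaves map to None.
    t = defaultdict(list)
    for a, *b in d:
        t[a].append(b)  # group the remaining components under the first one
    # Note: the := operator requires Python 3.8 or later.
    return {a: None if not (k := list(filter(None, b))) else to_tree(k) for a, b in t.items()}

def get_paths(d, c=[]):
    # Yield the full path of every leaf; c holds the parent components
    # and is never mutated, so the default list is safe here.
    for a, b in d.items():
        if b is None:
            yield '/'.join(c + [a])
        else:
            yield from get_paths(b, c + [a])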


dirnames = ["foo/bar", "foo/bar/mydir", "bar/mydir", "bar"]
print(list(get_paths(to_tree([i.split('/') for i in dirnames]))))

Output:

['foo/bar/mydir', 'bar/mydir']
Ajax1234