15

Possible Duplicate:
How to join two generators in Python?

Is there a way in Python to use os.walk to traverse multiple directories at once?

my_paths = []
path1 = '/path/to/directory/one/'
path2 = '/path/to/directory/two/'
for path, dirs, files in os.walk(path1, path2):
    my_paths.append(dirs)

The above example doesn't work (as os.walk only accepts one directory), but I was hoping for a more elegant solution rather than calling os.walk twice (plus then I can sort it all at once). Thanks.

John P

4 Answers

32

To treat multiple iterables as one, use itertools.chain:

import os
from itertools import chain

paths = ('/path/to/directory/one/', '/path/to/directory/two/', 'etc.', 'etc.')
my_paths = []
for path, dirs, files in chain.from_iterable(os.walk(p) for p in paths):
    my_paths.append(dirs)  # loop body taken from the question
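Since the question also mentions sorting everything at once, here is a minimal sketch of how the chained walk could feed a single sort. The helper names walk_all and sorted_subdirs are made up for illustration, and the temporary directories only stand in for the real paths.

```python
import os
import tempfile
from itertools import chain


def walk_all(roots):
    """Yield os.walk() tuples for every root, one root after another."""
    return chain.from_iterable(os.walk(root) for root in roots)


def sorted_subdirs(roots):
    """Return the full path of every subdirectory under the roots, sorted once."""
    return sorted(
        os.path.join(path, d)
        for path, dirs, files in walk_all(roots)
        for d in dirs
    )


if __name__ == "__main__":
    # Throwaway directories stand in for the question's two paths.
    with tempfile.TemporaryDirectory() as one, tempfile.TemporaryDirectory() as two:
        os.makedirs(os.path.join(one, "b"))
        os.makedirs(os.path.join(two, "a", "c"))
        for full_path in sorted_subdirs([one, two]):
            print(full_path)
```

Because chain is lazy, nothing is walked until sorted() starts consuming the tuples, so the two trees are still traversed one after the other, not in parallel.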
agf
6

Use itertools.chain().

for path, dirs, files in itertools.chain(os.walk(path1), os.walk(path2)):
    my_paths.append(dirs)
David Heffernan
5

Others have mentioned itertools.chain.

There's also the option of just nesting one level more:

my_paths = []
for p in ['/path/to/directory/one/', '/path/to/directory/two/']:
    for path, dirs, files in os.walk(p):
        my_paths.append(dirs)
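If the nested loop is needed in more than one place, it could be wrapped in a small generator so the calling code keeps a single for loop. This is only a sketch: walk_many is a hypothetical name, and yield from needs Python 3.3 or later.

```python
import os


def walk_many(roots):
    """Walk each root in turn, yielding the same tuples os.walk() does."""
    for root in roots:
        # yield from (Python 3.3+) re-yields every tuple from the inner walk.
        yield from os.walk(root)


if __name__ == "__main__":
    # The question's placeholder paths; os.walk yields nothing for a
    # directory that does not exist, so this runs even if they are missing.
    roots = ['/path/to/directory/one/', '/path/to/directory/two/']
    my_paths = [dirs for path, dirs, files in walk_many(roots)]
    print(my_paths)
```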
Steven Rumbalski
1

Since nobody has mentioned it, in this or the other referenced post, there is also multiprocessing:

http://docs.python.org/library/multiprocessing.html

>>> from multiprocessing import Pool
>>> p = Pool(5)
>>> def f(x):
...     return x*x
...
>>> p.map(f, [1,2,3])

In this case, you'd have a list of directories. The call to map would return one list of lists per directory; you could then choose to flatten it or keep your results clustered:

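import os
from multiprocessing import Pool

def t(p):
    # Collect the dirs list from every directory os.walk visits under p.
    my_paths = []
    for path, dirs, files in os.walk(p):
        my_paths.append(dirs)
    return my_paths  # without this, map() would only return None values

if __name__ == '__main__':  # guard needed where worker processes are spawned
    paths = ['p1', 'p2', 'etc']
    with Pool(len(paths)) as p:
        dirs = p.map(t, paths)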
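As for flattening, here is a rough sketch using itertools.chain. It assumes t returns its my_paths list, and the nested data below is made up purely to show the shape that map would hand back in that case.

```python
from itertools import chain

# Hypothetical result of p.map(t, paths): one entry per root, each holding
# the dirs list from every directory os.walk visited under that root.
per_root = [
    [['a', 'b'], [], ['c'], []],   # made-up output for the first root
    [['d'], []],                   # made-up output for the second root
]

# One level of flattening merges the roots but keeps each walk step's list.
per_step = list(chain.from_iterable(per_root))

# A second level yields a flat list of directory names.
names = list(chain.from_iterable(per_step))

print(per_step)
print(names)
```

Keeping per_root as it is preserves the clustering by starting directory; the flat list is the form to use if everything should be sorted together.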
pyInTheSky
  • He doesn't mean "at once" as in "at the same time" but as in "as a set" or "as a unit", so your answer doesn't really address his question. – agf Sep 28 '11 at 21:01
  • 1
    I believe it does both, right? Not only do you get back your search along multiple paths as a list, which is what everyone's chain() suggestion does, but this has the added benefit of doing each of these searches in a separate process. What if these are paths to unique drives? If that's the case, you get even better results using this method, since you are searching multiple drives simultaneously. – pyInTheSky Sep 28 '11 at 21:12