11

In Unix all disks are exposed as paths in the main filesystem, so os.walk('/') would traverse, for example, /media/cdrom as well as the primary hard disk, and that is undesirable for some applications.

How do I get an os.walk that stays on a single device?

joeforker
  • http://stackoverflow.com/questions/530645/is-there-a-way-to-determine-if-a-subdirectory-is-in-the-same-filesystem-from-pyth/530692#530692 – sykora Feb 23 '09 at 14:44

3 Answers

19

From os.walk docs:

When topdown is true, the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search

So something like this should work:

import os

for root, dirnames, filenames in os.walk(...):
    # Prune in place so os.walk() does not descend into mount points.
    dirnames[:] = [
        d for d in dirnames
        if not os.path.ismount(os.path.join(root, d))]
    ...
Constantin
  • a brilliantly concise answer. – Matt Joiner Oct 31 '09 at 01:57
  • As noted on the other comment: You can fool this with symlinks deeper into another filesystem. A better approach is to save the st_dev of the initial path (e.g. `dev0 = os.stat(startpath).st_dev`) and filter as `dirnames[:] = [d for d in dirnames if os.stat(os.path.join(root,d)).st_dev == dev0]` – Justin Winokur Mar 08 '22 at 14:29
3

I think os.path.ismount might work for you. Your code might look something like this:

import os

for root, dirs, files in os.walk('/'):
    # Handle files.
    # Slice assignment prunes dirs in place, so os.walk() skips mount points.
    dirs[:] = filter(lambda d: not os.path.ismount(os.path.join(root, d)),
                     dirs)

You may also find this answer helpful in building your solution.

*Thanks for the comments on filtering dirs correctly.
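
To see why the in-place assignment matters, here is a minimal sketch that walks a throwaway directory tree twice: once pruning with `dirs[:] = ...` and once rebinding with `dirs = ...`. The helper name `visited` and the directory names `keep`, `skip` and `inner` are made up for the illustration; they are not part of the original answer.

```python
import os
import tempfile

def visited(top, in_place):
    """Return the directories os.walk visits when 'skip' is filtered out."""
    seen = []
    for root, dirs, files in os.walk(top):
        seen.append(os.path.relpath(root, top))
        keep = [d for d in dirs if d != "skip"]
        if in_place:
            dirs[:] = keep   # mutates the list os.walk holds, so 'skip' is pruned
        else:
            dirs = keep      # only rebinds the local name, so nothing is pruned
    return sorted(seen)

with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "keep"))
    os.makedirs(os.path.join(tmp, "skip", "inner"))
    pruned = visited(tmp, in_place=True)
    unpruned = visited(tmp, in_place=False)
```

With slice assignment the walk never enters `skip`; with plain assignment it still visits `skip` and `skip/inner`, because os.walk keeps using its own list object.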

zweiterlinde
  • You can fix this by changing your code to "dirs[:] = filter(...)" instead, mutating the list in-place rather than reassigning – Brian Feb 23 '09 at 14:59
  • You can fool this with symlinks deeper into another filesystem. A better approach is to save the st_dev of the initial path (e.g. `dev0 = os.stat(startpath).st_dev`) and filter as `dirs[:] = [d for d in dirs if os.stat(os.path.join(root,d)).st_dev == dev0]` – Justin Winokur Mar 08 '22 at 14:28
1

os.walk() can't tell (as far as I know) that it has crossed onto a different drive. You will need to check that yourself.

Try using os.stat() and comparing each directory's st_dev with that of the starting path, or check that the root variable from os.walk() is not under /media.
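
As a rough sketch of the os.stat() idea, the generator below wraps os.walk() and drops any subdirectory whose device ID differs from that of the starting directory. The name `walk_same_device` is hypothetical, and using os.lstat() (which inspects a symlink itself rather than its target) is a choice made for this example, not something the answer specifies.

```python
import os

def walk_same_device(top):
    """Yield (root, dirs, files) like os.walk, without leaving top's device."""
    top_dev = os.stat(top).st_dev  # device ID of the starting directory
    for root, dirs, files in os.walk(top):
        kept = []
        for name in dirs:
            path = os.path.join(root, name)
            try:
                # lstat does not follow symlinks, so a link is judged by
                # where the link itself lives, not by its target.
                if os.lstat(path).st_dev == top_dev:
                    kept.append(name)
            except OSError:
                pass  # entry vanished or is unreadable: skip it
        dirs[:] = kept  # prune in place so os.walk does not descend further
        yield root, dirs, files
```

It can be used as a drop-in replacement for a top-down walk, e.g. `for root, dirs, files in walk_same_device('/'):`. Callers can still prune `dirs` further, since the same list object is handed through.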

Lesmana
Jason Coon