3

I am trying to automate locating a sub-directory that one of my scripts requires. The idea is to have the script search the C: drive for a folder with a specific name. To my mind, this calls for a recursive search function: check all sub-directories and, if none of them is the desired directory, search the sub-directories of each of those in turn.

While researching how to do this, I came across this question and started using os.walk(dir).next()[1] to list directories. This had limited success: after searching a few directories, the script would stop with a StopIteration error. Sample output from a search for a sub-directory within TEST1 is below.

C:\Python27>test.py
curDir:  C:\Python27
['DLLs', 'Doc', 'include', 'Lib', 'libs', 'pyinstaller-2.0', 'Scripts', 'tcl', 'TEST1',     'Tools']
curDir:  DLLs
[]
curDir:  Doc
[]
curDir:  include
[]
curDir:  Lib
['bsddb', 'compiler', 'ctypes', 'curses', 'distutils', 'email', 'encodings', 'hotshot',     
'idlelib', 'importlib', 'json', 'lib-tk', 'lib2to3', 'logging', 'msilib', 
'multiprocessing', 'pydoc_data', 'site-packages', 'sqlite3', 'test', 'unittest', 'wsgiref', 'xml']
curDir:  bsddb
Traceback (most recent call last):
  File "C:\Python27\test.py", line 24, in <module>
    if __name__ == "__main__": main()
  File "C:\Python27\test.py", line 21, in main
    path = searcher(os.getcwd())
  File "C:\Python27\test.py", line 17, in searcher
    path = searcher(entry)
  File "C:\Python27\test.py", line 17, in searcher
    path = searcher(entry)
  File "C:\Python27\test.py", line 6, in searcher
    dirList = os.walk(dir).next()[1]
StopIteration

curDir is the current directory being searched, and the next line of output is its list of sub-directories. Once the program finds a directory with no sub-directories, it goes back up one level and moves on to the next directory.

I can provide my code if required, but didn't want to initially post it to avoid an even bigger wall of text.

My question is: why does the script give up after searching a few folders? Thanks in advance for your help!

wnnmaw

2 Answers

5

StopIteration is raised whenever an iterator has no more values to generate.

Why are you using os.walk(dir).next()[1]? Wouldn't it be easier to just do everything in a for loop? Like:

import os

for root, dirs, files in os.walk(mydir):
    # dirs here should be equivalent to dirList
    print(root, dirs)

Here is the documentation for os.walk.
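As a rough illustration of how the for loop can replace the hand-written recursion, here is a minimal sketch. The function name find_dir and its parameters are hypothetical and not taken from the asker's script; it returns the first match that os.walk happens to reach.

```python
import os


def find_dir(start, target):
    """Return the full path of the first directory named `target` under `start`, or None."""
    # os.walk descends into every sub-directory itself, so no explicit recursion is needed.
    for root, dirs, files in os.walk(start):
        if target in dirs:
            # root is the path of the directory currently being listed.
            return os.path.join(root, target)
    return None
```

It could be called as find_dir("C:\\", "TEST1"). Because the result is built from root, it is a path that is valid from the start directory regardless of the current working directory.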

mr2ert
  • Yes, that is much easier, brilliant! Thanks! – wnnmaw Aug 06 '13 at 16:01
  • @wnnmaw I added a link to the documentation for `os.walk`. It is a pretty powerful command. – mr2ert Aug 06 '13 at 16:08
  • Using the for loop works perfectly, but very slowly. Searching through a 600 GB drive takes about 10 minutes. Is there a faster way to invoke os.walk, or a substitute command which will do it faster? – wnnmaw Aug 07 '13 at 15:34
  • I'm not sure. Ideally you will want to walk the entire drive once, create an index/database of the directories, and then query the database. Similar to the `locate` command in bash. – mr2ert Aug 07 '13 at 17:16
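The "walk once, then query" idea from the last comment might look something like the sketch below. It assumes an in-memory dictionary as a stand-in for the index/database; the name build_dir_index is hypothetical, and persisting the index between runs is left out.

```python
import os
from collections import defaultdict


def build_dir_index(start):
    """Walk `start` once and map each directory name to every full path where it occurs."""
    index = defaultdict(list)
    for root, dirs, files in os.walk(start):
        for name in dirs:
            index[name].append(os.path.join(root, name))
    return index
```

After a single slow pass such as index = build_dir_index("C:\\"), each later lookup like index.get("TEST1", []) is a dictionary access rather than another walk of the drive. The index goes stale when directories are created or removed, so it would need rebuilding from time to time.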
1

What worked for me is specifying the full path in os.walk, rather than just the directory name:

import os

# Full path of the directory whose subfolders are to be listed (Mydir)
fullpath = os.path.join(os.path.dirname(__file__), 'Mydir')

# List the immediate subfolders; next(...) works in both Python 2.6+ and 3
subfolders = next(os.walk(fullpath))[1]

This happened to me when the module that calls os.walk was located in a subfolder and imported by a script in the parent folder.

Parent/
    script
    Folder/
        module
        Mydir/
            Subfolder1
            Subfolder2

When script is run from Parent, os.walk('Mydir') is resolved relative to the current working directory, so it looks in Parent/Mydir, which does not exist.

On the other hand, os.walk(fullpath) will look in Parent/Folder/Mydir.
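To see why a wrong path surfaces as StopIteration rather than as a "not found" error: by default os.walk ignores errors from listing a directory, so a path that does not exist simply yields nothing, and next() on that empty generator raises StopIteration. This looks consistent with the traceback in the question, where a bare name such as bsddb would be looked up relative to the working directory, though the asker's code is not shown. A minimal sketch, with list_subdirs as a hypothetical helper name:

```python
import os


def list_subdirs(path):
    """Return the immediate sub-directory names of `path`, or [] if it cannot be walked."""
    # next() with a default avoids StopIteration when os.walk yields nothing,
    # which is what happens for a path that does not exist.
    root, dirs, files = next(os.walk(path), (path, [], []))
    return dirs
```

Returning an empty list hides the difference between "no sub-directories" and "path not found", so checking os.path.isdir(path) first may be preferable when that distinction matters.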

Daniel