51

I need to list all files inside a folder, each with its containing directory path. I tried to use os.walk, which obviously would be the perfect solution.

However, it also lists hidden folders and files. I'd like my application not to list any hidden folders or files. Is there any flag you can use to make it not yield any hidden files?

Cross-platform support is not really important to me; it's OK if it only works on Linux (the .* pattern).

Martijn Pieters
unddoch

4 Answers

114

No, there is no option to os.walk() that'll skip those. You'll need to do so yourself (which is easy enough):

for root, dirs, files in os.walk(path):
    files = [f for f in files if not f[0] == '.']
    dirs[:] = [d for d in dirs if not d[0] == '.']
    # use files and dirs

Note the dirs[:] = slice assignment; os.walk recursively traverses the subdirectories listed in dirs. By replacing the elements of dirs with only those that satisfy a criterion (e.g., directories whose names don't begin with .), you ensure that os.walk() will not visit directories that fail to meet it.

This only works if you keep the topdown keyword argument set to True (the default). From the documentation of os.walk():

When topdown is True, the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search, impose a specific order of visiting, or even to inform walk() about directories the caller creates or renames before it resumes walk() again.
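Putting the pieces together, a minimal Python 3 sketch that yields the full path of every non-hidden file might look like the following. The helper name `iter_visible_files` is made up for illustration, and "hidden" is assumed to mean a name starting with a dot, as in the question.

```python
import os


def iter_visible_files(top):
    """Yield paths of files under `top`, skipping dot-prefixed names.

    `iter_visible_files` is a hypothetical helper, not part of `os`.
    """
    # topdown=True is the default; it is spelled out because pruning relies on it.
    for root, dirs, files in os.walk(top, topdown=True):
        # Slice assignment mutates the very list os.walk() holds,
        # so hidden directories are never descended into.
        dirs[:] = [d for d in dirs if not d.startswith('.')]
        for name in files:
            if not name.startswith('.'):
                yield os.path.join(root, name)
```

To display the results, loop over the generator and print each path, e.g. `for p in iter_visible_files(path): print(p)`. The starting directory itself is always walked, even if its own name begins with a dot, because only the names found below it are filtered.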

Minh Tran
Martijn Pieters
  • 1
    Thanks a lot, didn't know that you can modify the lists in place! – unddoch Nov 19 '12 at 12:55
  • I ran this but it did not print anything to the console. What is the typical method to display the files found this way? I added `print root, dirs, files` at the end but it came out very messy. – user5359531 Jun 17 '16 at 18:47
  • 1
    @user5359531: that depends entirely on your usecase; you could `print '\n'.join([os.path.join(root, f) for f in dirs + files])`, etc. – Martijn Pieters Jun 17 '16 at 19:09
  • 1
    I wonder, can `files = [f for f in files if not f[0] == '.']` be written as `files[:] = [f for f in files if not f[0] == '.']`, just like `dirs[:]`? – linrongbin Sep 18 '18 at 07:48
  • 4
    @linrongbin: you could, but there would be no advantage in that. `files = [...]` binds `files` to a new list, and `files[:] = [...]` replaces the elements in the list `files` is already bound to. No other code is using that list when `os.walk()` gives it to you. `dirs` on the other hand is used by `os.walk()` to find the next directories to go and produce files for, so if you did *not* use `dirs[:] = [...]` then directories that start with a `.` are still going to be visited. – Martijn Pieters Sep 18 '18 at 15:49
  • @MartijnPieters Cool, thanks, I'm not so familiar with python. – linrongbin Sep 19 '18 at 09:08
  • 1
    @linrongbin: also see https://nedbatchelder.com/text/names.html to understand how Python variables work; it'll help to understand how the `os.walk()` implementation and the `dirs[:] = [...]` assignment interact. – Martijn Pieters Sep 19 '18 at 10:34
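The difference between rebinding `dirs` and assigning to `dirs[:]`, as discussed in the comments above, can be checked with a small experiment. This is only a sketch: the helper `visited_dirs` and the throwaway directory layout are invented for the demonstration.

```python
import os
import tempfile


def visited_dirs(top, in_place):
    """Return the directories os.walk() visits, relative to `top`.

    `visited_dirs` is a hypothetical helper written for this demonstration.
    """
    seen = []
    for root, dirs, files in os.walk(top):
        seen.append(os.path.relpath(root, top))
        visible = [d for d in dirs if not d.startswith('.')]
        if in_place:
            dirs[:] = visible  # mutates the list os.walk() keeps using
        else:
            dirs = visible     # only rebinds the local name; os.walk() is unaffected
    return sorted(seen)


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, 'src'))
    os.makedirs(os.path.join(tmp, '.git', 'objects'))
    rebound = visited_dirs(tmp, in_place=False)
    pruned = visited_dirs(tmp, in_place=True)
    print(rebound)
    print(pruned)
```

With plain rebinding, the first list still contains `.git` and its `objects` subdirectory; with slice assignment, only the top directory and `src` are visited.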
13

I realize it wasn't asked in the question, but I had a similar problem where I wanted to exclude both hidden files and files beginning with __, specifically __pycache__ directories. I landed on this question because I was trying to figure out why my list comprehension was not doing what I expected. I was not modifying the list in place with dirnames[:].

I created a list of prefixes I wanted to exclude and modified the dirnames in place like so:

    exclude_prefixes = ('__', '.')  # exclusion prefixes
    for dirpath, dirnames, filenames in os.walk(node):
        # exclude all dirs starting with exclude_prefixes
        dirnames[:] = [dirname
                       for dirname in dirnames
                       if not dirname.startswith(exclude_prefixes)]
ideasman42
dmmfll
  • that is a great answer, works perfectly for excluding according to a list – jpw Feb 17 '16 at 01:08
  • 2
    FYI, `startswith` can also take a tuple of strings, so you can get rid of the inner for loop and just use `not dirname.startswith(exclude_prefixes)` https://docs.python.org/2/library/stdtypes.html#str.startswith (python 2.5 and up) – Daniel Rucci Nov 14 '16 at 19:32
2

My use-case was similar to that of OP, except I wanted to return a count of the total number of sub-directories inside a certain folder. In my case I wanted to omit any sub-directories named .git (as well as any folders that may be nested inside these .git folders).

In Python 3.6.7, I found that the accepted answer's approach didn't work -- it counted all .git folders and their sub-folders. Here's what did work for me:

num_local_subdir = 0
for root, dirs, files in os.walk(local_folder_path):
    if '.git' in dirs:
        dirs.remove('.git')
    num_local_subdir += len(dirs)
James Dellinger
0

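Another approach uses the any and map functions to detect hidden folders:

for root, dirs, files in os.walk(path):
    # Skip this iteration when any immediate subdirectory is hidden.
    # dirs is not modified here, so os.walk() still descends into every subdirectory.
    if any(map(lambda p: p[0] == '.', dirs)):
        continue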