
I'm looking for a way to search only specific subdirectories in Python.

For instance a directory structure like this:

some_files/
     common/
     2009/
     2010/
     2011/
     ...

I only want to search in the subdirectories whose names start with a 2, so the pattern would be something like 'some_files/2*'. I think it must be possible using glob.glob and os.walk(), but I can't get it to work.

Currently I use:

files = [os.path.join(dirpath, f)
         for dirpath, dirnames, filenames in os.walk(d)
         for f in filenames if f.endswith(ext)]

but this searches every subdirectory, so it doesn't fit my needs.

Can someone help me out? It would be much appreciated!

smci
Juvawa
  • What's wrong with something like `and os.path.join('some_files', '2') in dirpath`? – TigerhawkT3 Jun 16 '15 at 21:37
  • The tool I'm developing looks for specific files in user-specified places. These places are specified in a config file. So users can say "you may look in this folder (some_files/)", and it then searches all the subdirectories as well. I want to give the user an option to search only specific subdirectories, with a statement like some_files/2*. If I understand correctly, the option you suggest won't support that. Please correct me if I'm wrong :) – Juvawa Jun 16 '15 at 21:42
  • Why wouldn't it work? It just checks whether the specified string is found in the path string. – TigerhawkT3 Jun 16 '15 at 21:43
  • What do you want to do with files contained in directories such as 'some_files/2011/foo' and 'some_files/bar/2011'? – zehnpaard Jun 16 '15 at 22:30
  • You meant *"(recursively) expand and search (wildcarded) pattern of specific subdirectories"*. That's a lot more than just saying *"specific subdirectories"*. I edited the title for clarity. – smci Jun 16 '15 at 23:36
  • 'some_files/2011/foo' must be searched as well; 'some_files/bar/2011' should not be searched, since 'bar' doesn't start with a 2. Thanks, smci, for the new title, that indeed covers it better! – Juvawa Jun 17 '15 at 07:33

2 Answers

3

I would do it like this using pathlib (part of the Python 3 standard library since version 3.4):

from pathlib import Path

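# Path() is the current directory; use Path("some_files") to start elsewhere.
for subpath in Path().glob("2*"):
    for file in subpath.glob("*.ext"):
        print(file)  # placeholder: process each matching file here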

Update: pathlib is also available for Python 2.x (it was back-ported and published to the Python Package Index). Simply run:

$ pip install pathlib
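
Note that glob("*.ext") in the loop above only looks at files directly inside each matching directory. Since the comments on the question say that nested folders such as 'some_files/2011/foo' should be searched too, here is a minimal sketch of a recursive variant using rglob. The function name find_files and its parameters are hypothetical, chosen only for illustration:

```python
from pathlib import Path


def find_files(root, dir_pattern="2*", ext=".py"):
    """Return files ending in `ext` below the subdirectories of `root`
    whose names match `dir_pattern` (hypothetical helper, not from the answer)."""
    matches = []
    for subdir in sorted(Path(root).glob(dir_pattern)):
        if not subdir.is_dir():
            continue  # skip plain files whose names happen to match the pattern
        # rglob descends into nested directories such as 2011/foo
        matches.extend(sorted(p for p in subdir.rglob("*" + ext) if p.is_file()))
    return matches
```

Calling find_files("some_files") would then skip 'some_files/common' and 'some_files/bar/2011', because only the first level below the root is matched against the pattern.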
James Mills
2

You can use glob with dirpath to find matching directories:

from glob import iglob
import os

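path = "some_files"  # root directory to walk (taken from the question's example)
ext = "py"
files = []
for dirpath, dirnames, filenames in os.walk(path):
    # Look at every entry directly under dirpath whose name starts with "2".
    for match in iglob(os.path.join(dirpath, "2*")):
        # Collect the files with the wanted extension directly inside it.
        files.extend(iglob(os.path.join(match, "*.{}".format(ext))))
print(files)

Or, if you really want a list comprehension:

files = [f for dirpath, dirnames, filenames in os.walk(path)
         for match in iglob(os.path.join(dirpath, "2*"))
         for f in iglob(os.path.join(match, "*.{}".format(ext)))]
print(files)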
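
If the pattern comes from a config file as a single string such as 'some_files/2*', another option is to let glob expand the pattern first and then run os.walk on each directory it returns. This is only a sketch under that assumption; walk_matching is a hypothetical name, not something from the question or from either library:

```python
import glob
import os


def walk_matching(pattern, ext):
    """Expand `pattern` with glob, then walk each matching directory recursively
    and return the paths of files ending in `ext` (hypothetical helper)."""
    found = []
    for top in sorted(glob.glob(pattern)):
        if not os.path.isdir(top):
            continue  # the pattern may also match plain files; ignore those
        for dirpath, dirnames, filenames in os.walk(top):
            dirnames.sort()  # in-place sort keeps the traversal order deterministic
            for name in sorted(filenames):
                if name.endswith(ext):
                    found.append(os.path.join(dirpath, name))
    return found
```

With this approach the wildcard only restricts where the walk starts, so 'some_files/2011/foo' is searched while 'some_files/bar/2011' is not.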
Padraic Cunningham