
I am looking to recursively search through a folder containing many sub-folders. Some sub-folders contain a specific folder that I want to loop through.

I am familiar with the glob.glob method to find specific files:

import glob, os
from os.path import isfile

os.chdir(pathname)  # change directory to path of choice
# after chdir, the names returned by glob are relative to the current directory
files = [f for f in glob.glob("filename.filetype") if isfile(f)]

Some sub-folders in the directory are named with a timestamp (YYYYMMDD), and all of them contain identically named files. Some of those timestamp sub-folders also contain a folder with a particular name; let's call it "A". I want to write code that recursively searches for the folder called "A" inside these timestamp-named sub-folders. Is there a way to use glob.glob to find these specific sub-folders within a directory?

I am aware of a similar question: How can I search sub-folders using glob.glob module in Python?

but this person seems to be looking for specific files, whereas I am looking for pathnames.

Dax Feliz
  • How would you define `specific sub-folders` – ZdaR Mar 15 '17 at 03:30
  • Possible duplicate of [How can I search sub-folders using glob.glob module in Python?](http://stackoverflow.com/questions/14798220/how-can-i-search-sub-folders-using-glob-glob-module-in-python) – ZdaR Mar 15 '17 at 03:33
  • Sorry about that, I should have been more clear. Some sub folders in a directory have a time stamp (YYYYMMDD) as their names all containing identical file names. Some of those sub folders contain folders within them with a name, let's call it "A". I'm hoping to create a code that will recursively search for the folder called "A" within these "specific sub folders". – Dax Feliz Mar 15 '17 at 03:33
  • Then you need to try `os.walk()` – ZdaR Mar 15 '17 at 03:36
  • Are these folders always at the same level? Suppose it starts 3 subdirs down, you could do `glob('*/*/*/*[12][0-9][1-9][0-9][01][0-9][0123][0-9]*/A/')` – tdelaney Mar 15 '17 at 04:29
  • Could you edit the question to show exactly what a typical directory structure looks like. – Martin Evans Mar 15 '17 at 08:09
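
The fixed-depth glob suggested in the comments can be loosened so the depth does not need to be known in advance. The following is only a sketch, assuming Python 3.5+ (for `recursive=True`) and assuming the timestamp folders are named with exactly eight digits. The helper name `glob_target_dirs` and the sample tree are made up for illustration.

```python
import glob
import os
import tempfile

# Character-class approximation of YYYYMMDD; glob patterns have no \d or {n}.
STAMP = "[12][0-9][0-9][0-9][01][0-9][0-3][0-9]"

def glob_target_dirs(top, target="A"):
    """Hypothetical helper: find `target` directories directly inside timestamp folders."""
    # "**" with recursive=True matches zero or more directory levels.
    # The trailing empty component adds a path separator, so only directories match.
    pattern = os.path.join(glob.escape(top), "**", STAMP, target, "")
    return sorted(p.rstrip(os.sep) for p in glob.glob(pattern, recursive=True))

# Throwaway tree to try the pattern on (all names are invented).
with tempfile.TemporaryDirectory() as top:
    for rel in ("x/20170301/A", "x/20170302/B", "y/z/20170303/A", "notes/A", "x/20170305"):
        os.makedirs(os.path.join(top, rel))
    # A regular file named "A" should not be reported.
    open(os.path.join(top, "x", "20170305", "A"), "w").close()
    hits = [os.path.relpath(p, top) for p in glob_target_dirs(top)]
    print(hits)
```

Note that `**` skips hidden directories (names starting with a dot), and that a glob character class cannot validate the date itself, only its shape.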

1 Answer


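You can use os.walk, which walks the directory tree top-down. Each iteration yields a directory's path together with the names of its immediate subdirectories and files, so the test is simple: check that the directory's name looks like a timestamp and that 'A' is among its subdirectories.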

import os
import re

# regular expression to match YYYYMMDD timestamps (but not embedded in
# other numbers like 2201703011): the lookarounds reject adjacent digits.
timestamp_check = re.compile(r"(?<!\d)[12]\d{3}[01]\d[0123]\d(?!\d)").search

# Option 1: Stop searching a subtree if pattern is found
A_list = []
for root, dirs, files in os.walk(pathname):
    if timestamp_check(os.path.basename(root)) and 'A' in dirs:
        A_list.append(os.path.join(root, 'A'))
        # inplace modification of `dirs` trims subtree search
        del dirs[:]

# Option 2: Search entire tree, even if matches found
A_list = [os.path.join(root, 'A') 
    for root, dirs, files in os.walk(pathname) 
    if timestamp_check(os.path.basename(root)) and 'A' in dirs]
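
To try Option 1 without touching real data, here is a self-contained sketch that builds a temporary tree and runs the same pruning walk over it. It assumes the folder name is exactly the timestamp, so it uses `fullmatch` instead of `search`. The function name `find_target_dirs` and the sample folder names are hypothetical.

```python
import os
import re
import tempfile

# Whole-name match for YYYYMMDD; fullmatch rejects names such as "2201703011".
is_timestamp = re.compile(r"[12]\d{3}[01]\d[0-3]\d").fullmatch

def find_target_dirs(top, target="A"):
    """Return paths of `target` folders that sit directly inside a timestamp-named folder."""
    found = []
    for root, dirs, _files in os.walk(top):
        if is_timestamp(os.path.basename(root)) and target in dirs:
            found.append(os.path.join(root, target))
            dirs[:] = []  # in-place edit: os.walk will not descend below a match
    return sorted(found)

# Build a small throwaway tree (all names are invented for the demo).
with tempfile.TemporaryDirectory() as top:
    for rel in ("x/20170301/A", "x/20170302/B", "y/z/20170303/A/20170304/A", "notes/A"):
        os.makedirs(os.path.join(top, rel))
    results = [os.path.relpath(p, top) for p in find_target_dirs(top)]
    print(results)
```

Because the subtree is pruned at each match, the nested `20170304/A` under `y/z/20170303/A` is not reported. Dropping the `dirs[:] = []` line gives the Option 2 behaviour.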
tdelaney
  • This is effectively what the nominated duplicate question's accepted answer amounts to. – tripleee Mar 15 '17 at 05:29
  • @tripleee - they are similar, but OP has more requirements than "*.txt". This answer shows how to filter subdirectories and how to use a regular expression for a more exact match than you can get from `fnmatch`. – tdelaney Mar 15 '17 at 15:34