I'm using os.walk to run through the directory "foo". I want to process .dat files, but how do I check the directory name and process only the files under one specific directory?

If the directory is "bar", process its .dat files. Do not process "notbar". I'm probably missing something simple.

 C:\data\foo
       - notbar
           -123
             -file1.dat
           -456
             -file2.dat
             -file3.dat
       - bar
           -123
             -file1.dat
           -456
             -file2.dat
             -file3.dat

This finds all .dat files, regardless of directory:

    for (root, dirnames, filenames) in os.walk(base_path):
        print('Found directory: {0}'.format(root))
        for filename in filenames:
            if filename.endswith(".dat"):
                print(filename)

rdebruyn

2 Answers

glob is really good for this. It returns all the files that match a certain pattern.

There is a reference for the patterns, but the most useful are:

  • * matches everything except path separators (\ on Windows, / on macOS and Linux)
  • ** matches zero or more directories, but only when recursive=True is passed to glob
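To make the difference between the two wildcards concrete, here is a minimal sketch that builds a throwaway tree mirroring the layout in the question. The helper name `build_tree` and the use of a temporary directory are assumptions for illustration only, not part of the original setup.

```python
import os
import tempfile
from glob import glob


def build_tree(base):
    """Create <base>/{bar,notbar}/{123,456}/*.dat (hypothetical helper)."""
    layout = {"123": ["file1.dat"], "456": ["file2.dat", "file3.dat"]}
    for top in ("bar", "notbar"):
        for sub, names in layout.items():
            folder = os.path.join(base, top, sub)
            os.makedirs(folder)
            for name in names:
                # Create an empty placeholder file.
                open(os.path.join(folder, name), "w").close()


with tempfile.TemporaryDirectory() as base_path:
    build_tree(base_path)
    # "*" stays within one path component: exactly one level below "bar".
    one_level = glob(os.path.join(base_path, "bar", "*", "*.dat"))
    # "**" spans any number of directory levels, but only with recursive=True.
    any_depth = glob(os.path.join(base_path, "**", "*.dat"), recursive=True)
    one_level_names = sorted(os.path.relpath(p, base_path) for p in one_level)

print(one_level_names)  # the three files under bar
print(len(any_depth))   # all six .dat files, bar and notbar alike
```

The first pattern is the one that answers the question, since it never descends into notbar.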

In your example, you want to find the .dat (*.dat) files in any sub-directory (*) of a sub-directory (bar) inside a base path base_path. To get these files we can write

from glob import glob

filenames = glob(base_path + "\\bar\\*\\*.dat")

For cross-platform code, it is better to build the pattern with os.path.join:

import os
from glob import glob

filenames = glob(os.path.join(base_path, "bar", "*", "*.dat"))

Check out the results here

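If bar is not necessarily an immediate sub-directory of base_path but may be nested further down, use ** together with recursive=True (without that argument, ** behaves like a plain *):

import os
from glob import glob

filenames = glob(os.path.join(base_path, "**", "bar", "*", "*.dat"), recursive=True)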

Finally, glob does not return the files in any guaranteed order. To get them in alphabetical order, use sorted(filenames). To get them ordered by modification time, oldest first, use sorted(filenames, key=os.path.getmtime), as per this answer.
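As a small illustration of the two orderings, the sketch below creates three empty files in a temporary directory and assigns them explicit modification times with os.utime. The file names and timestamps are made up for the example.

```python
import os
import tempfile
from glob import glob

with tempfile.TemporaryDirectory() as folder:
    # Create three empty files and give each an explicit modification time,
    # so that name order and time order differ.
    for offset, name in enumerate(["c.dat", "a.dat", "b.dat"]):
        path = os.path.join(folder, name)
        open(path, "w").close()
        os.utime(path, (1_000_000 + offset, 1_000_000 + offset))

    filenames = glob(os.path.join(folder, "*.dat"))
    # Alphabetical order.
    by_name = [os.path.basename(p) for p in sorted(filenames)]
    # Oldest modification time first.
    by_mtime = [os.path.basename(p) for p in sorted(filenames, key=os.path.getmtime)]

print(by_name)   # ['a.dat', 'b.dat', 'c.dat']
print(by_mtime)  # ['c.dat', 'a.dat', 'b.dat']
```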

geometrikal

As stated in the comment, one possible solution is to run a second os.walk inside the first. In detail:

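import os

# Build the path with os.path.join; ".\database" contains the invalid escape "\d".
databasePath = os.path.join(".", "database")

# Outer walk: find every directory named "myLabel" below databasePath.
for (root, dirs, files) in os.walk(databasePath):
    for dirname in dirs:
        if dirname == "myLabel":
            # Inner walk: list the files in that directory and everything below it.
            for (_root, _dirs, _files) in os.walk(os.path.join(root, dirname)):
                print(_files)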
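
The same filtering can also be done in a single pass. This is a sketch of an alternative, not the approach described above: it checks whether the wanted directory name appears among the path components of each root. The function name `dat_files_under` and the sample tree are assumptions for illustration.

```python
import os
import tempfile


def dat_files_under(base_path, wanted="bar"):
    """Return .dat paths located below a directory named `wanted` (hypothetical helper)."""
    matches = []
    for root, dirnames, filenames in os.walk(base_path):
        # Split the path relative to base_path into its directory components.
        parts = os.path.relpath(root, base_path).split(os.sep)
        if wanted not in parts:
            continue
        for filename in filenames:
            if filename.endswith(".dat"):
                matches.append(os.path.join(root, filename))
    return matches


with tempfile.TemporaryDirectory() as base:
    # Small sample tree: one .dat file under bar, one under notbar.
    for top in ("bar", "notbar"):
        folder = os.path.join(base, top, "123")
        os.makedirs(folder)
        open(os.path.join(folder, "file1.dat"), "w").close()
    found = [os.path.relpath(p, base) for p in dat_files_under(base)]

print(found)  # only the file below bar
```

Comparing whole path components avoids matching "notbar" by accident, which a substring test such as `"bar" in root` would do.
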
Jonny_92