The problem is to collect, in a list, the names of all files under a particular directory that meet a particular condition.

We have a directory named "test_dir".

In it there are three subdirectories, "sub_dir_1", "sub_dir_2" and "sub_dir_3", and each of them contains some files.

sub_dir_1 has files ['test.txt', 'test.wav']
sub_dir_2 has files ['test_2.txt', 'test.wav']
sub_dir_3 has files ['test_3.txt', 'test_3.tsv']

What I want in the end is a list of the "test.wav" files that exist under "directory": ['sub_dir_1/test.wav', 'sub_dir_2/test.wav']. In other words, the condition is to get the path of every 'test.wav' under the mother directory.

mother_dir_name = "directory"
get_test_wav(mother_dir_name)
returns --> ['sub_dir_1/test.wav', 'sub_dir_2/test.wav']

EDIT: I have changed the direction of the question.

We start with this list of file names:

["sub_dir_1/test.wav","sub_dir_2/test.wav","abc.csv","abc.json","sub_dir_3/test.json"]

From this list I would like to get a list without any path that contains "test.wav", like below:

["abc.csv","abc.json","sub_dir_3/test.json"]
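
For the edited version of the question, a minimal sketch of one possible filter: it compares the last component of each path with the unwanted name, so only files literally named test.wav are dropped. The helper name drop_named is made up for illustration, and the paths are assumed to use forward slashes as in the list above. If a plain substring match is really what is wanted, the condition could be written as "test.wav" not in p instead.

```python
import posixpath


def drop_named(paths, unwanted="test.wav"):
    """Return the paths whose final component is not `unwanted`."""
    # posixpath.basename splits on "/", matching the separators in the list
    return [p for p in paths if posixpath.basename(p) != unwanted]


files = ["sub_dir_1/test.wav", "sub_dir_2/test.wav", "abc.csv", "abc.json", "sub_dir_3/test.json"]
print(drop_named(files))  # ['abc.csv', 'abc.json', 'sub_dir_3/test.json']
```

Flipping != to == would keep only the test.wav paths instead of removing them.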
  • Does this answer your question? [Get a filtered list of files in a directory](https://stackoverflow.com/questions/2225564/get-a-filtered-list-of-files-in-a-directory) – mkrieger1 Jan 07 '21 at 17:02
  • `glob.glob("*/test.wav")`? – Barmar Jan 07 '21 at 17:03
  • Thanks, that worked. What if I want to filter a list such as ['sub_dir_1/test.wav','sub_dir_2/test.wav','sub_dir_1/test.txt','someother.file', 'file_4.tsv'] down to ['sub_dir_1/test.wav','sub_dir_2/test.wav']? – AI Downloader Jan 07 '21 at 17:16

3 Answers

Use os.walk():

import os

def get_test_wav(folder):
    found = []
    for root, folders, files in os.walk(folder):
        for file in files:
            if file == "test.wav":
                found.append(os.path.join(root, file))
    return found

Or a list comprehension approach:

import os

def get_test_wav(folder):
    found = [os.path.join(root, "test.wav") for root, _, files in os.walk(folder) if "test.wav" in files]
    return found
Red

You can use glob patterns for this. Using pathlib:

from pathlib import Path
mother_dir = Path("directory")
list(mother_dir.glob("sub_dir_*/*.wav"))

Notice that I was fairly specific about which subdirectories to check - anything starting with "sub_dir_". You can change that pattern as needed to fit your environment.
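
As a rough sketch of how the same idea could return exactly the relative paths shown in the question, the pattern below matches only files named test.wav, one level down, and strips the parent directory from each hit. The function name and its filename parameter are taken from the question or assumed for illustration, not an established API.

```python
from pathlib import Path


def get_test_wav(mother_dir_name, filename="test.wav"):
    """List `filename` in each immediate subdirectory, relative to the parent."""
    mother_dir = Path(mother_dir_name)
    # "*/" matches any immediate subdirectory of mother_dir
    matches = mother_dir.glob(f"*/{filename}")
    # as_posix() gives forward slashes on every platform
    return sorted(p.relative_to(mother_dir).as_posix() for p in matches)
```

With the layout from the question, get_test_wav("directory") should give the two test.wav paths under sub_dir_1 and sub_dir_2.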

tdelaney

I think this might help you: "How can I search sub-folders using glob.glob module?". The main way to build a list of the files in a folder (so you can use it later) is:

import glob
import os
file_path = os.path.join(motherdirectory, 'subdirectory')
list_files = glob.glob(os.path.join(file_path, "*.wav"))

Check that link to see how to cover all the subdirectories of a folder.

This will also give you every file in any subdirectory whose name ends in .wav, with paths relative to the mother directory:

os.chdir(motherdirectory)
glob.glob('**/*.wav', recursive=True)
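
Since os.chdir changes the working directory for the rest of the program, a variant that leaves it alone may be preferable. This is only a sketch: the function name find_wav_files is hypothetical, and the paths are made relative again with os.path.relpath.

```python
import glob
import os


def find_wav_files(motherdirectory):
    """Recursively list .wav files under motherdirectory, relative to it."""
    # glob.escape protects any wildcard characters in the directory name itself
    pattern = os.path.join(glob.escape(motherdirectory), "**", "*.wav")
    # recursive=True lets "**" match zero or more directory levels
    found = glob.glob(pattern, recursive=True)
    return sorted(os.path.relpath(path, motherdirectory) for path in found)
```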
afshan