143

Can anybody help me create a function that builds a list of all files under a certain directory using the pathlib library?

I have the following files:

  • c:\desktop\test\A\A.txt

  • c:\desktop\test\B\B_1\B.txt

  • c:\desktop\test\123.txt

I expected to have a single list which would have the paths above, but my code returns a nested list.

Here is my code:

from pathlib import Path

def searching_all_files(directory: Path):   
    file_list = [] # A list for storing files existing in directories

    for x in directory.iterdir():
        if x.is_file():

           file_list.append(x)
        else:

           file_list.append(searching_all_files(directory/x))

    return file_list


p = Path('C:\\Users\\akrio\\Desktop\\Test')

print(searching_all_files(p))

I hope somebody can correct me.

martineau
  • 119,623
  • 25
  • 170
  • 301
Akrios
  • 1,637
  • 2
  • 10
  • 12

13 Answers

215

Use Path.glob() to list all files and directories, then filter the result with a list comprehension.

from pathlib import Path

p = Path(r'C:\Users\akrio\Desktop\Test').glob('**/*')
files = [x for x in p if x.is_file()]
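
As a self-contained sketch of the glob-and-filter approach, the helper below wraps both steps in one function. The name list_files and the temporary directory layout are illustrative assumptions, not part of the original answer.

```python
from pathlib import Path
import tempfile


def list_files(root):
    """Return a flat, sorted list of every file below root (recursive)."""
    # '**/*' matches every entry at any depth; is_file() drops directories.
    return sorted(p for p in Path(root).glob('**/*') if p.is_file())


if __name__ == '__main__':
    # Build a throwaway tree that mirrors the layout in the question.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / 'A').mkdir()
        (root / 'B' / 'B_1').mkdir(parents=True)
        for rel in ('A/A.txt', 'B/B_1/B.txt', '123.txt'):
            (root / rel).touch()
        for path in list_files(root):
            print(path.relative_to(root))
```

Sorting is optional; it is only there so the result does not depend on the order in which the file system returns entries.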


Trenton McKinney
  • 56,955
  • 33
  • 144
  • 158
prasastoadi
  • 2,536
  • 2
  • 16
  • 14
  • what if I want to list all directories in a directory? – Charlie Parker Jul 20 '20 at 15:48
  • 9
    To list all directories simply replace "x.is_file()" with "x.is_dir()" as described in the [docs](https://docs.python.org/3/library/pathlib.html#pathlib.Path.is_dir) – Jonas Aug 09 '20 at 14:47
56

With pathlib, listing the immediate contents of a directory is as simple as the command below.

from pathlib import Path

path = Path('C:\\Users\\akrio\\Desktop\\Test')
list(path.iterdir())
Aditya Bhatt
  • 823
  • 1
  • 8
  • 7
  • 17
    Need only list of files (not dirs)? one liner: [f for f in Path(path_to_dir).iterdir() if f.is_file()] – Povilas Jan 25 '22 at 13:04
  • 1
    Wrong. `iterdir` lists only the files in the directory but the OP has made it plain that he/she wants an explorer which will search down through the whole structure. – mike rodent May 11 '23 at 13:14
36
from pathlib import Path
from pprint import pprint

def searching_all_files(directory):
    dirpath = Path(directory)
    assert dirpath.is_dir()
    file_list = []
    for x in dirpath.iterdir():
        if x.is_file():
            file_list.append(x)
        elif x.is_dir():
            file_list.extend(searching_all_files(x))
    return file_list

pprint(searching_all_files('.'))
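
A possible variant of the function above, sketched under the assumption that an explicit exception is preferred over assert (assert statements are skipped when Python runs with the -O option). The choice of NotADirectoryError is illustrative, not something the original answer prescribes.

```python
from pathlib import Path


def searching_all_files(directory):
    """Recursively collect all files below directory into one flat list."""
    dirpath = Path(directory)
    # Raise explicitly: unlike assert, this check still runs under `python -O`.
    if not dirpath.is_dir():
        raise NotADirectoryError(f'{dirpath} is not a directory')
    file_list = []
    for entry in dirpath.iterdir():
        if entry.is_file():
            file_list.append(entry)
        elif entry.is_dir():
            # extend() merges the sub-list instead of nesting it.
            file_list.extend(searching_all_files(entry))
    return file_list
```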
MichielB
  • 4,181
  • 1
  • 30
  • 39
  • 2
    assert is a statement, not a function, so I think you want `assert dirpath.is_dir()` with no parenthesis. In Python 2 and 3. Or simply `assert dirpath.exists()` – PatrickT Jun 15 '20 at 06:57
  • 1
    You should not use assert outside unit tests, because assertions are skipped in some contexts (for example, when Python runs with -O). – Swann_bm Aug 23 '22 at 13:21
18

If you can assume that only file objects have a . in the name (e.g., .txt, .png), you can do a glob or recursive glob search:

from pathlib import Path

# Search the directory
list(Path('testDir').glob('*.*'))

# Search directories and subdirectories, recursively
list(Path('testDir').rglob('*.*'))

But that's not always the case. Sometimes there are hidden directories like .ipynb_checkpoints, and files that do not have extensions. In that case, use a list comprehension or filter() to keep only the Path objects that are files.

# Search Single Directory
list(filter(lambda x: x.is_file(), Path('testDir').iterdir()))

# Search Directories Recursively
list(filter(lambda x: x.is_file(), Path('testDir').rglob('*')))

# Search Single Directory
[x for x in Path('testDir').iterdir() if x.is_file()]

# Search Directories Recursively
[x for x in Path('testDir').rglob('*') if x.is_file()]
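
To make the difference concrete, here is a small sketch that builds a temporary directory containing a dotted directory and an extensionless file, then compares the two approaches. The function name compare_listings and the file names are made up for illustration.

```python
from pathlib import Path
import tempfile


def compare_listings(root):
    """Return (names matched by '*.*', names of real files) below root."""
    root = Path(root)
    by_pattern = sorted(p.name for p in root.rglob('*.*'))
    by_is_file = sorted(p.name for p in root.rglob('*') if p.is_file())
    return by_pattern, by_is_file


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / '.ipynb_checkpoints').mkdir()  # directory with a dot in its name
        (root / 'LICENSE').touch()             # file without an extension
        (root / 'notes.txt').touch()
        by_pattern, by_is_file = compare_listings(root)
        print('*.*     :', by_pattern)
        print('is_file :', by_is_file)
```

The pattern-based listing picks up the dotted directory and misses LICENSE, while the is_file() filter returns exactly the files.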
blaylockbk
  • 2,503
  • 2
  • 28
  • 43
15

A similar, more functional solution to @prasastoadi's can be achieved with Python's built-in filter function:

from pathlib import Path

my_path = Path(r'C:\Users\akrio\Desktop\Test')
list(filter(Path.is_file, my_path.glob('**/*')))
Tomerikoo
  • 18,379
  • 16
  • 47
  • 61
chickenNinja123
  • 311
  • 2
  • 11
13

If your files have the same suffix, like .txt, you can use rglob to list the main directory and all subdirectories, recursively.

paths = list(Path(INPUT_PATH).rglob('*.txt'))

You can also apply any Path method or attribute to each path, for example the name property:

[k.name for k in Path(INPUT_PATH).rglob('*.txt')]

Where INPUT_PATH is the path to your main directory, and Path is imported from pathlib.
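
On a case-sensitive file system the pattern '*.txt' will not match a file such as NOTES.TXT. If that matters, one option is to compare Path.suffix instead of relying on the pattern. This is a minimal sketch; the function name files_with_suffix is hypothetical.

```python
from pathlib import Path


def files_with_suffix(root, suffix='.txt'):
    """Return files below root whose suffix matches, ignoring case."""
    wanted = suffix.lower()
    return sorted(
        p for p in Path(root).rglob('*')
        if p.is_file() and p.suffix.lower() == wanted
    )
```

For example, [p.name for p in files_with_suffix(INPUT_PATH)] gives the file names only, as in the snippet above.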

tk3
  • 990
  • 1
  • 13
  • 18
8

Define path to directory:

from pathlib import Path
data_path = Path.home() / 'Desktop/My-Folder/'

Get all paths (files and directories):

paths = sorted(data_path.iterdir())

Get file paths only:

files = sorted(f for f in Path(data_path).iterdir() if f.is_file())

Get paths with specific pattern (e.g. with .png extension):

png_files = sorted(data_path.glob('*.png'))
Miladiouss
  • 4,270
  • 1
  • 27
  • 34
4

Using pathlib2 is much easier,

from pathlib2 import Path

path = Path("/test/test/")
for x in path.iterdir():
    print(x)
HMan06
  • 755
  • 2
  • 9
  • 23
  • 5
    pathlib2 is deprecated. – Nico Schlömer Dec 22 '20 at 08:52
  • I didn't see how pathlib2 is related to this question. It seems that pathlib2 is only a backport of pathlib (to Python 2.x) and therefore `path.iterdir()` in pathlib2 cannot recursively walk the directory. – xmcp Dec 27 '21 at 16:34
2
from pathlib import Path

def searching_all_files(directory: Path):
    file_list = []  # A list for storing files existing in directories

    for x in directory.iterdir():
        if x.is_file():
            file_list.append(x)  # a single file: append it
        else:
            # x already includes the parent directory, so recurse on x itself;
            # extend() merges the sub-list instead of nesting it
            file_list.extend(searching_all_files(x))

    return file_list
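
The nested list in the question comes from the difference between append() and extend(). A minimal illustration with plain strings standing in for paths:

```python
nested = ['123.txt']
nested.append(['A.txt', 'B.txt'])   # adds the whole list as ONE element

flat = ['123.txt']
flat.extend(['A.txt', 'B.txt'])     # adds each element individually

print(nested)  # ['123.txt', ['A.txt', 'B.txt']]
print(flat)    # ['123.txt', 'A.txt', 'B.txt']
```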
Julien
  • 13,986
  • 5
  • 29
  • 53
Akrios
  • 1,637
  • 2
  • 10
  • 12
2
import pathlib

def get_all_files(dir_path_to_search):
    filename_list = []

    file_iterator = dir_path_to_search.iterdir()

    for entry in file_iterator:
        if entry.is_file():
            # print(entry.name)
            filename_list.append(entry.name)

    return filename_list

The function can be tested as follows:

dir_path_to_search= pathlib.Path("C:\\Users\\akrio\\Desktop\\Test")
print(get_all_files(dir_path_to_search))
Vineet Sharma
  • 444
  • 2
  • 6
  • 8
1

You can use this:

from pathlib import Path

folder: Path = Path('/path/to/the/folder/')

files: list = [file.name for file in folder.iterdir()]
dslackw
  • 83
  • 1
  • 7
0

You can use a generator expression like this one, with inline filtering:

for file in (p for p in directory.iterdir() if p.is_file()):
    ...
V.Mach
  • 1
  • 1
-3

You can use os.listdir(). It will get you everything that's in a directory - files and directories.

If you want just files, you could either filter this down using os.path:

from os import listdir
from os.path import isfile, join
onlyfiles = [files for files in listdir(mypath) if isfile(join(mypath, files))]

or you could use os.walk(), which yields a (dirpath, dirnames, filenames) tuple for each directory it visits, splitting the entries into directories and files for you. If you only want the top directory, you can just break after the first iteration:

from os import walk
files = []
for (dirpath, dirnames, filenames) in walk(mypath):
    files.extend(filenames)
    break
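
If you want every file in the whole tree, as the question asks, leave out the break and join each file name with the directory being visited. A minimal sketch, with walk_all_files as an illustrative name:

```python
import os


def walk_all_files(top):
    """Return the full path of every file below top, using os.walk()."""
    files = []
    for dirpath, dirnames, filenames in os.walk(top):
        # filenames are bare names, so join them with the current directory.
        files.extend(os.path.join(dirpath, name) for name in filenames)
    return files
```

This returns strings rather than Path objects; wrap each result in pathlib.Path if you need the pathlib API.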
Aman Jaiswal
  • 1,084
  • 2
  • 18
  • 36