930

Is there a way to return a list of all the subdirectories in the current directory in Python?

I know you can do this with files, but I need to get the list of directories instead.

the Tin Man
Brad Zeis
  • 4
    https://docs.python.org/3.4/library/os.html?highlight=os#os.listdir https://docs.python.org/3.4/library/os.path.html#os.path.isdir – The Demz Dec 02 '14 at 03:51
  • 20
    If you are looking for a pathlib solution, do `[f for f in data_path.iterdir() if f.is_dir()]` (credit: https://stackoverflow.com/a/44228436/1601580). This gives you the folders as `Path` objects, and it never includes `.` or `..`. The glob solution is worthwhile too: `glob.glob("/path/to/directory/*/")`. – Charlie Parker Jul 20 '20 at 16:31

35 Answers

915

Do you mean immediate subdirectories, or every directory right down the tree?

Either way, you could use os.walk to do this:

os.walk(directory)

will yield a 3-tuple for each directory it visits. The first entry in the 3-tuple is a directory name, so

[x[0] for x in os.walk(directory)]

should give you all of the subdirectories, recursively.

Note that the second entry in the tuple is the list of child directories of the entry in the first position, so you could use this instead, but it's not likely to save you much.

However, you could use it just to give you the immediate child directories:

next(os.walk('.'))[1]

Or see the other solutions already posted, using os.listdir and os.path.isdir, including those at "How to get all of the immediate subdirectories in Python".
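As a minimal sketch of the two `os.walk` idioms described in this answer, the snippet below wraps them in helper functions. The names `immediate_subdirs` and `all_subdirs` are made up for illustration, and the sorting is only there to make the output predictable.

```python
import os


def immediate_subdirs(directory="."):
    """Names of the directories directly inside `directory`."""
    # The first tuple produced by os.walk describes `directory` itself;
    # its second element lists the child directory names.
    # The default covers the case where `directory` cannot be walked.
    _root, dirnames, _files = next(os.walk(directory), (directory, [], []))
    return sorted(dirnames)


def all_subdirs(directory="."):
    """Paths of every directory below `directory`, excluding `directory`."""
    walker = os.walk(directory)
    next(walker, None)  # skip the first entry, which is `directory` itself
    return sorted(root for root, _dirs, _files in walker)
```

Calling `immediate_subdirs('.')` returns bare names, while `all_subdirs('.')` returns paths that start with the directory you passed in.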

Brian Burns
Blair Conrad
  • 3
    Such a clean and nice answer. Thank you. I wasn't familiar with next() and thought this link can be helpful to whoever in similar situation: http://stackoverflow.com/questions/1733004/python-next-function – Helene May 13 '16 at 23:12
  • 43
    For anyone concerned about performance differences between `os.walk` and `os.listdir`+`os.path.isdir` solutions: I just tested on a directory with 10,000 subdirectories (with millions of files in the hierarchy below) and the performance differences are negligible. `os.walk`: "10 loops, best of 3: 44.6 msec per loop" and `os.listdir`+`os.path.isdir`: "10 loops, best of 3: 45.1 msec per loop" – kevinmicke Feb 28 '17 at 19:05
  • 5
    @kevinmicke try this performance test on a network drive, I think you'll find that the performance is rather significant in that case. – UKMonkey Nov 21 '17 at 12:53
  • @UKMonkey I'm sure you're right that a use case like that could have a significant difference. – kevinmicke Jun 02 '18 at 17:38
  • @UKMonkey: Actually, in 3.4 and earlier they should be roughly equivalent, and in 3.5 and higher `os.walk` should *beat* `os.listdir`+`os.path.isdir`, *especially* on network drives. Reasons: 1) `os.walk` is lazy; if you do `next(os.walk('.'))[1]` it performs a single directory listing & categorizing by dir/non-dir, and then goes away. The cost of setting up the generator is non-zero, but it's utterly unrelated to the cost of file system access. 2) As of 3.5, `os.walk` is implemented via `os.scandir`, which doesn't require per-entry `stat` calls to categorize dir/non-dir (aside from symlinks)… – ShadowRanger Jul 28 '22 at 12:17
  • …so, ignoring the case of symlinks, `os.walk` effectively requires just a single round trip to the storage medium to get the information on the whole directory (in practice, it's usually a buffered-by-blocks implementation under the hood, so a huge directory might require a number of round trips based on the number of entries divided by a large divisor, but `os.listdir`, implemented using the same system calls, does the same), while `os.listdir`+`os.path.isdir` pays that initial cost plus a per-entry `stat` system call. – ShadowRanger Jul 28 '22 at 12:20
  • `os.walk` will lose to properly written `os.scandir` usage (e.g. `[e.path for e in os.scandir('.') if e.is_dir()]` which is about as minimalist as you can get), but only because it has some extra overhead from the wrapping to allow recursion and separately store the non-dir `list`, neither of which gets used; it performs no additional system call work, so network drive or not, the biggest part of the expense (drive latency, and to a lesser extent, system call overhead) is still cheaper than `os.listdir`+`os.path.isdir` from 3.5 onwards. – ShadowRanger Jul 28 '22 at 12:24
318

You could just use glob.glob

from glob import glob
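glob("/path/to/directory/*/")  # immediate subdirectories only
# recursive=True only has an effect together with the ** pattern;
# this form also returns "/path/to/directory/" itself
glob("/path/to/directory/**/", recursive=True)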

Don't forget the trailing / after the *.
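A small sketch of the trailing-slash trick, assuming you want the paths without the final separator. The helper name `glob_subdirs` is hypothetical. Note that `*` does not match names that start with a dot, so hidden directories are skipped.

```python
import os
from glob import escape, glob


def glob_subdirs(directory):
    """Immediate subdirectories of `directory`, found with a '*/' pattern."""
    # escape() protects any glob metacharacters in `directory` itself;
    # the trailing empty component makes the pattern end in a separator,
    # which is what restricts the matches to directories.
    pattern = os.path.join(escape(directory), "*", "")
    # Each match ends with the separator, so strip it off again.
    return sorted(path.rstrip(os.sep) for path in glob(pattern))
```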

Nav
Udit Bansal
303

This is much nicer than the approaches above, because you don't need several os.path.join() calls and you get the full path directly (if you wish). It works in Python 3.5 and above.

subfolders = [ f.path for f in os.scandir(folder) if f.is_dir() ]

This will give the complete path to the subdirectory. If you only want the name of the subdirectory use f.name instead of f.path

https://docs.python.org/3/library/os.html#os.scandir


Slightly OT: In case you need all subfolders recursively and/or all files recursively, have a look at this function, which is faster than os.walk & glob and will return a list of all subfolders as well as all files inside those (sub-)subfolders: https://stackoverflow.com/a/59803793/2441026

In case you want only all subfolders recursively:

def fast_scandir(dirname):
    subfolders= [f.path for f in os.scandir(dirname) if f.is_dir()]
    for dirname in list(subfolders):
        subfolders.extend(fast_scandir(dirname))
    return subfolders

Returns a list of all subfolders with their full paths. This again is faster than os.walk and a lot faster than glob.
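For very deep or partly unreadable trees, a non-recursive variant of the same idea may be easier to live with. This is only a sketch under a few assumptions: the name `iter_subdirs` is made up, unreadable directories are silently skipped, symbolic links are not followed, and Python 3.6+ is available (for `os.scandir` as a context manager).

```python
import os


def iter_subdirs(top):
    """Yield the path of every directory below `top`, without recursion."""
    stack = [top]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                # follow_symlinks=False keeps symlinked directories out,
                # which avoids loops caused by links pointing back up the tree.
                children = [e.path for e in entries
                            if e.is_dir(follow_symlinks=False)]
        except OSError:
            continue  # e.g. permission denied or directory vanished
        yield from children
        stack.extend(children)
```

Because it is a generator, you can start processing results before the whole tree has been scanned; wrap it in `list()` if you need everything at once.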


An analysis of all functions

tl;dr:
- If you want to get all immediate subdirectories for a folder use os.scandir.
- If you want to get all subdirectories, even nested ones, use os.walk or - slightly faster - the fast_scandir function above.
- Avoid a full os.walk traversal when you only need top-level subdirectories; in the test below it was hundreds(!) of times slower than os.scandir.

  • If you run the code below, make sure to run it once so that your OS will have accessed the folder, discard the results and then run the test; otherwise the results will be skewed.
  • You might want to mix up the order of the function calls, but I tested it, and it did not really matter.
  • All examples will give the full path to the folder. The pathlib example returns it as a (Windows)Path object.
  • The first element of os.walk will be the base folder. So you will not get only subdirectories. You can use fu.pop(0) to remove it.
  • None of the results will use natural sorting. This means results will be sorted like this: 1, 10, 2. To get natural sorting (1, 2, 10), please have a look at https://stackoverflow.com/a/48030307/2441026
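As a rough illustration of the natural-sorting note above (not the code from the linked answer), a sort key can split each name into text and number chunks. The name `natural_key` is hypothetical.

```python
import re


def natural_key(name):
    """Sort key that compares embedded numbers by value, not by character."""
    # re.split with a capturing group keeps the digit runs, so the result
    # alternates text, number, text, ... and the positions always line up.
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]
```

It would be used as `sorted(folder_names, key=natural_key)`.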


Results:

os.scandir      took   1 ms. Found dirs: 439
os.walk         took 463 ms. Found dirs: 441 -> it found the nested one + base folder.
glob.glob       took  20 ms. Found dirs: 439
pathlib.iterdir took  18 ms. Found dirs: 439
os.listdir      took  18 ms. Found dirs: 439

Tested with W7x64, Python 3.8.1.

# -*- coding: utf-8 -*-
# Python 3


import time
import os
from glob import glob
from pathlib import Path


directory = r"<insert_folder>"
RUNS = 1


def run_os_walk():
    a = time.time_ns()
    for i in range(RUNS):
        fu = [x[0] for x in os.walk(directory)]
    print(f"os.walk\t\t\ttook {(time.time_ns() - a) / 1000 / 1000 / RUNS:.0f} ms. Found dirs: {len(fu)}")


def run_glob():
    a = time.time_ns()
    for i in range(RUNS):
        fu = glob(directory + "/*/")
    print(f"glob.glob\t\ttook {(time.time_ns() - a) / 1000 / 1000 / RUNS:.0f} ms. Found dirs: {len(fu)}")


def run_pathlib_iterdir():
    a = time.time_ns()
    for i in range(RUNS):
        dirname = Path(directory)
        fu = [f for f in dirname.iterdir() if f.is_dir()]
    print(f"pathlib.iterdir\ttook {(time.time_ns() - a) / 1000 / 1000 / RUNS:.0f} ms. Found dirs: {len(fu)}")


def run_os_listdir():
    a = time.time_ns()
    for i in range(RUNS):
        fu = [os.path.join(directory, o) for o in os.listdir(directory) if os.path.isdir(os.path.join(directory, o))]
    print(f"os.listdir\t\ttook {(time.time_ns() - a) / 1000 / 1000 / RUNS:.0f} ms. Found dirs: {len(fu)}")


def run_os_scandir():
    a = time.time_ns()
    for i in range(RUNS):
        fu = [f.path for f in os.scandir(directory) if f.is_dir()]
    print(f"os.scandir\t\ttook {(time.time_ns() - a) / 1000 / 1000 / RUNS:.0f} ms.\tFound dirs: {len(fu)}")


if __name__ == '__main__':
    run_os_scandir()
    run_os_walk()
    run_glob()
    run_pathlib_iterdir()
    run_os_listdir()
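The script above times a full `os.walk` traversal against single-level listings. For a like-for-like comparison of immediate subdirectories only, one could take just the first tuple from `os.walk`, as suggested in the comments on another answer. The following is a sketch with made-up function names; the timings it prints depend entirely on your machine and folder.

```python
import os
import timeit


def top_level_scandir(directory):
    """Immediate subdirectories via os.scandir."""
    with os.scandir(directory) as entries:
        return [e.path for e in entries if e.is_dir()]


def top_level_walk(directory):
    """Immediate subdirectories via the first tuple yielded by os.walk."""
    root, dirnames, _files = next(os.walk(directory), (directory, [], []))
    return [os.path.join(root, name) for name in dirnames]


def compare(directory=".", number=20):
    """Print the average time per call for both approaches."""
    for func in (top_level_scandir, top_level_walk):
        seconds = timeit.timeit(lambda: func(directory), number=number)
        print(f"{func.__name__:<18} {seconds / number * 1000:.3f} ms per call")


if __name__ == "__main__":
    compare(".")
```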
poppie
user136036
  • 2
    It would be nice if you mentioned early in your answer where you are substituting the different functions you profile. Regardless, it is impressive that you spent the time doing this. Good job. I personally prefer using a single library, so I liked using `pathlib` as follows: `[f for f in p.iterdir() if f.is_dir()]` – Charlie Parker Jul 20 '20 at 16:29
  • I have 50 subdirectories, each with thousands of subdirectories. I just tried running `fast_scandir` and it's taking over an hour. Is this normal? Is there anything I can do to speed it up? – Vincent Aug 18 '20 at 05:12
213
import os

d = '.'
[os.path.join(d, o) for o in os.listdir(d) 
                    if os.path.isdir(os.path.join(d,o))]
Wilfred Hughes
gahooa
  • 6
    note that in this approach you need to care of abspath-issues if not executed on '.' – daspostloch May 29 '11 at 23:26
  • 5
    Just a heads up, if you are not using the cwd ('.'), this will not work unless you do an `os.path.join` on `o` to get the full path, otherwise `isdir(o)` will always return false – James McMahon Aug 22 '12 at 20:32
  • 8
    It appears that the post has been updated with fixes for the two mentioned issues above. – cgmb Nov 12 '15 at 09:14
  • 3
    To avoid calling `os.path.join` twice, you can first join and then filter the list using `os.path.isdir`: `filter(os.path.isdir, [os.path.join(d, o) for o in os.listdir(d)])` – quant_dev Jun 14 '19 at 15:11
  • 2
    Using pathlib with `[f for f in data_path.iterdir() if f.is_dir()]` or glob is much simpler and easier to read: `glob.glob("/path/to/directory/*/")`. – Charlie Parker Jul 20 '20 at 16:32
81

Python 3.4 introduced the pathlib module into the standard library, which provides an object oriented approach to handle filesystem paths:

from pathlib import Path

p = Path('./')

# All subdirectories in the current directory, not recursive.
[f for f in p.iterdir() if f.is_dir()]

To recursively list all subdirectories, path globbing can be used with the ** pattern.

# This will also include the current directory '.'
# Note: from Python 3.13 on, a pattern ending in '**' also matches files.
list(p.glob('**'))

Note that a single * as the glob pattern would include both files and directories non-recursively. To get only directories, a trailing / can be appended. This always works when using the glob library directly; with pathlib's glob the trailing / is only honoured from Python 3.11 on:

import glob

# These two lines return both files and directories
list(p.glob('*'))
glob.glob('*')

# Before Python 3.11 this returns both files and directories; from 3.11 on, only directories
list(p.glob('*/'))

# Whereas this returns only directories
glob.glob('*/')

So, before Python 3.13, Path('./').glob('**') matches the same directories as glob.glob('**/', recursive=True), apart from hidden directories, which glob.glob skips by default.

Pathlib is also available on Python 2.7 via the pathlib2 module on PyPI.
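Since the behaviour of glob patterns in pathlib has changed between Python versions, a version-independent sketch is to filter explicitly with `is_dir()`. The function name `subdirs` and its `recursive` flag are invented for this example.

```python
from pathlib import Path


def subdirs(path=".", recursive=False):
    """Return the subdirectories of `path` as a sorted list of Path objects."""
    base = Path(path)
    # rglob('*') visits every entry below `base`; iterdir() only the
    # immediate children. Either way, is_dir() drops the files.
    candidates = base.rglob("*") if recursive else base.iterdir()
    return sorted(p for p in candidates if p.is_dir())
```

For example, `subdirs('.')` lists only the immediate children, and `subdirs('.', recursive=True)` walks the whole tree (without including `path` itself).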

joelostblom
  • To iterate over the list of subdirectories, here is a nice, clean syntax: `for f in filter(Path.is_dir, p.iterdir()):` – Bryan Roach Oct 26 '19 at 08:00
  • Are you sure you need two stars for your glob solution? Is `glob('*/')` not sufficient? Regardless, fabulous answer, especially for your clean use of `pathlib`. It would be nice to comment on whether it also allows recursion, though from the title of the question that's not needed and future readers should read the docs you link. – Charlie Parker Jul 20 '20 at 16:26
  • 2
    Thank you @CharlieParker! I updated my answer with details about recursion and using a trailing slash (including noting that trailing slashes are not necessary when using `**` with pathlib's glob). Regarding using a single asterisk, this would match files and directories non-recursively. – joelostblom Jul 20 '20 at 18:14
  • `glob.glob('**/', recursive=True)` won't include hidden directories, but `Path('./').glob('**')` does – nos Jul 05 '21 at 03:19
  • might add a `sorted()` at the start, so that the returned list is sorted...might or might not be useful depending on use case – Matias Andina Apr 20 '22 at 15:45
44

If you need a recursive solution that will find all the subdirectories in the subdirectories, use walk as proposed before.

If you only need the current directory's child directories, combine os.listdir with os.path.isdir

Eli Bendersky
35

Listing Out only directories

print("\nWe are listing out only the directories in current directory -")
directories_in_curdir = list(filter(os.path.isdir, os.listdir(os.curdir)))
print(directories_in_curdir)

Listing Out only files in current directory

files = list(filter(os.path.isfile, os.listdir(os.curdir)))
print("\nThe following are the list of all files in the current directory -")
print(files)
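The snippets above rely on the current directory, because `os.listdir` returns bare names and `os.path.isdir` resolves them relative to the working directory. A sketch that works for any directory, assuming you want names rather than full paths, joins each name first. The helper name `split_entries` is hypothetical.

```python
import os


def split_entries(directory=os.curdir):
    """Return (directories, files) found directly inside `directory`."""
    directories, files = [], []
    for name in sorted(os.listdir(directory)):
        # os.listdir returns bare names, so join them with `directory`
        # before asking the filesystem what they are.
        full_path = os.path.join(directory, name)
        if os.path.isdir(full_path):
            directories.append(name)
        elif os.path.isfile(full_path):
            files.append(name)
    return directories, files
```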
Martin Nowosad
NutJobb
  • 9
    Did not work on mac OS. I think that the problem is that os.listdir returns only the name of the directory and not the full path but os.path.isdir only returns True if the full path is a directory. – denson Nov 22 '16 at 17:52
  • 5
    This works outside of the current directory if you modify the line slightly: subdirs = filter(os.path.isdir, [os.path.join(dir,x) for x in os.listdir(dir)]) – RLC May 07 '19 at 17:15
  • nice job by avoiding to define lambda functions and just passing the functions directly. – Charlie Parker Jul 20 '20 at 16:29
  • Luckily as a workaround you can just call `isdir` outside the filter chain on Mac OS X. – Sridhar Sarnobat Mar 11 '22 at 06:02
  • Worked on m1 Mac as of 2023. Thank you! – BLimitless Aug 02 '23 at 16:11
29

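I prefer using filter (https://docs.python.org/2/library/functions.html#filter), but this is just a matter of taste. In Python 3, filter returns an iterator, so wrap it in list() if you need a list.

import os

d = '.'
list(filter(lambda x: os.path.isdir(os.path.join(d, x)), os.listdir(d)))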
svelten
25

Implemented this using python-os-walk. (http://www.pythonforbeginners.com/code-snippets-source-code/python-os-walk/)

import os

print("root prints out directories only from what you specified")
print("dirs prints out sub-directories from root")
print("files prints out all files from root and directories")
print("*" * 20)

for root, dirs, files in os.walk("/var/log"):
    print(root)
    print(dirs)
    print(files)
vvvvv
Charith De Silva
18

You can get the list of subdirectories (and files) in Python 2.7 using os.listdir(path):

import os
path = '.'  # directory to list
os.listdir(path)  # list of subdirectories and files
Brian Burns
Oscar Martin
14

Since I stumbled upon this problem using Python 3.4 and Windows UNC paths, here's a variant for this environment:

from pathlib import WindowsPath

def SubDirPath (d):
    return [f for f in d.iterdir() if f.is_dir()]

subdirs = SubDirPath(WindowsPath(r'\\file01.acme.local\home$'))
print(subdirs)

Pathlib is new in Python 3.4 and makes working with paths under different OSes much easier: https://docs.python.org/3.4/library/pathlib.html

14

Although this question was answered a long time ago, I want to recommend the pathlib module, since it is a robust way to work with paths on both Windows and Unix.

So, to get all .txt file paths in a specific directory, including its subdirectories:

from pathlib import Path
paths = list(Path('myhomefolder', 'folder').glob('**/*.txt'))

# all sorts of operations
file = paths[0]
file.name
file.stem
file.parent
file.suffix

etc.

Karim
Joost Döbken
12

Copy paste friendly in ipython:

import os
d='.'
folders = list(filter(lambda x: os.path.isdir(os.path.join(d, x)), os.listdir(d)))

Output from print(folders):

['folderA', 'folderB']
Andrew Schreiber
  • 2
    What is X in this case? – Abhishek Parikh Oct 28 '18 at 05:11
  • 1
    @AbhishekParikh `x` is the item from the list created by `os.listdir(d)` because `listdir` will return files and folders he is using the `filter` command with `os.path.isdir` to filter any files out from the list. – James Burke May 28 '19 at 18:35
11

Thanks for the tips, guys. I ran into an issue with softlinks (infinite recursion) being returned as dirs. Softlinks? We don't want no stinkin' soft links! So...

This rendered just the dirs, not softlinks:

>>> import os
>>> inf = os.walk('.')
>>> [x[0] for x in inf]
['.', './iamadir']
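By default `os.walk` does not descend into symbolic links to directories (`followlinks=False`), which is why the listing above contains no softlinks. The links themselves still show up in the child directory lists, though. A sketch that filters them out, using a made-up helper name `real_subdirs`:

```python
import os


def real_subdirs(top):
    """Yield directories below `top`, skipping symbolic links to directories."""
    # followlinks=False (the default) means linked directories are not entered,
    # so a link pointing back up the tree cannot cause infinite recursion.
    for root, dirnames, _files in os.walk(top, followlinks=False):
        for name in dirnames:
            path = os.path.join(root, name)
            # Symlinks to directories are still listed in dirnames; drop them.
            if not os.path.islink(path):
                yield path
```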
KurtB
10

Here are a couple of simple functions based on @Blair Conrad's example -

import os

def get_subdirs(dir):
    "Get a list of immediate subdirectories"
    return next(os.walk(dir))[1]

def get_subfiles(dir):
    "Get a list of immediate subfiles"
    return next(os.walk(dir))[2]
Brian Burns
10

This is how I do it.

    import os
    for x in os.listdir(os.getcwd()):
        if os.path.isdir(x):
            print(x)
Mujeeb Ishaque
9

Building upon Eli Bendersky's solution, use the following example:

import os
test_directory = "<your_directory>"
for child in os.listdir(test_directory):
    test_path = os.path.join(test_directory, child)
    if os.path.isdir(test_path):
        print(test_path)
        # Do stuff to the directory "test_path"

where "<your_directory>" is the path to the directory you want to traverse.

Blairg23
7

With full path and accounting for path being ., .., \\, ..\\..\\subfolder, etc:

import os, pprint
path = '.'  # any of the forms above
# os.walk already yields absolute paths when it starts from an absolute path
pprint.pprint([x[0] for x in os.walk(os.path.abspath(path))])
Max von Hippel
DevPlayer
7

The easiest way:

from pathlib import Path
from glob import glob

current_dir = Path.cwd()
all_sub_dir_paths = glob(str(current_dir) + '/*/') # returns list of sub directory paths

all_sub_dir_names = [Path(sub_dir).name for sub_dir in all_sub_dir_paths] 
Amir Afianian
  • I wouldn't call that easiest, and it's needlessly complicated to mix old-style `glob` module usage with new-style `pathlib` stuff (necessitating constant conversions between `str` and `Path` objects). If you want do demonstrate `pathlib` stuff, stick to `pathlib` where possible; it's frankly prettier anyway, and clearer to boot (vs. your code relying on that trailing `/` to make `glob.glob` not return files, a behavior `Path.glob` doesn't replicate): `all_sub_dir_names = [pth.name for pth in Path.cwd().iterdir() if pth.is_dir()]` – ShadowRanger Jul 28 '22 at 12:48
5

This answer didn't seem to exist already.

directories = [ x for x in os.listdir('.') if os.path.isdir(x) ]
Andrew
  • 10
    This will always return an empty list if you are searching anything other than the current working directory, which is technically what the OP is looking to do, but not very reusable. – ochawkeye Feb 23 '17 at 19:13
  • 3
    directories = [x for x in os.listdir(localDir) if os.path.isdir(os.path.join(localDir, x))] – Poonam Feb 15 '18 at 06:53
5

I've had a similar question recently, and I found out that the best answer for python 3.6 (as user havlock added) is to use os.scandir. Since it seems there is no solution using it, I'll add my own. First, a non-recursive solution that lists only the subdirectories directly under the root directory.

def get_dirlist(rootdir):

    dirlist = []

    with os.scandir(rootdir) as rit:
        for entry in rit:
            if not entry.name.startswith('.') and entry.is_dir():
                dirlist.append(entry.path)

    dirlist.sort() # Optional, in case you want sorted directory names
    return dirlist

The recursive version would look like this:

def get_dirlist(rootdir):

    dirlist = []

    with os.scandir(rootdir) as rit:
        for entry in rit:
            if not entry.name.startswith('.') and entry.is_dir():
                dirlist.append(entry.path)
                dirlist += get_dirlist(entry.path)

    dirlist.sort() # Optional, in case you want sorted directory names
    return dirlist

Keep in mind that entry.path yields the full path to the subdirectory (it is absolute only if rootdir was absolute). In case you only need the folder name, you can use entry.name instead. Refer to os.DirEntry for additional details about the entry object.

Alberto A
  • Actually, the way this is written it will not work on 3.5, only 3.6. To use on 3.5 you need to remove context manager - see https://stackoverflow.com/questions/41401417/with-os-scandir-raises-attributeerror-exit – havlock Apr 01 '18 at 09:39
  • This is correct. I could swear I read somewhere that the context manager was implemented in 3.5, but It seems I'm wrong. – Alberto A Apr 02 '18 at 19:58
5

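Using os.walk (this collects the names of all subdirectories at every level, not their full paths):

import os

test_folder = '.'  # directory to start from
sub_folders = []
for root, sub_dirs, files in os.walk(test_folder):
    sub_folders.extend(sub_dirs)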
vub
3

This will list all subdirectories right down the file tree.

import pathlib


def list_dir(directory):
    path = pathlib.Path(directory)
    subdirs = []
    try:
        for item in path.iterdir():
            if item.is_dir():
                subdirs.append(item)
                subdirs.extend(list_dir(item))
    except FileNotFoundError:
        print('Invalid directory')
    # Always return a list so the recursive extend() above never receives None
    return subdirs

pathlib is new in version 3.4

Yossarian42
3

Function to return a List of all subdirectories within a given file path. Will search through the entire file tree.

import os

def get_sub_directory_paths(start_directory, sub_directories):
    """
    This method iterates through all subdirectory paths of a given 
    directory to collect all directory paths.

    :param start_directory: The starting directory path.
    :param sub_directories: A List that all subdirectory paths will be 
        stored to.
    :return: A List of all sub-directory paths.
    """

    for item in os.listdir(start_directory):
        full_path = os.path.join(start_directory, item)

        if os.path.isdir(full_path):
            sub_directories.append(full_path)

            # Recursive call to search through all subdirectories.
            get_sub_directory_paths(full_path, sub_directories)

    return sub_directories
2

Use os.path.isdir as a filter function over os.listdir(), something like this:

filter(os.path.isdir, [os.path.join(os.path.abspath('PATH'), p) for p in os.listdir('PATH/')])

2

This function, given a parent directory, iterates over all its directories recursively and prints all the filenames it finds inside. Quite useful.

import os

def printDirectoryFiles(directory):
    for filename in os.listdir(directory):
        full_path = os.path.join(directory, filename)
        if not os.path.isdir(full_path):
            print(full_path + "\n")


def checkFolders(directory):

    dir_list = next(os.walk(directory))[1]

    #print(dir_list)

    for dir in dir_list:           
        print(dir)
        checkFolders(directory +"/"+ dir) 

    printDirectoryFiles(directory)       

main_dir="C:/Users/S0082448/Desktop/carpeta1"

checkFolders(main_dir)


input("Press enter to exit ;")

dbz
2

We can get a list of all the folders by using os.walk():

import os

path = os.getcwd()

pathObject = os.walk(path)

pathObject is a generator; we can turn it into a list with

arr = [x for x in pathObject]

arr is a list of tuples of the form [('current directory', [folders in current directory], [files in current directory]), ('subdirectory', [folders in subdirectory], [files in subdirectory]), ...]

We can get a list of all the subdirectories by iterating through arr and printing the entries of the middle list:

for i in arr:
   for j in i[1]:
      print(j)

This will print all the subdirectories.

To get all the files:

for i in arr:
   for j in i[2]:
      print(i[0] + "/" + j)
1

By joining multiple solutions from here, this is what I ended up using:

import os
import glob

def list_dirs(path):
    return [os.path.basename(x) for x in filter(
        os.path.isdir, glob.glob(os.path.join(path, '*')))]
SadSeven
1

There are a lot of nice answers here, but if you came looking for a simple way to get a list of all files or folders at once, you can take advantage of the find command that Linux and macOS provide, which is much faster than os.walk.

import os
all_files_list = os.popen("find path/to/my_base_folder -type f").read().splitlines()
all_sub_directories_list = os.popen("find path/to/my_base_folder -type d").read().splitlines()

OR

import os

def get_files(path):
    all_files_list = os.popen(f"find {path} -type f").read().splitlines()
    return all_files_list

def get_sub_folders(path):
    all_sub_directories_list = os.popen(f"find {path} -type d").read().splitlines()
    return all_sub_directories_list
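Building the command as a string breaks on paths that contain spaces or shell metacharacters. A possible variant, sketched here under the assumptions that `find` is on the PATH, Python 3.7+ is used, and file names are decodable in the current locale, passes the arguments as a list and splits on NUL bytes. The name `find_dirs` is made up for the example.

```python
import subprocess


def find_dirs(path):
    """Return every directory under `path` (including `path`) using find."""
    result = subprocess.run(
        # A list of arguments bypasses the shell, so no quoting is needed.
        # -print0 separates results with NUL, which is safe even for
        # directory names that contain newlines.
        ["find", path, "-type", "d", "-print0"],
        capture_output=True, text=True, check=True,
    )
    return [entry for entry in result.stdout.split("\0") if entry]
```

Like the `os.popen` version, the result includes the starting directory itself.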
Pardhu
  • wow, you just saved my life! (figuratively) I had a few folders but millions of files, and all of the methods described above was taking forever to execute, this is so much faster. – Malu05 Aug 18 '21 at 11:32
1

For anyone like me who just needed the names of the immediate folders within a directory, this worked on Windows.

import os

mypath = '.'  # directory to inspect
for f in os.scandir(mypath):
    if f.is_dir():  # skip files, keep only folders
        print(f.name)
Windy71
0

This should work, as it also prints a directory tree:

import pathlib

def tree(directory):
    directory = pathlib.Path(directory)
    # rglob('*') descends into the whole tree; keep only the folders
    subdirs = sorted(p for p in directory.rglob('*') if p.is_dir())
    print(f'+ {directory}')
    print(f'There are {len(subdirs)} folders under this directory:')
    for path in subdirs:
        depth = len(path.relative_to(directory).parts)
        spacer = '    ' * depth
        print(f'{spacer}+ {path.name}')

This should list all the directories in a folder using the pathlib library. path.relative_to(directory).parts gets the path elements relative to directory, which gives the nesting depth.

MLDev
0

The class below gets the list of files, folders, and all subfolders inside a given directory.

import os
import json

class GetDirectoryList():
    def __init__(self, path):
        self.main_path = path
        self.absolute_path = []
        self.relative_path = []


    def get_files_and_folders(self, resp, path):
        all = os.listdir(path)
        resp["files"] = []
        for file_folder in all:
            if file_folder != "." and file_folder != "..":
                if os.path.isdir(path + "/" + file_folder):
                    resp[file_folder] = {}
                    self.get_files_and_folders(resp=resp[file_folder], path= path + "/" + file_folder)
                else:
                    resp["files"].append(file_folder)
                    self.absolute_path.append(path.replace(self.main_path + "/", "") + "/" + file_folder)
                    self.relative_path.append(path + "/" + file_folder)
        return resp, self.relative_path, self.absolute_path

    @property
    def get_all_files_folder(self):
        self.resp = {self.main_path: {}}
        all = self.get_files_and_folders(self.resp[self.main_path], self.main_path)
        return all

if __name__ == '__main__':
    mylib = GetDirectoryList(path="sample_folder")
    file_list = mylib.get_all_files_folder
    print (json.dumps(file_list))

where the sample directory looks like

sample_folder/
    lib_a/
        lib_c/
            lib_e/
                __init__.py
                a.txt
            __init__.py
            b.txt
            c.txt
        lib_d/
            __init__.py
        __init__.py
        d.txt
    lib_b/
        __init__.py
        e.txt
    __init__.py

Result Obtained

[
  {
    "files": [
      "__init__.py"
    ],
    "lib_b": {
      "files": [
        "__init__.py",
        "e.txt"
      ]
    },
    "lib_a": {
      "files": [
        "__init__.py",
        "d.txt"
      ],
      "lib_c": {
        "files": [
          "__init__.py",
          "c.txt",
          "b.txt"
        ],
        "lib_e": {
          "files": [
            "__init__.py",
            "a.txt"
          ]
        }
      },
      "lib_d": {
        "files": [
          "__init__.py"
        ]
      }
    }
  },
  [
    "sample_folder/lib_b/__init__.py",
    "sample_folder/lib_b/e.txt",
    "sample_folder/__init__.py",
    "sample_folder/lib_a/lib_c/lib_e/__init__.py",
    "sample_folder/lib_a/lib_c/lib_e/a.txt",
    "sample_folder/lib_a/lib_c/__init__.py",
    "sample_folder/lib_a/lib_c/c.txt",
    "sample_folder/lib_a/lib_c/b.txt",
    "sample_folder/lib_a/lib_d/__init__.py",
    "sample_folder/lib_a/__init__.py",
    "sample_folder/lib_a/d.txt"
  ],
  [
    "lib_b/__init__.py",
    "lib_b/e.txt",
    "sample_folder/__init__.py",
    "lib_a/lib_c/lib_e/__init__.py",
    "lib_a/lib_c/lib_e/a.txt",
    "lib_a/lib_c/__init__.py",
    "lib_a/lib_c/c.txt",
    "lib_a/lib_c/b.txt",
    "lib_a/lib_d/__init__.py",
    "lib_a/__init__.py",
    "lib_a/d.txt"
  ]
]
Saurabh Pandey
  • This code throws an error: AttributeError: 'dict' object has no attribute 'append' ```AttributeError Traceback (most recent call last) ~\AppData\Local\Temp\ipykernel_12328\702766901.py in () 32 #mylib = GetDirectoryList(path="sample_folder") 33 mylib = GetDirectoryList(os.getcwd()) ---> 34 file_list = mylib.get_all_files_folder 35 print (json.dumps(file_list))``` – Rich Lysakowski PhD Sep 23 '22 at 06:47
0
import os
path = "test/"
files = [x[0] + "/" + y for x in os.walk(path) if len(x[-1]) > 0 for y in x[-1]]
abhimanyu
0

Here is a simple recursive solution that collects the file names under a directory:

import os
def fn(dir=r"C:\Users\aryan\Downloads\opendatakit"):  # 1.Get file names from directory
    file_list = os.listdir(dir)
    res = []
    # print(file_list)
    for file in file_list:
        if os.path.isfile(os.path.join(dir, file)):
            res.append(file)
        else:
            # Recurse once and reuse the result instead of walking the subtree twice
            res.extend(fn(os.path.join(dir, file)))
    return res


res = fn()
print(res)
print(len(res))
BugsCreator
0

A lot of responses! After reviewing all the suggestions, I filtered out three candidates for listing all the folders in the tree and two methods for listing the immediate folders.

Every folder:

dirs_rglob = [x for x in folder.rglob('*') if x.is_dir()]
dirs_walk = [x[0] for x in os.walk(folder)]
dirs_custom = fast_scandir(folder)

where fast_scandir() is a custom function suggested by user136036, see my code below. I tested the performance with the following Python code:

from pathlib import Path
from os import walk,scandir
from time import monotonic

#user136036 custom code:
def fast_scandir(dirname):
    subfolders= [f.path for f in scandir(dirname) if f.is_dir()]
    for dirname in list(subfolders):
        subfolders.extend(fast_scandir(dirname))
    return subfolders

folder = Path('c:/xampp/htdocs/fonts') # Insert your path here 
msg = 'Using {}, seconds: {}, number of folders: {}.\n'

start = monotonic()
dirs_rglob = [x for x in folder.rglob('*') if x.is_dir()]
print(msg.format('Path.rglob', monotonic() - start, len(dirs_rglob)))

start = monotonic()
dirs_walk = [x[0] for x in walk(folder)]
print(msg.format('os.walk', monotonic() - start, len(dirs_walk)))

start = monotonic()
dirs_custom = fast_scandir(folder)
print(msg.format('fast_scandir', monotonic() - start, len(dirs_custom)))

As stated by user136036, his custom method is the fastest, but not much compared to os.walk(folder), while folder.rglob('*') is really very slow. Since the custom method is not that much faster, I use os.walk(folder), since the latter function has no problems with the Windows operating system for system folders or the Recycle Bin.

If you only need a list of immediate folders, you can use

dirs_iter = [ f for f in folder.iterdir()  if f.is_dir()]
dirs_scan = [ f.path for f in os.scandir(folder)  if f.is_dir()]

The elapsed times are both very small, with the folder.iterdir() method sometimes being slightly faster.

Bottom line: Use os.walk(folder) for the entire tree of subdirectories and folder.iterdir() or os.scandir(folder) for the immediate subdirectories.