4

I work on a Windows machine and want to check whether a directory on a network path is empty.

The first thing that came to mind was calling os.listdir() and checking whether the result has length 0, i.e.:

import os

def dir_empty(dir_path):
    return len(os.listdir(dir_path)) == 0

This is a network path where I do not always have good connectivity, and a folder can contain thousands of files, so this solution is very slow. Is there a better one?

Marios
rob
  • 1
    I actually don't think that is a duplicate. I want to know the answer in python, not in shell – rob Sep 17 '19 at 06:57
  • 1
    https://stackoverflow.com/questions/49284015/how-to-check-if-folder-is-empty-with-python#49284243 – Sid Sep 17 '19 at 07:05

6 Answers

6

The fastest solution I found so far:

import os

def dir_empty(dir_path):
    return not any(True for _ in os.scandir(dir_path))

Or, as proposed in the comments below:

def dir_empty(dir_path):
    return not next(os.scandir(dir_path), None)

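On the slow network I was working on, this took seconds instead of minutes (minutes for the os.listdir() version). This seems to be faster because os.scandir() returns a lazy iterator and any() stops at the first truthy value, so the check can finish after the first entry instead of building the full listing.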
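
The short-circuiting can be checked without a slow network share. The sketch below wraps os.scandir() in a counting generator and reports how many entries any() pulls before it stops. The names counted_scandir and entries_consumed are hypothetical helpers for illustration, not part of the os module, and a temporary directory stands in for the real path.

```python
import os
import tempfile


def counted_scandir(dir_path, counter):
    """Yield entries from os.scandir() and count how many were fetched.

    Hypothetical helper for illustration only.
    """
    with os.scandir(dir_path) as entries:
        for entry in entries:
            counter[0] += 1
            yield entry


def entries_consumed(dir_path):
    """Return (is_empty, number of entries any() actually pulled)."""
    counter = [0]
    gen = counted_scandir(dir_path, counter)
    is_empty = not any(True for _ in gen)
    gen.close()  # release the directory handle held by the generator
    return is_empty, counter[0]


with tempfile.TemporaryDirectory() as tmp:
    print(entries_consumed(tmp))  # empty directory
    for i in range(50):
        open(os.path.join(tmp, f"file{i}.txt"), "w").close()
    print(entries_consumed(tmp))  # 50 files, but any() stops early
```

A list comprehension in place of the generator expression would pull every entry before any() sees the first one, which is the difference raised in the comments.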

rob
  • This iterates through every single file in dir_path. Instead try: `return not next(os.scandir(dirpath), None)` – SurpriseDog Mar 18 '21 at 02:36
  • `[True for _ in os.scandir(dir_path)]` creates a list comprehension in memory that looks like: `[True, True, True, True]` (one hit for each scandir entry) then `any` goes over the list of Trues – SurpriseDog Mar 19 '21 at 17:04
  • https://stackoverflow.com/questions/47789/generator-expressions-vs-list-comprehensions – SurpriseDog Mar 19 '21 at 17:05
  • 1
    Oops, weird that it actually improved speed for me then. I will adjust the answer. Thank you for pointing this out – rob Mar 29 '21 at 15:59
4

From Python 3.4 onwards you can use pathlib's Path.iterdir(), which yields Path objects for the directory contents:

>>> from pathlib import Path
>>>
>>> def dir_empty(dir_path):
...     path = Path(dir_path)
...     has_next = next(path.iterdir(), None)
...     if has_next is None:
...         return True
...     return False
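
The same check can also be written as a single expression. A minimal sketch, with a temporary directory standing in for the real path (the name dir_empty_pathlib is only illustrative):

```python
import tempfile
from pathlib import Path


def dir_empty_pathlib(dir_path):
    # next() returns the default None when iterdir() yields nothing.
    return next(Path(dir_path).iterdir(), None) is None


with tempfile.TemporaryDirectory() as tmp:
    print(dir_empty_pathlib(tmp))        # directory is still empty here
    (Path(tmp) / "example.txt").touch()
    print(dir_empty_pathlib(tmp))        # now it contains one file
```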
Abdul Niyas P M
3

os.listdir() builds a complete list. os.scandir() returns an iterator, so the check can stop at the first entry, which may be more performant.

import os

def dir_empty(dir_path):
    try:
        # Fetching a single entry is enough to know the directory is not empty.
        next(os.scandir(dir_path))
        return False
    except StopIteration:
        return True
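
On an unreliable network share the call itself can fail, which is a different case from "empty". A hedged sketch of one way to separate the two: the wrapper name dir_empty_or_none and the choice to return None on failure are assumptions for illustration, not part of the answer above.

```python
import os
import tempfile


def dir_empty_or_none(dir_path):
    """Return True/False for empty/non-empty, or None if the path can't be read."""
    try:
        with os.scandir(dir_path) as entries:
            # The with block closes the directory handle even on early return.
            return next(entries, None) is None
    except OSError:
        # Covers FileNotFoundError, NotADirectoryError, PermissionError and
        # other OS-level failures such as an unreachable network path.
        return None


with tempfile.TemporaryDirectory() as tmp:
    print(dir_empty_or_none(tmp))                           # readable, empty
    print(dir_empty_or_none(os.path.join(tmp, "missing")))  # cannot be read
```

Returning None keeps "could not check" distinct from both answers; raising the exception to the caller would be an equally reasonable design.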
Amadan
  • definitely more readable as I can guess what each line does whereas I have to know how any works in `def check_empty_by_scandir(path): \n with os.scandir(path) as it: \n return not any(it)` – DangerMouse Oct 18 '22 at 07:21
3

Since the OP is asking about the fastest way, I expected that using os.scandir and returning as soon as the first entry is found would be the fastest. os.scandir returns an iterator, so we can avoid creating a whole list just to check whether it is empty.

The test directory contains about 100 thousand files:

from pathlib import Path    
import os

path = 'jav/av'
len(os.listdir(path))

>>> 101204

Then start our test:

def check_empty_by_scandir(path):
    with os.scandir(path) as it:
        return not any(it)
    
def check_empty_by_listdir(path):
    return not os.listdir(path)

def check_empty_by_pathlib(path):
    return not any(Path(path).iterdir())


%timeit check_empty_by_scandir(path)
>>> 179 µs ± 878 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit check_empty_by_listdir(path)
>>> 28 ms ± 185 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit check_empty_by_pathlib(path)
>>> 27.6 ms ± 140 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

As we can see, check_empty_by_listdir and check_empty_by_pathlib are each about 155 times slower than check_empty_by_scandir. The timings for os.listdir() and Path.iterdir() are nearly identical because Path.iterdir() uses os.listdir() in the background (at least in the Python version used for this test), creating a whole list in memory.

Additionally, as others have pointed out, checking st_size from os.stat is not an option: on Linux it typically reports 4096 for an empty directory, not 0.
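
The %timeit lines above need IPython. A rough equivalent with the standard timeit module might look like the sketch below. The benchmark helper is a hypothetical name, and the temporary directory with 2,000 empty files is an arbitrary, assumed test size, so absolute numbers will differ from those above.

```python
import os
import tempfile
import timeit
from pathlib import Path


def check_empty_by_scandir(path):
    with os.scandir(path) as it:
        return not any(it)


def check_empty_by_listdir(path):
    return not os.listdir(path)


def check_empty_by_pathlib(path):
    return not any(Path(path).iterdir())


def benchmark(path, repeat=20):
    """Return average seconds per call for each check (hypothetical helper)."""
    checks = (check_empty_by_scandir, check_empty_by_listdir, check_empty_by_pathlib)
    return {
        check.__name__: timeit.timeit(lambda: check(path), number=repeat) / repeat
        for check in checks
    }


with tempfile.TemporaryDirectory() as tmp:
    # Fill the directory with empty files so the three checks have work to do.
    for i in range(2000):
        open(os.path.join(tmp, f"f{i}"), "w").close()
    for name, seconds in benchmark(tmp).items():
        print(f"{name}: {seconds * 1e6:.1f} µs per call")
```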

David Pi
1

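On Windows there is the shlwapi function PathIsDirectoryEmpty (PathIsDirectoryEmptyA for ANSI strings, PathIsDirectoryEmptyW for wide strings). We can use it to check whether a folder is empty. The wide variant takes a Python str directly, which avoids code-page problems with non-ASCII paths:

def is_dir_empty(path: str) -> bool:
    import ctypes
    # Windows only. WinDLL fits a BOOL-returning function; OleDLL is meant for HRESULTs.
    shlwapi = ctypes.WinDLL('shlwapi')
    shlwapi.PathIsDirectoryEmptyW.argtypes = [ctypes.c_wchar_p]
    # The API returns a nonzero BOOL for an empty directory; convert it to a Python bool.
    return bool(shlwapi.PathIsDirectoryEmptyW(path))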
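
Because shlwapi only exists on Windows, a script that also runs elsewhere needs a fallback. A hedged sketch: the function name is_dir_empty_portable is hypothetical, and falling back to os.scandir() is an assumption about what is wanted on other platforms. Note that the two branches differ for missing paths: the Windows API returns FALSE when the path is not a directory, while os.scandir() raises an OSError.

```python
import os
import sys
import tempfile


def is_dir_empty_portable(path):
    if sys.platform == "win32":
        # Windows branch: call the wide-character shlwapi function.
        import ctypes
        from ctypes import wintypes

        func = ctypes.WinDLL("shlwapi").PathIsDirectoryEmptyW
        func.argtypes = [wintypes.LPCWSTR]
        func.restype = wintypes.BOOL
        return bool(func(os.fspath(path)))
    # Other platforms: stop at the first entry returned by scandir.
    with os.scandir(path) as entries:
        return next(entries, None) is None


with tempfile.TemporaryDirectory() as tmp:
    print(is_dir_empty_portable(tmp))
    open(os.path.join(tmp, "example.txt"), "w").close()
    print(is_dir_empty_portable(tmp))
```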
NOUSER
-1

Using os.stat:

is_empty = os.stat(dir_path).st_size == 0

Using Python's pathlib:

from pathlib import Path

is_empty = Path(dir_path).stat().st_size == 0
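
Whether st_size is 0 for an empty directory depends on the operating system and filesystem, so it is worth checking on the target system before relying on it. A small sketch (the helper name compare_sizes is hypothetical) that reads st_size for the same temporary directory before and after a file is added:

```python
import os
import tempfile


def compare_sizes():
    """Return st_size of a temp directory when empty and after adding one file."""
    with tempfile.TemporaryDirectory() as tmp:
        size_empty = os.stat(tmp).st_size
        open(os.path.join(tmp, "example.txt"), "w").close()
        size_with_file = os.stat(tmp).st_size
    return size_empty, size_with_file


size_empty, size_with_file = compare_sizes()
print(f"empty: {size_empty} bytes, with one file: {size_with_file} bytes")
```

If the first number is not 0, or if both numbers are 0, the st_size test cannot tell an empty directory from a non-empty one on that system.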
MCC