I have a file that may be in a different place on each user's machine. Is there a way to implement a search for the file? A way that I can pass the file's name and the directory tree to search in?
- See the [os module](http://docs.python.org/library/os.html#files-and-directories) for `os.walk` or `os.listdir`. See also this question for sample code: https://stackoverflow.com/questions/229186/os-walk-without-digging-into-directories-below – Martin Beckett Nov 12 '09 at 19:24
9 Answers
`os.walk` is the answer. This will find the first match:
import os

def find(name, path):
    for root, dirs, files in os.walk(path):
        if name in files:
            return os.path.join(root, name)
And this will find all matches:
def find_all(name, path):
    result = []
    for root, dirs, files in os.walk(path):
        if name in files:
            result.append(os.path.join(root, name))
    return result
And this will match a pattern:
import os, fnmatch

def find(pattern, path):
    result = []
    for root, dirs, files in os.walk(path):
        for name in files:
            if fnmatch.fnmatch(name, pattern):
                result.append(os.path.join(root, name))
    return result

find('*.txt', '/path/to/dir')

- Note that these examples will only find files, not directories with the same name. If you want to find **any** object in the directory with that name you might want to use `if name in files or name in dirs` – Mark E. Hamilton Oct 17 '14 at 23:29
- Be careful of case sensitivity. `for name in files:` will fail looking for `super-photo.jpg` when it's `super-photo.JPG` in the file system. (an hour of my life I'd like back ;-) Somewhat messy fix is `if str.lower(name) in [x.lower() for x in files]` – matt wilkie Dec 16 '14 at 22:53
- What about using **yield** instead of preparing the result list? `if fnmatch.fnmatch(name, pattern): yield os.path.join(root, name)` – Berci May 03 '15 at 21:26
- While this is the solution for Python, finding a file this way is about 5 times slower for me than forking and exec'ing `find`. Compare for yourself with: `os.popen('find / -name stdio.h').read().strip()` – Jun 27 '18 at 15:27
- A list comprehension can replace the function, e.g. for find_all: `res = [os.path.join(root, name) for root, dirs, files in os.walk(path) if name in files]` – Nir Jul 27 '19 at 14:48
- Is there a way to make it search for a pattern, but list only the first instance (i.e., a combination of the first-match and pattern-match code)? – florence-y Nov 05 '19 at 22:22
- @JoshuaDetwiler yeah but you're using python... who cares about performance? – hacksoi Apr 07 '20 at 19:03
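Several of the comments above (using `yield`, returning only the first pattern match, and case sensitivity) can be combined in one place. The following is only a sketch under assumptions: the names `iter_find`, `find_first` and the `ignore_case` parameter are made up for illustration and are not part of the answer. It uses `fnmatch.fnmatchcase`, which compares case-sensitively on every platform, so the case handling is explicit rather than dependent on the operating system.

```python
import fnmatch
import os


def iter_find(pattern, path, ignore_case=False):
    """Lazily yield each path under `path` whose file name matches `pattern`."""
    if ignore_case:
        pattern = pattern.lower()
    for root, dirs, files in os.walk(path):
        for name in files:
            # Compare a lowercased copy, but report the name as stored on disk.
            candidate = name.lower() if ignore_case else name
            if fnmatch.fnmatchcase(candidate, pattern):
                yield os.path.join(root, name)


def find_first(pattern, path, ignore_case=False):
    """Return the first match or None; the walk stops once a match is found."""
    return next(iter_find(pattern, path, ignore_case), None)
```

Because `iter_find` is a generator, `find_first('*.jpg', '/path/to/dir', ignore_case=True)` does not walk the rest of the tree after the first hit, while `list(iter_find(...))` still collects every match.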
In Python 3.4 or newer you can use pathlib to do recursive globbing:
>>> import pathlib
>>> sorted(pathlib.Path('.').glob('**/*.py'))
[PosixPath('build/lib/pathlib.py'),
PosixPath('docs/conf.py'),
PosixPath('pathlib.py'),
PosixPath('setup.py'),
PosixPath('test_pathlib.py')]
Reference: https://docs.python.org/3/library/pathlib.html#pathlib.Path.glob
In Python 3.5 or newer you can also do recursive globbing like this:
>>> import glob
>>> glob.glob('**/*.txt', recursive=True)
['2.txt', 'sub/3.txt']
Reference: https://docs.python.org/3/library/glob.html#glob.glob

- glob works great, and we don't need to pass a directory as a parameter if we are searching in the current directory – Liker777 Aug 14 '22 at 04:25
- Excellent answer. It is also possible to glob for a relative path, e.g., `glob.glob('**/path/to/data/*.txt', recursive=True) -> ['/pwd/path/to/data/2.txt', '/pwd/path/to/data/sub/3.txt']` By default, `glob.glob()` uses `os.getcwd()` as root dir. Later versions of Python allow root dir override. Else, use `os.chdir()` to set PWD, then glob, glob, glob! – kevinarpe Nov 16 '22 at 08:16
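To tie this back to the original question (pass a file name and a directory tree), here is a minimal sketch built on `pathlib.Path.rglob`, which searches recursively from an explicit root instead of the current directory. The function names `find_all` and `find_first` are assumptions chosen for illustration. Note that `rglob` treats `name` as a glob pattern and matches directories as well as files.

```python
from pathlib import Path


def find_all(name, root):
    """Return a sorted list of every path under `root` matching `name`."""
    return sorted(Path(root).rglob(name))


def find_first(name, root):
    """Return one matching path, or None if nothing matches."""
    # rglob() is lazy, so next() stops the search at the first hit.
    return next(Path(root).rglob(name), None)
```

For example, `find_all('*.txt', '/path/to/dir')` returns `Path` objects rooted at `/path/to/dir`. The order in which `rglob` yields entries is not something to rely on, so `find_first` returns some match rather than a specific one.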
I used a version of os.walk and on a larger directory got times around 3.5 sec. I tried two random solutions with no great improvement, then just did:
import subprocess
paths = [line[2:] for line in subprocess.check_output("find . -iname '*.txt'", shell=True).splitlines()]
While it's POSIX-only, I got 0.25 sec.
From this, I believe the search as a whole could be optimised considerably in a platform-independent way, but this is where I stopped my research.
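One platform-independent direction that could be tried is walking the tree with `os.scandir` directly and matching names case-insensitively, like `-iname`. This is only a sketch: the function name `iter_matches_scandir` is hypothetical, it has not been benchmarked against the timings above, and since `os.walk` itself has been built on `os.scandir` since Python 3.5, any gain may be small.

```python
import fnmatch
import os


def iter_matches_scandir(pattern, root):
    """Yield paths under `root` whose name matches `pattern`, ignoring case."""
    pattern = pattern.lower()
    stack = [root]  # directories still to visit
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    # Real directories are queued; symlinked ones are not followed.
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
                    elif fnmatch.fnmatchcase(entry.name.lower(), pattern):
                        yield entry.path
        except OSError:
            # Unreadable or vanished directory: skip it and keep going.
            continue
```

Usage mirrors the shell command: `list(iter_matches_scandir('*.txt', '.'))`. Timing both approaches on your own directory tree is the only reliable way to know which is faster there.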

If you are using Python on Ubuntu and only need it to work on Ubuntu, a substantially faster way is to use the terminal's `locate` program, like this:
import subprocess

def find_files(file_name):
    command = ['locate', file_name]
    output = subprocess.Popen(command, stdout=subprocess.PIPE).communicate()[0]
    output = output.decode()
    # splitlines() avoids the empty trailing entry that split('\n') leaves
    search_results = output.splitlines()
    return search_results
`search_results` is a list of the absolute file paths. This is tens of thousands of times faster than the methods above; for one search I ran, it was ~72,000 times faster.
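One caveat when using `locate` this way: it consults a periodically rebuilt database and, for a plain name, matches anywhere in the path, so the results can include stale entries and partial matches such as `notes.txt.bak`. A small post-filter can narrow them down. This is a sketch, and the helper name `exact_matches` is an assumption, not part of the answer.

```python
import os


def exact_matches(paths, file_name):
    """Keep paths whose last component is exactly `file_name` and that still exist."""
    return [
        p for p in paths
        if p and os.path.basename(p) == file_name and os.path.exists(p)
    ]
```

It would be applied to the list returned above, e.g. `exact_matches(find_files('stdio.h'), 'stdio.h')`, at the cost of one `os.path.exists` call per candidate.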

If you are working with Python 2, you may hit infinite recursion on Windows caused by self-referring symlinks.
This script avoids following them. Note that it is Windows-specific!
import os
from scandir import scandir  # the scandir backport package for Python 2
import ctypes

def is_sym_link(path):
    # http://stackoverflow.com/a/35915819
    FILE_ATTRIBUTE_REPARSE_POINT = 0x0400
    return os.path.isdir(path) and (ctypes.windll.kernel32.GetFileAttributesW(unicode(path)) & FILE_ATTRIBUTE_REPARSE_POINT)

def find(base, filenames):
    hits = []

    def find_in_dir_subdir(direc):
        content = scandir(direc)
        for entry in content:
            if entry.name in filenames:
                hits.append(os.path.join(direc, entry.name))
            elif entry.is_dir() and not is_sym_link(os.path.join(direc, entry.name)):
                try:
                    find_in_dir_subdir(os.path.join(direc, entry.name))
                except UnicodeDecodeError:
                    print "Could not resolve " + os.path.join(direc, entry.name)
                    continue

    if not os.path.exists(base):
        return
    else:
        find_in_dir_subdir(base)
    return hits
It returns a list with all paths that point to files in the filenames list. Usage:
find("C:\\", ["file1.abc", "file2.abc", "file3.abc", "file4.abc", "file5.abc"])

Below we use a boolean `first` argument to switch between the first match and all matches (the default, which is equivalent to `find . -name file`):
import os

def find(root, file, first=False):
    for d, subD, f in os.walk(root):
        if file in f:
            print("{0} : {1}".format(file, d))
            if first:
                break

This answer is very similar to the existing ones, but slightly optimized.
You can find any files or folders by pattern (an object with a `match` method, such as a compiled regular expression):
import os

def iter_all(pattern, path):
    return (
        os.path.join(root, entry)
        for root, dirs, files in os.walk(path)
        for entry in dirs + files
        if pattern.match(entry)
    )
or by substring:
def iter_all(substring, path):
    return (
        os.path.join(root, entry)
        for root, dirs, files in os.walk(path)
        for entry in dirs + files
        if substring in entry
    )
or using a predicate:
def iter_all(predicate, path):
    return (
        os.path.join(root, entry)
        for root, dirs, files in os.walk(path)
        for entry in dirs + files
        if predicate(entry)
    )
To search only files or only folders, replace `dirs + files` with just `dirs` or just `files`, depending on what you need.
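The predicate form is the most general of the three, since the other two can be expressed through it. Here is a short sketch of how it might be called. The names `is_report` and `is_named_data` and the file-name pattern are hypothetical, chosen only for illustration.

```python
import os
import re


def iter_all(predicate, path):
    """Lazily yield every file or folder under `path` accepted by `predicate`."""
    return (
        os.path.join(root, entry)
        for root, dirs, files in os.walk(path)
        for entry in dirs + files
        if predicate(entry)
    )


# The bound .match method of a compiled regular expression works as a predicate.
is_report = re.compile(r"report_\d{4}\.csv$").match


def is_named_data(entry):
    """Exact-name test; matches a file or a folder called 'data'."""
    return entry == "data"
```

Since `iter_all` returns a generator, wrap the call in `list(...)` to collect everything, or use `next(iter_all(is_report, '.'), None)` to stop at the first hit.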
Regards.

SARose's answer worked for me until I updated from Ubuntu 20.04 LTS. The slight change I made to his code makes it work on the latest Ubuntu release.
import subprocess

def find_files(file_name):
    command = ['locate' + ' ' + file_name]
    output = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True).communicate()[0]
    output = output.decode()
    search_results = output.split('\n')
    return search_results

- Python itself is able to find files without using subprocess to call Unix commands – OneCricketeer Jul 31 '22 at 15:48
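If the shell-based variant is used anyway, note that with `shell=True` the file name becomes part of a shell command line, so spaces or characters such as `;` in the name change what gets executed. A hedged sketch of one way to guard against that uses `shlex.quote` from the standard library. The helper name `build_locate_command` is an assumption for illustration, and the sketch assumes a POSIX shell with `locate` installed.

```python
import shlex
import subprocess


def build_locate_command(file_name):
    """Build the shell command string with the file name safely quoted."""
    return "locate " + shlex.quote(file_name)


def find_files(file_name):
    """Run locate through the shell and return its output lines."""
    completed = subprocess.run(
        build_locate_command(file_name),
        shell=True,
        stdout=subprocess.PIPE,
    )
    return completed.stdout.decode().splitlines()
```

With this, a name like `my file.txt` is passed to `locate` as a single argument instead of being split into two.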
@F.M.F's answer has a few problems in Python 3, so I made a few adjustments to make it work.
import os
from os import scandir
import ctypes

def is_sym_link(path):
    # http://stackoverflow.com/a/35915819
    FILE_ATTRIBUTE_REPARSE_POINT = 0x0400
    return os.path.isdir(path) and (ctypes.windll.kernel32.GetFileAttributesW(str(path)) & FILE_ATTRIBUTE_REPARSE_POINT)

def find(base, filenames):
    hits = []

    def find_in_dir_subdir(direc):
        content = scandir(direc)
        for entry in content:
            if entry.name in filenames:
                hits.append(os.path.join(direc, entry.name))
            elif entry.is_dir() and not is_sym_link(os.path.join(direc, entry.name)):
                try:
                    find_in_dir_subdir(os.path.join(direc, entry.name))
                except UnicodeDecodeError:
                    print("Could not resolve " + os.path.join(direc, entry.name))
                    continue
                except PermissionError:
                    print("Skipped " + os.path.join(direc, entry.name) + ". I lacked permission to navigate")
                    continue

    if not os.path.exists(base):
        return
    else:
        find_in_dir_subdir(base)
    return hits
`unicode()` became `str()` in Python 3, so I made that adjustment (in `is_sym_link`).
I also added an `except` clause for `PermissionError`. This way, the program won't stop if it finds a folder it can't access.
Finally, a small warning: when running the program, even if you are looking for a single file or directory, make sure you pass it as a list. Otherwise, you will get many results that do not necessarily match your search.
Example of use:
find("C:\\", ["Python", "Homework"])
or
find("C:\\", ["Homework"])
But, for example, find("C:\\", "Homework") will give unwanted results.
I would be lying if I said I know why this happens. Again, this is not my code and I just made the adjustments I needed to make it work. All credit should go to @F.M.F.
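A likely explanation, based on how Python's `in` operator works rather than on anything the original author stated: the code tests `entry.name in filenames`. When `filenames` is a list, `in` compares whole names; when it is a single string, `in` performs a substring test, so any entry whose name occurs inside `"Homework"` (for example a folder called `Home` or `work`) counts as a hit. The sketch below shows the difference, together with a hypothetical helper, `normalize_filenames`, that could be called at the top of `find` to accept both forms.

```python
def normalize_filenames(filenames):
    """Wrap a lone string in a list so that `in` compares whole names."""
    if isinstance(filenames, str):
        return [filenames]
    return list(filenames)


# Substring test: an entry named "Home" matches the string "Homework".
substring_hit = "Home" in "Homework"

# Whole-name test: the same entry does not match the one-element list.
list_hit = "Home" in normalize_filenames("Homework")
```

Here `substring_hit` is `True` and `list_hit` is `False`, which matches the unwanted results described above.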
