9

Is there any built-in or straightforward way to match paths recursively with double asterisk, e.g. like zsh does?

For example, with

path = 'foo/bar/ham/spam/eggs.py'

I can use fnmatch to test it with

fnmatch(path, 'foo/bar/ham/*/*.py')

Although, I would like to be able to do:

fnmatch(path, 'foo/**/*.py')

I know that fnmatch maps its pattern to a regex, so in the worst case I can roll my own fnmatch with an additional ** pattern, but maybe there is an easier way.

fish2000
Jakub M.
  • Something like `glob.glob`? – g.d.d.c Aug 20 '13 at 17:56
  • Here's a fork that allows fnmatch * and ** https://pypi.python.org/pypi/pywildcard – Alberto Galera Dec 04 '15 at 09:44
  • You may want to check out my [answer to a similar question](https://stackoverflow.com/a/72400344/5030772). It gives a slightly modified version of `fnmatch.translate()` that supports `**` wildcards, and prevents `*` from matching across directory boundaries. – Mathew Wicks May 27 '22 at 04:14

4 Answers

8

If you look closely at the fnmatch source code, it internally converts the pattern to a regular expression, mapping * to .* (and not [^/]* or similar). As a result, it does not treat the directory separator / specially, unlike UNIX shells:

while i < n:
    c = pat[i]
    i = i+1
    if c == '*':
        res = res + '.*'
    elif c == '?':
        res = res + '.'
    elif c == '[':
        ...

Thus

>>> fnmatch.fnmatch('a/b/d/c', 'a/*/c')
True
>>> fnmatch.fnmatch('a/b/d/c', 'a/*************c')
True
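If the goal is shell-like behaviour, one option is to skip fnmatch and translate the pattern by hand. The following is a minimal sketch, not the fnmatch implementation: the names `compile_path_glob` and `path_fnmatch` are made up for illustration, it assumes `/` as the separator, and it does not handle bracket expressions such as `[abc]`.

```python
import re
from functools import lru_cache


@lru_cache(maxsize=256)
def compile_path_glob(pattern):
    """Compile a glob into a regex in which only ** may cross '/'."""
    parts = []
    i, n = 0, len(pattern)
    while i < n:
        if pattern.startswith('**/', i):
            # Zero or more whole directory components.
            parts.append(r'(?:[^/]+/)*')
            i += 3
        elif pattern.startswith('**', i):
            # Bare ** (e.g. at the end): anything, slashes included.
            parts.append(r'.*')
            i += 2
        elif pattern[i] == '*':
            # Single star stays inside one path component.
            parts.append(r'[^/]*')
            i += 1
        elif pattern[i] == '?':
            parts.append(r'[^/]')
            i += 1
        else:
            # Everything else is matched literally.
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile(''.join(parts), re.DOTALL)


def path_fnmatch(path, pattern):
    """Return True if the whole path matches the pattern."""
    return compile_path_glob(pattern).fullmatch(path) is not None


print(path_fnmatch('a/b/d/c', 'a/*/c'))                         # False
print(path_fnmatch('a/b/d/c', 'a/**/c'))                        # True
print(path_fnmatch('foo/bar/ham/spam/eggs.py', 'foo/**/*.py'))  # True
print(path_fnmatch('foo/eggs.py', 'foo/**/*.py'))               # True
```

Because `**/` is translated to "zero or more components", `foo/**/*.py` also matches `foo/eggs.py`, which is how zsh and `glob` with `recursive=True` treat it.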
  • 1
    A trap with that is that `foo/**/bar` will glob `foo/bar`, but `foo/.*/bar` won't match it. You can work around it by substituting `**/` in the original glob, converting to a regex (`fnmatch.translate`), then replacing your placeholder by `(.*/)*` which seems to be the semantics of `**/`. – Masklinn Mar 28 '23 at 09:04
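The placeholder workaround described in this comment could look roughly like the sketch below. The marker string and the function names are hypothetical; the marker is plain alphanumerics so that `fnmatch.translate` passes it through unescaped, and the sketch assumes it never occurs in a real pattern. A non-capturing `(?:.*/)*` is used in place of `(.*/)*`, which matches the same strings.

```python
import fnmatch
import re

# Hypothetical marker: letters only, so translate() leaves it untouched.
_PLACEHOLDER = 'zzGLOBSTARzz'


def translate_globstar(pattern):
    """Translate a glob to a regex, treating '**/' as zero or more directories."""
    if _PLACEHOLDER in pattern:
        raise ValueError('pattern collides with the internal placeholder')
    # Hide '**/' from fnmatch, translate the rest, then swap the regex in.
    regex = fnmatch.translate(pattern.replace('**/', _PLACEHOLDER))
    return regex.replace(_PLACEHOLDER, '(?:.*/)*')


def globstar_match(path, pattern):
    # translate() already anchors the end of the regex; match() anchors the start.
    return re.match(translate_globstar(pattern), path) is not None


print(globstar_match('foo/bar', 'foo/**/bar'))      # True
print(globstar_match('foo/a/b/bar', 'foo/**/bar'))  # True
print(globstar_match('foo/xbar', 'foo/**/bar'))     # False
```

Note that a single `*` still matches across `/` here, since the rest of the pattern goes through `fnmatch.translate` unchanged.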
8

For an fnmatch variant that works on paths, you can use a library called wcmatch, which implements a globmatch function that matches a path using the same logic glob uses to crawl a filesystem. You can control the enabled features with flags; in this case, we enable GLOBSTAR (using ** for recursive directory search).

>>> from wcmatch import glob
>>> glob.globmatch('some/file/path/filename.txt', 'some/**/*.txt', flags=glob.GLOBSTAR)
True
facelessuser
3

If you can live without using an os.walk loop, try:

glob2

formic

I personally use glob2:

import glob2
files = glob2.glob(r'C:\Users\**\iTunes\**\*.mp4')

Addendum:

As of Python 3.5, the native glob module supports recursive pattern matching:

import glob
files = glob.iglob(r'C:\Users\**\iTunes\**\*.mp4', recursive=True) 
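To see what `recursive=True` does without depending on a particular machine's layout, here is a small self-contained sketch that builds a throwaway directory tree and globs it. The helper `find_recursive` and the file names are made up for the example.

```python
import glob
import os
import tempfile


def find_recursive(root, pattern):
    """Return sorted matches for pattern under root, relative to root."""
    hits = glob.glob(os.path.join(root, pattern), recursive=True)
    # Normalise separators so the result looks the same on every platform.
    return sorted(os.path.relpath(p, root).replace(os.sep, '/') for p in hits)


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'foo', 'bar', 'ham'))
    for rel in ('foo/top.py', 'foo/bar/ham/eggs.py', 'foo/bar/notes.txt'):
        # Create empty files at the given relative paths.
        open(os.path.join(root, *rel.split('/')), 'w').close()
    matches = find_recursive(root, 'foo/**/*.py')

print(matches)  # ['foo/bar/ham/eggs.py', 'foo/top.py']
```

With `recursive=True`, `**` matches zero or more directories, so both the top-level and the nested `.py` file are found. This only works against files that actually exist on disk.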
Fnord
  • The question is about working with path (which may be virtual), while glob is working with scanning real filesystem. – Victor Gavro Nov 17 '21 at 13:45
1

This snippet adds ** support on top of fnmatch.translate:

import re
from functools import lru_cache
from fnmatch import translate as fnmatch_translate


@lru_cache(maxsize=256, typed=True)
def _compile_fnmatch(pat):
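    # fixes fnmatch for recursive ** (for compatibility with Path.glob)
    # NOTE: relies on translate() emitting '.*.*' for '**'; newer Python
    # versions appear to collapse consecutive stars, making these no-ops.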
    pat = fnmatch_translate(pat)
    pat = pat.replace('(?s:.*.*/', '(?s:(^|.*/)')
    pat = pat.replace('/.*.*/', '.*/')
    return re.compile(pat).match


def fnmatch(name, pat):
    return _compile_fnmatch(str(pat))(str(name)) is not None
Victor Gavro