0

How can I find all .yml and .yaml files using a single glob pattern? Desired output:

>>> import os, glob
>>> os.listdir('.')
['f1.yaml', 'f2.yml', 'f0.txt']
>>> glob.glob(pat)
['f1.yaml', 'f2.yml']

Attempts that don't work:

>>> glob.glob("*.ya?ml")
[]
>>> glob.glob("*.y[a]ml")
['f1.yaml']

Current workaround is globbing twice, but I want to know if this is possible with a single pattern.

>>> glob.glob('*.yaml') + glob.glob('*.yml')
['f1.yaml', 'f2.yml']

I'm not looking for more workarounds. If this is not possible with a single pattern, then I'd like to see an answer like "glob cannot find .yaml and .yml files with a single pattern because of these reasons..."

no step on snek
  • 1
    Does this answer your question? [Regular expression usage in glob.glob?](https://stackoverflow.com/questions/13031989/regular-expression-usage-in-glob-glob) – JRiggles Aug 16 '23 at 17:14
  • glob can only handle glob patterns. If you need to use richer pattern matching, you can iterate the directory tree and filter results yourself. – Brian61354270 Aug 16 '23 at 17:30
  • @JRiggles No it doesn't. I ask if there is a glob/fnmatch pattern which matches files with .yml or .yaml extension, and the regex question doesn't talk about that. Of course you can match this with regex if you wanted to use regex... – no step on snek Aug 16 '23 at 21:43
  • Short answer then: no, there isn't. – JRiggles Aug 17 '23 at 11:31

3 Answers

1

You can build a list containing all .yml and .yaml files like this:

files = []
for ext in ('*.yml', '*.yaml'):
    files.extend(glob.glob(ext))

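If you want to search all subdirectories recursively instead of just the current directory, prefix each pattern with **/ (for example '**/*.yml') and call glob.glob(pattern, recursive=True). The recursive flag alone has no effect unless the pattern contains **.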
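For the recursive case, here is a minimal sketch. It assumes the patterns are prefixed with `**/`, since `recursive=True` only changes how `**` is interpreted. The helper name `find_yaml_files` and its `root` parameter are made up for illustration, and it still globs once per extension.

```python
import glob
import os


def find_yaml_files(root="."):
    """Collect .yml and .yaml files under root, including subdirectories.

    Hypothetical helper for illustration; it calls glob.glob once per extension.
    """
    files = []
    for ext in ("*.yml", "*.yaml"):
        # "**" matches zero or more nested directories only when recursive=True
        pattern = os.path.join(root, "**", ext)
        files.extend(glob.glob(pattern, recursive=True))
    return sorted(files)
```

Calling `find_yaml_files(".")` returns paths relative to the current directory, including matches in nested folders.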

Simon1
  • 2
    This still calls glob.glob twice. It looks like just a different way to do the same thing as `glob.glob('*.yaml') + glob.glob('*.yml')` from the question. – no step on snek Aug 16 '23 at 21:41
  • 1
    There isn't a way to use a single ``glob.glob()`` statement without the potential for finding incorrect file extensions like the other suggested answer. This suggestion is cleaner than writing ``glob.glob()`` twice and can be expanded easily. – Simon1 Aug 17 '23 at 13:12
1

Glob does not handle regular expressions, but we can further refine its output. First, we glob with a broad pattern:

raw = glob.glob("*.y*ml")

This matches *.yml and *.yaml, but it also matches names such as *.you-love-saml or *.yoml. The next step is to filter the raw list down to the two extensions we want:

result = [f for f in raw if f.endswith((".yml", ".yaml"))]
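
Put together, the two steps might look like the sketch below. The function name `glob_yaml`, the `directory` parameter, and the `YAML_SUFFIXES` constant are illustrative names, not part of the glob module.

```python
import glob
import os

# Hypothetical constant: the exact suffixes we want to keep
YAML_SUFFIXES = (".yml", ".yaml")


def glob_yaml(directory="."):
    """Glob once with a deliberately broad pattern, then keep exact suffixes."""
    raw = glob.glob(os.path.join(directory, "*.y*ml"))
    # str.endswith accepts a tuple, so one call checks both extensions
    return sorted(f for f in raw if f.endswith(YAML_SUFFIXES))
```

This calls glob.glob only once, at the cost of a second filtering pass over the results.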
Hai Vu
0

Use glob.glob('*.y*ml'). This will only fail in the unlikely case that you have some other extensions that begin with y and end with ml (e.g. foo.yabcml).
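
To see the trade-off without touching the file system, here is a small sketch using fnmatch, which glob relies on for matching file names. The file names in the list are made up for illustration. fnmatch patterns only offer `*`, `?`, `[seq]` and `[!seq]`, so there is no way to mark a single character as optional or to write an alternation.

```python
import fnmatch

# Made-up file names for illustration
names = ["f1.yaml", "f2.yml", "f0.txt", "foo.yabcml"]

# A character class stands in for exactly one character, never zero,
# so "[a]" (like "?") cannot make the "a" optional.
optional_attempt = fnmatch.filter(names, "*.y[a]ml")

# "*" can match the empty string, so this pattern finds both extensions...
broad = fnmatch.filter(names, "*.y*ml")

# ...but it also lets arbitrary text through between "y" and "ml".
false_positives = [n for n in broad if not n.endswith((".yml", ".yaml"))]

print(optional_attempt)   # ['f1.yaml']
print(broad)              # ['f1.yaml', 'f2.yml', 'foo.yabcml']
print(false_positives)    # ['foo.yabcml']
```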

Barmar