19

I'm trying to loop over all files matching a certain extension, including those inside hidden folders. So far I haven't found a way to do this with iglob. This works for all folders except those starting with a dot:

import glob
for filename in glob.iglob('/path/**/*.ext', recursive=True):
    print(filename)

I have tried adding the dot as an optional character, to no avail. I'd really like to use glob instead of resorting to os.walk.

How can I include all files/folders, even those starting with ., with glob?

wjandrea
SeeDoubleYou

5 Answers

12

I had this same issue and wished glob.glob had an optional parameter to include dot files. I wanted to include ALL dot files in ALL directories, including directories whose names start with a dot. It's just not possible to do this with glob.glob. However, I found that the standard pathlib module has a glob method that behaves differently: it includes dot files. It also returns Path objects rather than a list of strings, so I used the following to convert them:

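import pathlib

pattern = '**/*.ext'  # example pattern; substitute your own
# pathlib's glob yields Path objects, so convert each one to a string
files = [str(path) for path in pathlib.Path(".").glob(pattern)]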

The other noticeable difference I found concerned glob patterns ending with **. Such a pattern returned nothing in the pathlib version but returned all the files with glob.glob. To get the same results, I added a line that checks whether the pattern ends with ** and, if so, appends /* to it.
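The conversion to strings and the trailing ** check described above could be combined roughly as follows. This is only a sketch: the function name glob_with_hidden and its arguments are made up for illustration and are not part of pathlib.

```python
import pathlib


def glob_with_hidden(root, pattern):
    """Return paths under root matching pattern, dot files included, as strings.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Assumption: a trailing '**' should behave as it does with glob.glob,
    # so '/*' is appended to make pathlib return the entries themselves.
    if pattern.endswith('**'):
        pattern += '/*'
    # pathlib yields Path objects; convert them to plain strings.
    return sorted(str(path) for path in pathlib.Path(root).glob(pattern))
```

Calling glob_with_hidden('/path', '**/*.ext') would then return matching files from hidden and non-hidden directories alike.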

The following code is a replacement for your example that includes the files in directories starting with a dot:

import pathlib
for fileref in pathlib.Path('/path/').glob('**/*.ext'):
    filename = str(fileref)
    print(filename)
waterjuice
8

From https://docs.python.org/3/library/glob.html

Note that unlike fnmatch.fnmatch(), glob treats filenames beginning with a dot (.) as special cases

If the directory contains files starting with . they won’t be matched by default. For example, consider a directory containing card.gif and .card.gif:

import glob  
glob.glob('*.gif') # ['card.gif']  
glob.glob('.c*') # ['.card.gif']

From what I can see, it takes two separate globs to get both hidden and non-hidden files, for example using https://stackoverflow.com/a/4829130/4130619.
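As a rough illustration of the two-glob idea for a single directory (no recursion), the results of a normal pattern and a dot-prefixed pattern can simply be concatenated. The helper name glob_visible_and_hidden is hypothetical.

```python
import glob
import os


def glob_visible_and_hidden(directory, suffix):
    """Return files in directory ending with suffix, hidden ones included.

    Hypothetical helper: covers one directory level only.
    """
    # '*' + suffix skips names that start with a dot ...
    visible = glob.glob(os.path.join(directory, '*' + suffix))
    # ... so a second pattern with an explicit leading dot picks those up.
    hidden = glob.glob(os.path.join(directory, '.*' + suffix))
    return sorted(visible + hidden)
```

With the card.gif and .card.gif example, glob_visible_and_hidden('.', '.gif') should return both files.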

Basj
reducing activity
5

Adding an answer for the bounty question; getting the result of hidden and non-hidden files in a single command.

As @reducing activity mentioned, glob treats dot files as a special case. To get both regular and hidden files in a single loop, we can use itertools.chain with glob.iglob iterators. For example:

→ ls -A
.chen     file.text so1.py

>>> import glob, itertools
>>> for i in itertools.chain(glob.iglob('**'), glob.iglob('.**')):
...     print(i)
...
file.text
so1.py
.chen

# If you want it as a variable, you can list() it.
>>> l = list(itertools.chain(glob.iglob('**'), glob.iglob('.**')))
>>> l
['file.text', 'so1.py', '.chen']
>>>

Note: it does not fully work (yet). Let's say you have .hello, .dot/hello.txt, .dot/.hello.txt, nodot/hello.txt, nodot/.hello.txt. Then neither:

itertools.chain(glob.iglob('**', recursive=True), glob.iglob('.**', recursive=True))

nor

itertools.chain(glob.iglob('**/*', recursive=True), glob.iglob('.**/*', recursive=True))

give all files.
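For the layout described in this note, one fallback is to visit each directory explicitly instead of relying on glob. The sketch below swaps in os.walk together with fnmatch.fnmatch, which does not special-case a leading dot; the function name iter_all_files is made up for illustration.

```python
import fnmatch
import os


def iter_all_files(root, pattern='*'):
    """Yield every file under root whose name matches pattern, hidden or not.

    Hypothetical helper: uses os.walk rather than glob.
    """
    # os.walk descends into every directory, including dot directories.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # fnmatch.fnmatch does not treat a leading dot specially.
            if fnmatch.fnmatch(name, pattern):
                yield os.path.join(dirpath, name)
```

Run against the five files listed above, list(iter_all_files('.')) should contain all of them.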

Basj
Chen A.
  • @Basj `recursive` is a function parameter of `iglob`, so yes it works. – Chen A. Dec 09 '20 at 09:51
  • It does not fully work @ChenA.: let's say you have `.hello`, `.dot/hello.txt`, `.dot/.hello.txt`, `nodot/hello.txt`, `nodot/.hello.txt`. Then `itertools.chain(glob.iglob('**', recursive=True), glob.iglob('.**', recursive=True))` does not take all of them. – Basj Dec 09 '20 at 09:56
  • I see. In this case you would need to iterate each directory. – Chen A. Dec 09 '20 at 12:32
  • I added this important note (for future readers) in your answer @ChenA. – Basj Dec 09 '20 at 12:39
4

From Python 3.11 onward, it is possible to do:

glob.iglob('/path/*', include_hidden=True)
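Applied to the recursive pattern from the question, this might look like the sketch below. It assumes Python 3.11 or later; the helper name iter_ext_files is hypothetical, and '/path' and '.ext' are the placeholders used in the question.

```python
import glob
import os


def iter_ext_files(root, ext):
    """Iterate over files under root ending with ext, hidden ones included.

    Hypothetical helper: requires Python 3.11+ for include_hidden.
    """
    pattern = os.path.join(root, '**', '*' + ext)
    # recursive=True lets '**' span nested directories;
    # include_hidden=True lets the wildcards match names starting with a dot.
    return glob.iglob(pattern, recursive=True, include_hidden=True)
```

For example, `for filename in iter_ext_files('/path', '.ext'): print(filename)` would be the counterpart of the loop in the question.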
-1

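To find hidden files matching a certain extension, you can try this (recursive=True is needed for ** to match nested directories; without it, ** behaves like *):

glob.glob('/path/**/.*.ext', recursive=True)

If you want to find all files in a folder, hidden ones included: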

glob.glob('/path/*') + glob.glob('/path/.*')
吴天宇