The files in my directory hold monthly data spanning several years. Each file name contains a date stamp such as 0001-01-01, 0001-02-01, ..., 0005-01-01, ..., 0010-12-01 (yyyy-mm-dd) in the middle.

Now I would like to exclude, say, the 0001* files. If I write sorted(glob.glob("mydirectory/filename-000[!1]*")), I only get the 0002 to 0009 files; the 0010 files are not included.

What should I do to only exclude the 0001* files?

If I write sorted(glob.glob("mydirectory/filename-000[2-9]*")), I also get only the 0002 to 0009 files. What should I do to include the 0010* files as well?

I also tried filename-{000[2-9],00[10-12]}*, which does not work.

Thanks,

James
Lin Lin

3 Answers


Just add two globs together.

files = glob.glob("mydirectory/filename-000[!1]*") + glob.glob("mydirectory/filename-0010*")
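
As a minimal, self-contained sketch of this idea: the helper name files_excluding_year_0001, the temporary directory, and the demo file names below are assumptions made for illustration, not part of the original answer. glob.escape protects the directory part in case it contains glob metacharacters.

```python
import glob
import os
import tempfile

def files_excluding_year_0001(directory, prefix="filename-"):
    """Return sorted paths for years 000x (except 0001) plus 0010."""
    base = os.path.join(glob.escape(directory), prefix)
    # 000[!1]* matches 000 followed by any single character other than 1.
    early = glob.glob(base + "000[!1]*")
    # 0010* needs its own pattern because its third digit is 1, not 0.
    late = glob.glob(base + "0010*")
    return sorted(early + late)

# Hypothetical demo files created in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for stamp in ("0001-01-01", "0002-01-01", "0009-12-01", "0010-12-01"):
        open(os.path.join(tmp, "filename-" + stamp), "w").close()
    result = [os.path.basename(p) for p in files_excluding_year_0001(tmp)]
    print(result)
```

Each additional year range beyond 0010 would need another pattern, so this approach fits best when the set of years is small and known.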
James

Do it like this:

import glob
from pprint import pprint

pprint(sorted(glob.glob("filename-???[!1]*")))

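Here ? matches any single character, just as * matches any string of characters. Note that this pattern excludes every name whose fourth character after the prefix is 1, so it would also skip 0011 files if any existed.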

For me, this works pretty well.

(stackoverflow)  ~/PycharmProjects/stackoverflow  ls -la filename-00*
-rw-r--r--  1 dude  staff  0 Dec 11 23:31 filename-0001
-rw-r--r--  1 dude  staff  0 Dec 11 23:31 filename-0002
-rw-r--r--  1 dude  staff  0 Dec 11 23:31 filename-0003
-rw-r--r--  1 dude  staff  0 Dec 11 23:31 filename-0004
-rw-r--r--  1 dude  staff  0 Dec 11 23:31 filename-0005
-rw-r--r--  1 dude  staff  0 Dec 11 23:31 filename-0006
-rw-r--r--  1 dude  staff  0 Dec 11 23:31 filename-0007
-rw-r--r--  1 dude  staff  0 Dec 11 23:31 filename-0008
-rw-r--r--  1 dude  staff  0 Dec 11 23:31 filename-0009
-rw-r--r--  1 dude  staff  0 Dec 11 23:31 filename-0010
(stackoverflow)  ~/PycharmProjects/stackoverflow # python test123.py
['filename-0002',
 'filename-0003',
 'filename-0004',
 'filename-0005',
 'filename-0006',
 'filename-0007',
 'filename-0008',
 'filename-0009',
 'filename-0010']
(stackoverflow)  ~/PycharmProjects/stackoverflow #
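
The pattern can also be checked without touching the file system, since fnmatch applies the same shell-style pattern rules to plain strings. This is only a sketch; the names list is made up for illustration, and the 0011 entry is a hypothetical file added to show where the pattern stops being selective.

```python
import fnmatch

# Hypothetical file names; 0011 does not appear in the question's data.
names = [
    "filename-0001-01-01",
    "filename-0002-01-01",
    "filename-0010-12-01",
    "filename-0011-01-01",
]

# "?" matches exactly one character; "[!1]" matches one character other than "1".
kept = fnmatch.filter(names, "filename-???[!1]*")
print(kept)
```

Both 0001 and 0011 are dropped here because only the fourth character is tested, which is fine for data that ends at 0010.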
Alexandr Shurigin

glob supports Unix shell pattern rules but not more complex expressions. Since you are in Python, though, you can apply many other filtering techniques, including regular expressions. If my directory looks like this:

$ find
.
./mydirectory
./mydirectory/filename-0001-01-01
./mydirectory/filename-0001-02-01
./mydirectory/filename-0010-12-01
./mydirectory/filename-0005-01-01

Then

[f for f in glob.glob(r"mydirectory/filename-*") if "0001" not in f]

will return:

['mydirectory/filename-0010-12-01', 'mydirectory/filename-0005-01-01']

This is further explained in this SO answer
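
The substring test above also drops any path that happens to contain 0001 somewhere else, for example in a directory name. A regular expression that looks only at the year field is one way around that. The following is a sketch under assumptions: the helper name exclude_years and the YEAR pattern are illustrative, and the pattern assumes names of the form filename-yyyy-mm-dd as in the listing above.

```python
import re

# Capture the four-digit year that directly follows the "filename-" prefix.
YEAR = re.compile(r"filename-(\d{4})-")

def exclude_years(paths, excluded):
    """Keep paths whose year field is not in the excluded set."""
    kept = []
    for path in paths:
        match = YEAR.search(path)
        if match and match.group(1) in excluded:
            continue  # year is excluded, skip this path
        kept.append(path)
    return kept

# In practice this list would come from glob.glob("mydirectory/filename-*").
paths = [
    "mydirectory/filename-0001-01-01",
    "mydirectory/filename-0001-02-01",
    "mydirectory/filename-0010-12-01",
    "mydirectory/filename-0005-01-01",
]
filtered = exclude_years(paths, {"0001"})
print(filtered)
```

Passing a set such as {"0001", "0003"} would exclude several years at once without changing the glob pattern.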

gens