Two regexes:
\d{4}(?:\d?|\d{2})(?:\d?|\d{2})\.faifb1p16m2\.nc
\d{8}\.faifb1p16m2\.nc
Sample data:
- 20140131.faifb1p16m2.nc
- 2014131.faifb1p16m2.nc
- 201412.faifb1p16m2.nc
- 201411.faifb1p16m2.nc
- 20141212.faifb1p16m2.nc
- 2014121.faifb1p16m2.nc
- 201411.faifb1p16m2.nc
The first regex will match all 7 of those entries. The second regex will match only 1, and 5. I probably made the regexes way more complicated than I needed to.
You're going to want the second regex, but I'm just listing the first as an example.
from glob import glob
import re
re1 = r'\d{4}(?:\d?|\d{2})(?:\d?|\d{2})\.faifb1p16m2\.nc'
re2 = r'\d{8}\.faifb1p16m2\.nc'
l = [f for f in glob('*.faifb1p16m2.nc') if re.search(re1, f)]
m = [f for f in glob('*.faifb1p16m2.nc') if re.search(re2, f)]
print l
print
print m
#Then, suppose you want to filter and select everything with '12' in the list m
print filter(lambda x: x[4:6] == '12', m)
As another similar solution shows you can ditch glob for os.listdir(), so:
l = [f for f in glob('*.faifb1p16m2.nc') if re.search(re1, f)]`
Becomes:
l = [f for f in os.listdir() if re.search(re1, f)]
And then the rest of the code is great. One of the great things about using glob is that you can use iglob
which is just like glob, but as an iterator, which can help with performance when going through a directory with lots of files.
One more thing, here's another stackoverflow post with an overview of python's infamous lambda feature. It's often used for the functions map
, reduce
, filter
, and so on.