I have files that are named as follows:
file-001-001.dat
file-001-002.dat
file-001-003.dat
file-001-004.dat
file-001-005.dat
file-002-001.dat
file-002-002.dat
file-002-003.dat
file-002-004.dat
file-003-001.dat
file-003-002.dat
file-003-003.dat
file-003-004.dat
file-003-005.dat
file-003-006.dat
file-003-007.dat
file-003-008.dat
file-999-010.dat
I am trying to count the files that share the same first number. For example, the code should report 5 files for 001, 4 for 002, ... and 1 for 999.
I managed to do this with the following code, which counts the files in the 'file_count' folder:
from collections import Counter
import numpy as np
import os
import re

data_folders = []
data_files = []
for root, directories, files in sorted(os.walk('./file_count')):
    # Keep only the .dat files
    files = sorted(f for f in files if os.path.splitext(f)[1] == '.dat')
    for file in files:
        data_folders.append(root)
        # Numbers found in the name, e.g. ['001', '002'] for file-001-002.dat
        numbers = re.findall(r"[-+]?\d*\.\d+|\d+", file)
        data_files.append((numbers[-2].zfill(3), numbers[-1].zfill(3),
                           os.path.join(root, file)))
data_folders = np.unique(data_folders)
data_files = sorted(data_files)
a = np.array(data_files)
print(a[:, 0])  # first number of every file
c = Counter(a[:, 0])
print(c['001'])  # number of files whose first number is 001
Is there a simpler and more efficient way to do this? Is there a built-in function that solves it?
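For reference, here is a minimal sketch of the kind of simplification I have in mind. It assumes every name follows the file-XXX-YYY.dat pattern exactly, so the first number can be taken by splitting on '-'. The helper names `count_by_prefix` and `count_in_folder` are made up for illustration, and a name without a '-' would raise an IndexError.

```python
from collections import Counter
import os


def count_by_prefix(names):
    """Count .dat names by their first number, e.g. '001' in file-001-002.dat."""
    # Assumption: every .dat name looks like file-XXX-YYY.dat
    return Counter(name.split('-')[1] for name in names
                   if name.endswith('.dat'))


def count_in_folder(folder):
    """Apply count_by_prefix to every file found under folder."""
    return count_by_prefix(name
                           for _root, _dirs, files in os.walk(folder)
                           for name in files)


# Usage on a small sample of the names listed above
sample = ['file-001-001.dat', 'file-001-002.dat',
          'file-002-001.dat', 'file-999-010.dat']
counts = count_by_prefix(sample)
print(counts['001'])  # 2 for this sample
```

On the real folder, `count_in_folder('./file_count')` would return the same kind of Counter, without NumPy or a regular expression.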