I have a directory with several hundred thousand files in it.
They all follow this format:
datetime_fileid_metadata_collect.txt
A specific example looks like this:
201405052359559_0002230255_35702088_collect88.txt
I am trying to write a script that pulls out and copies individual files when all I provide it is a list of file ids.
For example, I have a text document fileids.txt that contains this:
fileids.txt
0002230255
0001627237
0001023000
This is the example script I have written so far; the file1 result keeps returning []:
import os
import glob
import shutil

base_dir = 'c:/stuff/tub_0_data/'
destination = 'c:/files_goes_here'
os.chdir(base_dir)

text_file = open('c:/stuff/fileids.txt', 'r')
file_ids = text_file.readlines()
text_file.close()
#file_ids = [stripped for stripped in (line.strip() for line in text_file.readlines()) if stripped]

for ids in file_ids:
    id1 = ids.rstrip()  # drop the trailing newline, or the glob pattern never matches
    print 'file id = ', id1
    file1 = glob.glob('*' + id1 + '*')
    print file1
    for match in file1:  # glob returns a list of names, so copy each match
        shutil.copy(os.path.join(base_dir, match), destination)
I know I don't fully understand glob or regular expressions yet. What would I put there if I want to find files based on a specific string within the filename?
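For matching a substring anywhere in a filename, glob alone is enough; no regular expressions are needed. A minimal sketch, using a temporary directory and made-up filenames so it is self-contained (the id and names are illustrative, not from the real data set):

```python
import glob
import os
import tempfile

# Illustrative setup: two files following the datetime_fileid_metadata_collect.txt pattern
tmp = tempfile.mkdtemp()
for name in ('201405052359559_0002230255_35702088_collect88.txt',
             '201405060000000_0001627237_35702090_collect89.txt'):
    open(os.path.join(tmp, name), 'w').close()

file_id = '0002230255'
# '*' matches any run of characters, so '*' + id + '*' finds the id anywhere in the name
matches = glob.glob(os.path.join(tmp, '*' + file_id + '*'))
print(matches)
```

The pattern only works if file_id has no stray whitespace attached; an id read from a file still carries its newline unless you strip it.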
EDIT:
glob.glob('*' + stuff + '*')
worked for finding things within the filename. Not removing the trailing newline from each id was the issue.
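Putting the fix together, here is a minimal end-to-end sketch: strip each id as it is read, glob for it, and copy every match. Paths and filenames are illustrative (built in temporary directories so the sketch is runnable anywhere):

```python
import glob
import os
import shutil
import tempfile

# Illustrative setup: a source directory with sample files and a fileids.txt
base_dir = tempfile.mkdtemp()
destination = tempfile.mkdtemp()
for name in ('201405052359559_0002230255_35702088_collect88.txt',
             '201405052359559_0009999999_35702088_collect99.txt'):
    open(os.path.join(base_dir, name), 'w').close()

ids_path = os.path.join(base_dir, 'fileids.txt')
with open(ids_path, 'w') as f:
    f.write('0002230255\n')

# Read the ids, stripping the newline from each line and skipping blanks
with open(ids_path) as f:
    file_ids = [line.strip() for line in f if line.strip()]

for file_id in file_ids:
    for match in glob.glob(os.path.join(base_dir, '*' + file_id + '*')):
        shutil.copy(match, destination)

print(sorted(os.listdir(destination)))
```

Only the file whose name contains 0002230255 ends up in the destination; the other sample file is left behind.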