Removing non-matching filenames from a list - Python

Question

I'm reading the file names from a directory and then truncating the output so that I only keep the first 10 characters of each filename. As a result, the output contains duplicates and file names I don't want included. This is my code:

import os

for path, subdirs, files in os.walk(r'C:\\Users\User\Documents'):
    for filename in files:
        f = os.path.join(path, filename)
        print(str(f)[25:35])

This returns a list like this:

NUMBER0001
NUMBER0002
NUMBER0003
XXXXXXXX11
XXXXXXXX11
XXXXXXXX11

Expected Output:

NUMBER0001
NUMBER0002
NUMBER0003

How do I remove the files that don't start with 'NUMBER' from the list, or alternatively, how do I remove the files that have XXX at the start?

Also, is it possible to sort the output so that it is in order of creation date? (The numbers at the end of the file name do not reflect the order in which the files were created.)

Not sure I'm clear - are we removing *duplicates*, or names that *don't match the specified pattern*? — Karl Knechtel, Jan 16 '19 at 08:45

score 1 · Accepted Answer · answered Jan 16 '19 at 08:47

The code below prints only the filenames that contain "NUMBER", which for the list shown means the ones with the "NUMBER" prefix.

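import os

for path, subdirs, files in os.walk(r'C:\\Users\User\Documents'):
    for filename in files:
        # Skip any file whose name does not contain 'NUMBER'
        if 'NUMBER' in filename:
            f = os.path.join(path, filename)
            # Print the same 10-character slice of the full path as in the question
            print(f[25:35])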
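
A stricter variant of the filter above, as a minimal sketch. It assumes the wanted names always begin with 'NUMBER' and that the first 10 characters of the filename itself (rather than a fixed slice of the full path) are what should be kept. The function name matching_prefixes and its parameters are hypothetical, chosen only for illustration. It uses str.startswith instead of `in`, and a set to skip the duplicates that truncation produces.

```python
import os


def matching_prefixes(root, prefix="NUMBER", length=10):
    """Collect the first `length` characters of each filename under `root`
    that starts with `prefix`, skipping repeats."""
    seen = set()
    result = []
    for path, subdirs, files in os.walk(root):
        for filename in files:
            # startswith rejects names such as "XXXXNUMBER001",
            # which a plain `'NUMBER' in filename` test would accept.
            if filename.startswith(prefix):
                short = filename[:length]
                if short not in seen:  # drop duplicates created by truncation
                    seen.add(short)
                    result.append(short)
    return result
```

Calling matching_prefixes(r'C:\Users\User\Documents') and printing each returned item should give one line per distinct truncated name. Slicing the filename directly also avoids depending on the length of the directory path.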

gireesh4manu


The .startswith method appears to be better suited for this purpose, since it avoids matching names like "XXXXNUMBER001". Source: https://stackoverflow.com/questions/7539959/finding-whether-a-string-starts-with-one-of-a-lists-variable-length-prefixes – Kryesec Jan 16 '19 at 09:12
@Kryesec I agree. Although I thought an "if" condition would suffice as per the list that JackU has shown us. – gireesh4manu Jan 16 '19 at 09:14
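
The question also asks about ordering by creation date, which the code above does not address. One possible approach, sketched under assumptions: collect the matching paths first, then sort them with os.path.getctime as the key. The function name files_by_creation_time is hypothetical. Note that getctime reports the creation time on Windows, but on Unix it is the time of the last metadata change, so this is only reliable for the Windows setup shown in the question.

```python
import os


def files_by_creation_time(root, prefix="NUMBER"):
    """Return full paths of files under `root` whose names start with
    `prefix`, ordered from oldest to newest by os.path.getctime."""
    matches = []
    for path, subdirs, files in os.walk(root):
        for filename in files:
            if filename.startswith(prefix):
                matches.append(os.path.join(path, filename))
    # getctime is the creation time on Windows; on Unix it is the time of
    # the last metadata change, so the ordering is only approximate there.
    return sorted(matches, key=os.path.getctime)
```

The truncated names can then be printed in that order with os.path.basename(p)[:10] for each returned path p.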
