
In Python, I noticed that if I iterate over a list with for x in y and remove an element of y inside the loop, the last element is "skipped". I assume this is because len(y) has changed.

I'm trying to grab all files with a particular extension, except those that meet some condition.

Here's the original code:

def test_print_numTXTs(fileList):
    counter = 0
    for file in fileList:
        if file.name[-4:] == ".txt":
            counter +=1
            if file.name == "a.txt":
                fileList.remove(file)   #problem caused here
    print(counter)
    print(len(fileList))

The output for counter is one less than the total number of .txt files. Stepping through the debugger, I can see it's skipping the last iteration of the loop (I'm assuming because len(fileList) is now one less than its initial value).

The following code "works", but feels like a hack - I'm adding the files I'd like to remove from the list to a second list, then iterating over that after the fact. I've commented out the line I originally had, which caused the "skipping" of iteration(s).

def print_numTXTs(fileList):
    filesToRemoveFromList = []
    counter = 0
    for file in fileList:
        if file.name[-4:] == ".txt":
            counter +=1
            if file.name == "a.txt":
                #fileList.remove(file) #problem caused here
                filesToRemoveFromList.append(file)
    print(counter)
    for file in filesToRemoveFromList:
        fileList.remove(file)
    print(len(fileList))

This code outputs a count of all the .txt files, and the length of the list is one less than that (because the element a.txt was removed) - this is the desired behaviour.

Is there a more elegant solution to this problem?

Moley

3 Answers


You are right. You need an additional list. But there is an easier solution.

def print_numTXTs(fileList):
    counter = 0
    for file in list(fileList):   # iterate over a copy, so removing from fileList is safe
        if file.name[-4:] == ".txt":
            counter += 1
            if file.name == "a.txt":
                fileList.remove(file)
    print(counter)
    print(len(fileList))

The secret is list(fileList): it creates a second list (a shallow copy) and the loop iterates over that copy, so removing elements from fileList no longer disturbs the iteration.
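
To see why the copy matters, here is a minimal sketch comparing the two loops. The SimpleNamespace objects are hypothetical stand-ins for your file objects (I only assume they have a .name attribute), and the helper names are made up for the illustration. Note that it is the element right after the removed one that gets skipped, not necessarily the last one: the list iterator advances by index, and remove() shifts the remaining elements one position to the left.

```python
from types import SimpleNamespace


def make_files():
    # Hypothetical stand-ins for the file objects; only .name is needed.
    return [SimpleNamespace(name=n) for n in ("a.txt", "b.txt", "c.txt")]


def count_removing_in_place(files):
    """Remove a.txt while iterating over the same list (the problematic version)."""
    counter = 0
    for f in files:
        if f.name.endswith(".txt"):
            counter += 1
            if f.name == "a.txt":
                # b.txt moves to index 0, the iterator moves on to index 1 (c.txt),
                # so b.txt is never visited.
                files.remove(f)
    return counter


def count_removing_via_copy(files):
    """Remove a.txt while iterating over a shallow copy of the list."""
    counter = 0
    for f in list(files):
        if f.name.endswith(".txt"):
            counter += 1
            if f.name == "a.txt":
                files.remove(f)  # the copy being iterated is unaffected
    return counter


skipped = count_removing_in_place(make_files())   # 2: b.txt was skipped
complete = count_removing_via_copy(make_files())  # 3: every file was visited
```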

List comprehensions are just as powerful. In your example it should work like this (written quickly here, not tested):

fileList = [ file for file in fileList if file.name != "a.txt" ]
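
One caveat with the comprehension: inside a function, assigning to fileList only rebinds the local name, so the caller's list stays unchanged. If the original list object has to shrink, as remove() did in your code, slice assignment is one option. A rough sketch, where count_and_filter and the SimpleNamespace stand-ins are names I made up for the example:

```python
from types import SimpleNamespace


def count_and_filter(file_list):
    """Count the .txt files, then drop a.txt from the caller's list in place."""
    counter = sum(1 for f in file_list if f.name.endswith(".txt"))
    # Slice assignment replaces the contents of the existing list object,
    # so the change is visible to the caller as well.
    file_list[:] = [f for f in file_list if f.name != "a.txt"]
    return counter


# Hypothetical stand-ins for the file objects; only .name is needed.
files = [SimpleNamespace(name=n) for n in ("a.txt", "b.txt", "notes.md")]
txt_count = count_and_filter(files)      # 2 (a.txt and b.txt)
remaining = [f.name for f in files]      # ['b.txt', 'notes.md']
```
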
Nedy

I propose building a new list of the files to keep instead of removing from the original, which avoids the second loop:

def test_print_numTXTs(fileList):
    counter = 0
    res = []   # .txt files to keep
    for file in fileList:
        if file.name[-4:] == ".txt":
            counter += 1
            if file.name != "a.txt":
                res.append(file)   # keep every .txt file except a.txt
    print(res)

This solution works. I'll think about whether there is a more Pythonic way.


Instead of manually filtering on files ending in .txt, you can glob for files matching this pattern.

Say the folder foo contains the files:

a.txt  
b.txt  
c.txt 

And you want to count the number of *.txt files, except for a.txt:

>>> from pathlib import Path
>>> file_list = Path('foo').glob('*.txt')
>>> sum(1 for f in file_list if f.name.endswith('.txt') and f.name != 'a.txt')
2
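
If you want to try this without creating foo by hand, here is a self-contained sketch that builds a throwaway folder first. The file names (including the extra notes.md) are just assumptions for the demo. Since glob('*.txt') already restricts the matches to .txt files, the endswith check is not strictly needed and is left out here. Also keep in mind that Path.glob returns a generator, so it can only be consumed once.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    # Create a few empty files to glob over (names are made up for the demo).
    for name in ("a.txt", "b.txt", "c.txt", "notes.md"):
        (folder / name).touch()

    # Keep every .txt file except a.txt; sorted() gives a stable order.
    kept = sorted(p.name for p in folder.glob("*.txt") if p.name != "a.txt")
    count = len(kept)   # 2 (b.txt and c.txt)
```
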
BioGeek