
I'm trying to use a while loop to loop through .xml files in a folder. However, files are being added to the folder whilst the while loop is running. This is a shortened version of the code I currently use:

import os

my_folder = "d:\\xml\\"

while True:
    files = [f for f in os.listdir(my_folder) if f.endswith(".xml")]
    while files:
        for file in files:
            # do whatever with the xml file
            os.remove(my_folder + file)
        files = [f for f in os.listdir(my_folder) if f.endswith(".xml")]

What I would like to do is tidy up the code by only having one line filling the files list. I'd like to have something like:

while files = [f for f in os.listdir(my_folder) if f.endswith(".xml")]:

But, I know this won't work. I would imagine that Python is capable of this, but I don't know the correct syntax.

Added note: I'm using Windows 10 with Python 3.7.6

  • Possibly related: https://stackoverflow.com/q/4708511/1639625 – tobias_k Nov 11 '20 at 14:38
  • Can't you remove the inner `while` and use just `while True: files = ...; for file in files: ...`? – tobias_k Nov 11 '20 at 14:39
  • Since Python 3.8 there is the assignment expression operator `:=`, e.g. `while files := [f for f...` – Michael Butscher Nov 11 '20 at 14:42
  • I like the two loops approach better. You could use a function to make the division of jobs more clear, but the general idea is the outer loop gets a one-time snapshot, processes it, and then does time.sleep. The inner loop works on that snapshot until done. – Kenny Ostrom Nov 11 '20 at 14:43
  • I suppose it would as it would just ignore the "for file in files" loop if there were no files... –  Nov 11 '20 at 14:44
  • @fosautoparts How does your real code terminate the outer while-loop? Or does it just wait forever for new files to be added? – ekhumoro Nov 11 '20 at 15:14
  • The real code has a while loop that checks for a flag, triggered by pressing a button, to terminate it. The reason I have a nested loop is that files are being added to the folder whilst the loop is running, and I want it to finish processing all files before outputting all the results to a file that my Excel spreadsheet can use to get the results. I have had to offload API calls to Python as Excel VBA is not very good at multithreading (at least I don't know how to do it), and makes calls one by one, which takes ages. With Python I make 100 calls in the same time as 1 with VBA. –  Nov 12 '20 at 06:26

1 Answer

You could simplify your code by removing the inner while loop and the second assignment to files. This will loop indefinitely, check whether there are xml files in the directory, and if so process and delete them before continuing to loop. Files added while the for loop is running are picked up on the next pass of the outer loop. (You might also add a short sleep in case of no new files.)

while True:
    files = [f for f in os.listdir(my_folder) if f.endswith(".xml")]
    for file in files:
        # do whatever with the xml file
        os.remove(my_folder + file)
    

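The short sleep mentioned above could look roughly like the following. This is only a sketch: the function names `process_xml_files` and `watch_folder`, the `poll_interval` default, and the `should_stop` callback (standing in for the button flag described in the question's comments) are all made up for illustration, and `os.path.join` is swapped in for the string concatenation.

```python
import os
import time


def process_xml_files(folder):
    """Process and delete every .xml file currently in `folder`.

    Returns the number of files handled in this pass.
    """
    files = [f for f in os.listdir(folder) if f.endswith(".xml")]
    for file in files:
        # do whatever with the xml file
        os.remove(os.path.join(folder, file))
    return len(files)


def watch_folder(folder, poll_interval=1.0, should_stop=lambda: False):
    """Poll `folder` until should_stop() returns True (hypothetical stop flag)."""
    while not should_stop():
        if process_xml_files(folder) == 0:
            # Nothing to do right now: wait a bit instead of spinning the CPU.
            time.sleep(poll_interval)
```

Calling `watch_folder("d:\\xml\\")` would then behave like the loop above, except that it idles between polls when the folder is empty.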
As shown in the other answer, you could also use the := operator and something like the following...

while True:
    while (files := [...]):
        ...

... but this would behave exactly the same as the version without the inner while. It only makes a difference if you want to do something when there are temporarily no files left, i.e. if the outer loop contains code that is not in the inner loop.
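To illustrate the case where the inner while does matter, here is a rough sketch in which results are reported only once the folder has been drained. It needs Python 3.8 or later for `:=` (so it would not run on the 3.7.6 mentioned in the question). The names `list_xml`, `drain_then_report`, `report`, `poll_interval` and `should_stop` are assumptions made for this example, not part of the original code.

```python
import os
import time


def list_xml(folder):
    """Return the names of the .xml files currently in `folder`."""
    return [f for f in os.listdir(folder) if f.endswith(".xml")]


def drain_then_report(folder, report, poll_interval=1.0, should_stop=lambda: False):
    """Process files until none are left, then call report() once per batch."""
    while not should_stop():
        processed = []
        # Python 3.8+: assign the list and test it for emptiness in one step.
        while (files := list_xml(folder)):
            for file in files:
                # do whatever with the xml file
                os.remove(os.path.join(folder, file))
                processed.append(file)
        # This part runs only when the folder is (temporarily) empty.
        if processed:
            report(processed)
        else:
            time.sleep(poll_interval)
```

Here `report` is any callable that takes the list of processed file names, for example a function that writes the collected results to an output file.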

tobias_k
  • I suppose you could do `while (files := [...]) or True:`. But the first solution still seems better - especially if `for path in glob(...):` is used instead. – ekhumoro Nov 11 '20 at 16:53
  • @ekhumoro That `while` with `:=` would just save one line; you'd still need the `for`, and you could do the same with moving the list comprehension directly to the `for` loop head. But using `for file in glob.glob("*.xml")` is indeed a nice idea that makes it more readable. – tobias_k Nov 11 '20 at 17:17
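The glob idea from the comments might be sketched as below. Note that `glob.glob("*.xml")` on its own searches the current working directory, so this version joins the pattern onto the folder; `glob.glob` then returns paths that already include the folder, which means they can be passed to `os.remove` directly. The function name `process_with_glob` and its return value are assumptions added for illustration.

```python
import glob
import os


def process_with_glob(folder):
    """Process and delete every file matching *.xml in `folder`.

    Returns the base names of the files that were handled.
    """
    handled = []
    for path in glob.glob(os.path.join(folder, "*.xml")):
        # do whatever with the xml file
        os.remove(path)
        handled.append(os.path.basename(path))
    return handled
```

Inside the `while True:` loop of the answer, the list comprehension and the `for` loop would be replaced by a single call such as `process_with_glob(my_folder)`.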