
I want to know whether it is good practice to use `yield` for file handling.

Currently I use a function `f()` which iterates over a list of open file objects and yields each one.


import glob
import os

files = []
for myfile in glob.glob(os.path.join(path, '*.ext')):
    f = open(myfile, 'r')
    files.append(f)

def f():
    for f_obj in files:
        yield f_obj
        f_obj.close()

for f_obj in f():
    ...  # do some processing on the file

Is this a proper way of handling files in Python?

traintraveler
    What is `files`? Is it a list of strings (file paths) or are they already open file objects? – SyntaxVoid Aug 28 '19 at 15:58
  • Possible duplicate of [How to iterate over the file in python](https://stackoverflow.com/questions/5733419/how-to-iterate-over-the-file-in-python) – juanpa.arrivillaga Aug 28 '19 at 17:16
  • They are open file objects, I want to know whether it's safe to use yield like this, since the function f is suspended and it returns back to the next line. – traintraveler Aug 29 '19 at 04:02
  • This is too vague. I'd be a lot more worried about where you get `files` from, and how you manage the resources (as well as why you have an iterable of open handles in the first place). – Mad Physicist Aug 29 '19 at 04:10
  • As in say, files is a list of open file objects which I took by walking through a directory. I've added it to the code. – traintraveler Aug 29 '19 at 05:14
  • Note that in your code example, I believe `f_obj.close()` is only called for the first `f_obj` once the second `f_obj` is retrieved. So the last of your opened files will never be closed if you don't completely empty the generator. I have not tested this claim. – lucidbrot Apr 21 '23 at 11:05
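
The close-timing behaviour described in the last comment can be checked directly. The following is a small sketch (the temporary files and their contents are made up for the demonstration): each file is closed only when the generator is resumed to fetch the next one, so the last file stays open if the generator is not exhausted.

```python
import os
import tempfile

# Create two small throwaway files to stand in for the glob results.
paths = []
for text in ("first", "second"):
    fd, p = tempfile.mkstemp(suffix=".ext")
    os.write(fd, text.encode())
    os.close(fd)
    paths.append(p)

files = [open(p, "r") for p in paths]

def f():
    for f_obj in files:
        yield f_obj
        f_obj.close()  # runs only when the generator is resumed

gen = f()
first = next(gen)
print(first.closed)   # False: the first file is still open while we use it
second = next(gen)
print(first.closed)   # True: resuming the generator closed the first file
print(second.closed)  # False: stays open unless the generator is exhausted

# Clean up
second.close()
for p in paths:
    os.remove(p)
```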

1 Answer


If you are reading a text file (e.g. CSV), `yield` is quite appropriate for turning your source into a generator. One problem with your code is that `f_obj` is very generic, so the `f()` function does not accomplish much.

A simple example of `yield` is reading a CSV file: `lines()` turns your filename into a stream of individual lines. The aim is the same as in your code: keep less data in memory.

import csv
from pathlib import Path

def lines(path):
    # Yield rows one at a time instead of loading the whole file.
    with open(path, 'r', newline='') as f:
        for line in csv.reader(f):
            yield line

# Demo: write a small CSV file, then read it lazily.
Path("file.txt").write_text("a,1,2\nb,20,40")

gen = lines("file.txt")
print(next(gen))  # ['a', '1', '2']
print(next(gen))  # ['b', '20', '40']

The way you originally use `yield`, generically over already-open file objects, I think lacks intent and leaves a reader of the code wondering why it is being done; but that is subjective.
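
If you do want to iterate over several files, a safer variant of the original pattern is to open each file inside the generator with a `with` block. This is a sketch (the directory path and `*.ext` pattern are carried over from the question, and `open_files` is a made-up name): the `with` block guarantees each file is closed when the loop advances, when the generator is exhausted, or when the generator itself is closed.

```python
import glob
import os

def open_files(path):
    # Open each file lazily; `with` closes it even if the
    # consumer stops iterating early and the generator is closed.
    for name in glob.glob(os.path.join(path, "*.ext")):
        with open(name, "r") as f_obj:
            yield f_obj

for f_obj in open_files("."):
    ...  # process the open file here
```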

Evgeny
  • What about binary files? Why are they not appropriate? – Mad Physicist Aug 29 '19 at 04:33
  • Why wouldn't they be? My point is that with a binary file you would open it with 'rb' and deal with item delimiters in a different way. I don't have an immediate example of a sequence from a binary file one might wrap in a generator, but that can surely be a case. – Evgeny Aug 29 '19 at 04:59