I want to treat many files as if they were all one file. What's the proper pythonic way to take [filenames] => [file objects] => [lines] with generators, without reading an entire file into memory?
We all know the proper way to open a file:
with open("auth.log", "rb") as f:
    print(sum(1 for line in f))  # count lines lazily
And we know the correct way to link several iterators/generators into one long one:
>>> list(itertools.chain(range(3), range(3)))
[0, 1, 2, 0, 1, 2]
but how do I link multiple files together and preserve the context managers?
with open("auth.log", "rb") as f0:
    with open("auth.log.1", "rb") as f1:
        for line in itertools.chain(f0, f1):
            do_stuff_with(line)
    # f1 is now closed
# f0 is now closed
# gross
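One way to flatten that nesting while keeping the cleanup guarantees is `contextlib.ExitStack` (Python 3.3+), which enters each context manager and closes all of them when the block exits. A sketch, using throwaway temp files in place of the real log paths:

```python
import itertools
import os
import tempfile
from contextlib import ExitStack

# Stand-ins for auth.log and auth.log.1 (hypothetical contents).
tmpdir = tempfile.mkdtemp()
file_names = []
for i, text in enumerate([b"a\n", b"b\nc\n"]):
    path = os.path.join(tmpdir, "auth.log.%d" % i)
    with open(path, "wb") as f:
        f.write(text)
    file_names.append(path)

lines = []
with ExitStack() as stack:
    # enter_context registers each file for closing when the stack exits.
    handles = [stack.enter_context(open(name, "rb")) for name in file_names]
    for line in itertools.chain.from_iterable(handles):
        lines.append(line)
# Every handle is closed here, even if the loop raised.
print(lines)  # [b'a\n', b'b\n', b'c\n']
```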
I could ignore the context managers and do something like this, but it doesn't feel right:
files = itertools.chain(*(open(f, "rb") for f in file_names))
for line in files:
    do_stuff_with(line)
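Part of what feels wrong there is that `chain(*...)` unpacks the generator eagerly, so every file is opened up front and none is ever explicitly closed. A lazier sketch (my own, not from the question) is a small generator that holds at most one file open at a time, each closed by its own `with`; the stdlib `fileinput.input(files=...)` behaves similarly:

```python
import os
import tempfile

def cat_files(file_names):
    # Open one file at a time; the with-block closes it before the next opens.
    for name in file_names:
        with open(name, "rb") as f:
            for line in f:
                yield line

# Demo with two throwaway files (hypothetical contents).
tmpdir = tempfile.mkdtemp()
paths = []
for i, text in enumerate([b"x\n", b"y\nz\n"]):
    path = os.path.join(tmpdir, "log.%d" % i)
    with open(path, "wb") as f:
        f.write(text)
    paths.append(path)

result = list(cat_files(paths))
print(result)  # [b'x\n', b'y\n', b'z\n']
```

One caveat: if the consumer abandons the generator midway, the currently open file is only closed when the generator is garbage-collected or explicitly `.close()`d.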
Or is this the kind of thing Async IO (PEP 3156) is meant for, and I'll just have to wait for more elegant syntax later?