Here's the issue I'm running into:
Error: iterator should return strings, not bytes (did you open the file in text mode?)
The code that's causing this looks something like:
    import csv
    import tarfile

    t = tarfile.open(filename)
    for fileinfo in t:
        f = t.extractfile(fileinfo)  # binary stream, yields bytes
        reader = csv.DictReader(f)
        reader.fieldnames  # raises the error above
The trouble seems to be that extractfile() returns an io.BufferedReader, a binary file-like object that yields bytes and has no text-mode interface, whereas csv.DictReader expects an iterator of strings.
What would be a good way to handle this?
I'm thinking of decoding the bytes from the reader into text, but I need to keep it streaming because these files are very large. The codebase is Python 3.6 running on Docker/Linux.
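A minimal sketch of that decoding idea, assuming the archive members are UTF-8 encoded CSV files: io.TextIOWrapper sits on top of the binary stream and decodes it incrementally, so nothing is loaded into memory up front. The function name iter_csv_rows and the default encoding are placeholders for illustration, not something taken from the code above.

```python
import csv
import io
import tarfile


def iter_csv_rows(tar_path, encoding="utf-8"):
    """Yield (member_name, row_dict) for every CSV row in the archive.

    Hypothetical helper; the encoding default is an assumption.
    """
    with tarfile.open(tar_path) as archive:
        for member in archive:
            binary_stream = archive.extractfile(member)
            if binary_stream is None:
                # extractfile() returns None for members that are not
                # regular files or links (for example, directories).
                continue
            # Decode lazily; newline="" lets the csv module handle
            # line endings itself, as the csv docs recommend.
            text_stream = io.TextIOWrapper(
                binary_stream, encoding=encoding, newline=""
            )
            for row in csv.DictReader(text_stream):
                yield member.name, row
```

Because this is a generator, rows are produced one at a time while the archive is read, which should keep memory use flat for large files. If the files are not UTF-8, the encoding argument would need to change accordingly.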