The question "Create a zip file from a generator in Python?" describes a solution for writing a .zip to disk from a bunch of files.

I have a similar problem in the opposite direction. I am being given a generator:

stream = attachment.iter_bytes()
print type(stream)

and I would love to pipe it into a file-like object that tarfile can read as a gzipped tar archive:

b = io.BytesIO(stream)
f = tarfile.open(mode='r:gz', fileobj = b)
f.list()

But I can't:

<type 'generator'>
Error: 'generator' does not have the buffer interface

I can solve this in the shell like so:

$ curl --options http://URL | tar zxf - ./path/to/interesting_file

How can I do the same in Python under the given conditions?

Virgil Gheorghiu

1 Answer

I had to wrap the generator in a file-like object built on top of the io module.

import io

def generator_to_stream(generator, buffer_size=io.DEFAULT_BUFFER_SIZE):
    class GeneratorStream(io.RawIOBase):
        def __init__(self):
            self.leftover = b''

        def readable(self):
            return True

        def readinto(self, b):
            try:
                l = len(b)  # We're supposed to return at most this much
                chunk = self.leftover
                # Skip empty chunks: returning 0 would be read as EOF
                while not chunk:
                    chunk = next(generator)
                output, self.leftover = chunk[:l], chunk[l:]
                b[:len(output)] = output
                return len(output)
            except StopIteration:
                return 0  # Indicate EOF
    # Pass buffer_size through; the original accepted it but never used it
    return io.BufferedReader(GeneratorStream(), buffer_size=buffer_size)

With this, you can open the tar file and extract its content.

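import tarfile

# any_stream is your generator of bytes, e.g. attachment.iter_bytes()
stream = generator_to_stream(any_stream)
tar_file = tarfile.open(fileobj=stream, mode='r|*')
# Do whatever you want with the tar_file now

for member in tar_file:
    # extractfile() returns None for members that are not regular files
    member_file = tar_file.extractfile(member)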
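
To try the approach end to end without a network connection, here is a minimal sketch that builds a small gzipped tar archive in memory, serves it through a generator in tiny chunks, and reads one member back. The helpers `make_sample_archive`, `chunked` and `read_member`, the chunk size, and the payload are hypothetical stand-ins for `attachment.iter_bytes()` and the real archive. The wrapper is a compact restatement of the idea above, written as a top-level class, and it skips empty chunks so that they are not mistaken for EOF.

```python
import io
import tarfile


class GeneratorStream(io.RawIOBase):
    """Read-only raw stream that pulls bytes from an iterator of chunks."""

    def __init__(self, chunks):
        super().__init__()
        self._chunks = iter(chunks)
        self._leftover = b""

    def readable(self):
        return True

    def readinto(self, b):
        # Skip empty chunks: returning 0 here would be read as EOF.
        while not self._leftover:
            try:
                self._leftover = next(self._chunks)
            except StopIteration:
                return 0  # genuine end of data
        n = min(len(b), len(self._leftover))
        b[:n] = self._leftover[:n]
        self._leftover = self._leftover[n:]
        return n


def make_sample_archive():
    """Hypothetical helper: return the bytes of a tiny .tar.gz archive."""
    payload = b"hello from the archive\n"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        info = tarfile.TarInfo(name="path/to/interesting_file")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def chunked(data, size):
    """Hypothetical stand-in for attachment.iter_bytes()."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


def read_member(chunks, wanted):
    """Return the contents of `wanted`, or None if it is not found."""
    stream = io.BufferedReader(GeneratorStream(chunks))
    # 'r|*' reads the archive strictly front to back, so no seek is needed.
    with tarfile.open(fileobj=stream, mode="r|*") as tar:
        for member in tar:
            if member.name == wanted:
                extracted = tar.extractfile(member)
                # extractfile() returns None for non-regular members.
                return extracted.read() if extracted is not None else None
    return None


if __name__ == "__main__":
    chunks = chunked(make_sample_archive(), 7)
    print(read_member(chunks, "path/to/interesting_file"))
```

In stream mode each member's data has to be read before the loop moves on to the next member, which is why `read_member` reads inside the loop rather than collecting the file objects for later.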
Roberto Soares
    Thanks Roberto! It is important to underline that you used the mode `'r|*'` and not `'r:*'` in tarfile.open(); otherwise you'd get an "io.UnsupportedOperation: seek" exception. – Martin Sep 07 '18 at 20:46
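
The difference between the two modes can be reproduced with a small sketch. `ForwardOnly`, `sample_archive` and `try_open` are hypothetical names made up for this illustration: the class is a non-seekable stream over in-memory bytes, playing the role of the generator-backed wrapper. As far as I can tell, `'r:*'` asks the file object for its position and may seek back while probing compression formats, which a forward-only stream cannot do, whereas `'r|*'` only ever reads forward. The exact exception type may vary between Python versions, so the sketch returns whatever was raised instead of assuming one.

```python
import io
import tarfile


class ForwardOnly(io.RawIOBase):
    """Hypothetical non-seekable stream over an in-memory bytes object."""

    def __init__(self, data):
        super().__init__()
        self._view = memoryview(data)

    def readable(self):
        return True

    def readinto(self, b):
        n = min(len(b), len(self._view))
        b[:n] = self._view[:n]
        self._view = self._view[n:]
        return n  # 0 once the data is exhausted, which signals EOF


def sample_archive():
    """Return the bytes of a tiny .tar.gz archive with a single member."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        info = tarfile.TarInfo(name="a.txt")
        info.size = 3
        tar.addfile(info, io.BytesIO(b"abc"))
    return buf.getvalue()


def try_open(mode):
    """Return the member names, or the exception raised along the way."""
    stream = io.BufferedReader(ForwardOnly(sample_archive()))
    try:
        with tarfile.open(fileobj=stream, mode=mode) as tar:
            return [member.name for member in tar]
    except (OSError, tarfile.TarError) as exc:
        # io.UnsupportedOperation is a subclass of OSError.
        return exc


if __name__ == "__main__":
    print("r|* ->", try_open("r|*"))
    print("r:* ->", repr(try_open("r:*")))
```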