43

Is it possible to create a TarFile object in memory using a buffer containing the tar data without having to write the TarFile to disk and open it up again? We get the bytes sent over a socket.

Something like this:

import tarfile
byte_array = client.read_bytes()
tar = tarfile.open(byte_array) # how to do this?
# use "tar" as a regular TarFile object
for member in tar.getmembers():
    f = tar.extractfile(member)
    print(f)

Note: one of the reasons for doing this is that we eventually want to do it from multiple threads simultaneously, so a temp file might be overwritten if two threads try to use it at the same time.

Thank you for any and all help!

Sefu

2 Answers

54

`io.BytesIO` from the `io` module does exactly what you need.

import tarfile, io
byte_array = client.read_bytes()
file_like_object = io.BytesIO(byte_array)
tar = tarfile.open(fileobj=file_like_object)
# use "tar" as a regular TarFile object
for member in tar.getmembers():
    f = tar.extractfile(member)
    print(f)
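
For anyone who wants to try this without a socket, here is a minimal round-trip sketch. The helper `make_tar_bytes` is hypothetical and only stands in for `client.read_bytes()` by building an archive in memory; `read_tar_bytes` is likewise just an illustrative name, not part of `tarfile`.

```python
import io
import tarfile


def make_tar_bytes(files):
    """Hypothetical stand-in for client.read_bytes(): build a tar archive in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)  # size must be set before adding the member
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_tar_bytes(byte_array):
    """Return {member name: contents} for the file members of an in-memory tar."""
    contents = {}
    with tarfile.open(fileobj=io.BytesIO(byte_array), mode="r") as tar:
        for member in tar.getmembers():
            f = tar.extractfile(member)  # None for members that are not files or links
            if f is not None:
                contents[member.name] = f.read()
    return contents


byte_array = make_tar_bytes({"hello.txt": b"hello", "data/nums.txt": b"1 2 3"})
result = read_tar_bytes(byte_array)
print(result)
```

Each call to `read_tar_bytes` wraps the bytes in its own `BytesIO`, so nothing is written to disk and no buffer is shared between callers, which should help with the multi-threaded case mentioned in the question.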
decaf
  • 5
    You might have to add the mode, depending on what you're doing (e.g. an archive in an archive): `tarfile.open(fileobj=byte_stream, mode='r:gz')` – Alex Apr 04 '17 at 21:27
  • 2
    ^ Exactly. I had gotten as far as `fileobj=file_like_object`, but I was passing the mode positionally instead of as `mode=`, which isn't valid Python (a positional argument can't follow a keyword argument). The docs show `"filename", "r:gz"`, but if you're reading from memory you need `fileobj=..., mode=...` explicitly! – svenevs May 18 '17 at 21:40
11

Sure, something like this:

import io
import tarfile

io_bytes = io.BytesIO(byte_array)

tar = tarfile.open(fileobj=io_bytes, mode='r')

(Adjust `mode` to fit the format of your tar file, e.g. `mode='r:gz'`.)
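
To illustrate the mode remark, here is a small sketch that reads a gzip-compressed archive from memory. The archive is built locally only so the example is self-contained; the member name `payload.txt` and its contents are made up for illustration. It opens the same bytes once with the explicit `'r:gz'` mode and once with `'r:*'`, which asks `tarfile` to detect the compression itself.

```python
import io
import tarfile

# Build a gzip-compressed tar in memory (stand-in for bytes received elsewhere).
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    data = b"compressed payload"
    info = tarfile.TarInfo(name="payload.txt")
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))
gz_bytes = buf.getvalue()

# Explicit mode: the caller states that the data is gzip-compressed.
with tarfile.open(fileobj=io.BytesIO(gz_bytes), mode="r:gz") as tar:
    explicit = tar.extractfile("payload.txt").read()

# Transparent mode: tarfile works out the compression from the data.
with tarfile.open(fileobj=io.BytesIO(gz_bytes), mode="r:*") as tar:
    detected = tar.extractfile("payload.txt").read()

print(explicit == detected)
```

Note that `fileobj` and `mode` are both passed by keyword here, as the comments on the other answer point out.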

Warren Weckesser