9

Is it possible to create a file directly in a tar archive?
Context: I have a method that produces some content as a string, and I want to save that content as a file in a tar archive. Do I have to create a temporary file first, or can I create the file directly in the archive?

import tarfile

def save_files_to_tar(tarname):
    archive = tarfile.open(tarname, mode='w')
    for _ in range(some_number):
        content = get_content()
        # HERE add content to tar
GoatsWearHats
PKuhn
  • Possible duplicate of [How to create full compressed tar file using Python?](http://stackoverflow.com/questions/2032403/how-to-create-full-compressed-tar-file-using-python) – Will Jun 01 '16 at 11:48
  • Since I don't need a tar.gz file, that post does not apply. – PKuhn Jun 01 '16 at 11:55
  • I do not see this as a duplicate, since the focus here is on avoiding a file that is created only to be moved into the tar archive. I agree with @pacholik that this is not obviously possible after `import tarfile`: to cite the documentation, "A tar archive is a sequence of blocks. An archive member (a stored file) is made up of a header block followed by data blocks." Skimming the API also shows no obvious place where the module accepts an in-memory chunk or variable and stores it directly as an archive member. But as @Ronan suggests, you can create a file-like object via StringIO (or even cStringIO?). – Dilettant Jun 01 '16 at 12:00
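
The block layout quoted from the documentation can be observed directly. The following is a minimal Python 3 sketch, not part of the original thread: it writes one small member into an in-memory archive and slices out the header block and the first data block. The member name `hello.txt` and the variable names are made up for illustration; `tarfile.BLOCKSIZE` is the module's 512-byte block size.

```python
import io
import tarfile

payload = b"hello"

# Build the archive in memory instead of on disk.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as archive:
    info = tarfile.TarInfo(name="hello.txt")  # hypothetical member name
    info.size = len(payload)
    archive.addfile(info, io.BytesIO(payload))

raw = buffer.getvalue()

# First block: the member's header. Second block: its data, NUL-padded.
header = raw[:tarfile.BLOCKSIZE]
data_block = raw[tarfile.BLOCKSIZE:2 * tarfile.BLOCKSIZE]
```

It also shows that `addfile()` takes any file-like object as the data source, so the member never has to exist as a file on disk.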

2 Answers

8

I think you should use StringIO to create an in-memory file-like object, and TarInfo to describe a fake file, like so:

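# Python 2 only: the StringIO module does not exist in Python 3
import StringIO
import tarfile

archive = tarfile.open(tarname, mode='w')
for _ in range(some_number):
    content = get_content()
    s = StringIO.StringIO()
    s.write(content)
    s.seek(0)  # rewind so addfile() reads from the start
    # give each member its own name in real code
    tarinfo = tarfile.TarInfo(name="my filename")
    # getvalue() avoids relying on the internal .buf attribute
    tarinfo.size = len(s.getvalue())
    archive.addfile(tarinfo=tarinfo, fileobj=s)

archive.close()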

Hope this helps.

Ronan
  • Of course you need to place most of that code in your save_files_to_tar function. – Ronan Jun 01 '16 at 11:59
  • May I suggest also mentioning `cStringIO`, since performance may matter (besides elegance) when one wants to avoid a temp file? – Dilettant Jun 01 '16 at 12:02
  • True, cStringIO is used in exactly the same way and performs better. As long as you don't need custom behavior, you should use cStringIO, as @Dilettant said. – Ronan Jun 01 '16 at 15:13
  • I guess in Python 3 it should be something like `s = io.StringIO()`, `tarinfo.size = len(s.getvalue().encode('utf-8'))` (the size has to be in bytes) and `archive.addfile(tarinfo=tarinfo, fileobj=io.BytesIO(s.getvalue().encode('utf-8')))` – v_2e Jan 05 '18 at 10:24
  • For those looking for the equivalent code for images: `im = PIL.Image.fromarray(some_array)`; `s = BytesIO()`; `im.save(s, format="JPEG")`; `s.seek(0)`; `tarinfo = tarfile.TarInfo(name="imagename" + ".jpg")`; `tarinfo.size = len(s.getvalue())`; `archive.addfile(tarinfo=tarinfo, fileobj=s)` – David M. Sousa Feb 12 '20 at 12:06
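
For Python 3, the approach discussed in these comments could be wrapped in a small helper. This is only a sketch: `add_text_member` and the member name `greeting.txt` are hypothetical, not from the thread. The point it illustrates is that `TarInfo.size` has to be the length of the encoded bytes, which differs from the character count as soon as the text contains non-ASCII characters.

```python
import io
import tarfile


def add_text_member(archive, name, text, encoding="utf-8"):
    """Add `text` to an open tar archive as a member called `name`.

    Hypothetical helper: returns the number of bytes stored.
    """
    data = text.encode(encoding)
    info = tarfile.TarInfo(name=name)
    info.size = len(data)  # byte length, not len(text)
    archive.addfile(info, io.BytesIO(data))
    return info.size


# "héllo" has 5 characters but encodes to more than 5 bytes in UTF-8.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as archive:
    written = add_text_member(archive, "greeting.txt", "héllo")
```

If `size` were set from `len(text)` here, the stored member would be truncated, because `addfile()` copies exactly `size` bytes from the file object.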
4

A solution that is Python 2 & 3 compatible using context managers:

from __future__ import unicode_literals

import tarfile
import time
from contextlib import closing
from io import BytesIO

message = 'Lorem ipsum dolor sit amet.'
filename = 'test.txt'

with tarfile.open('test.tar', 'w') as tf:
    with closing(BytesIO(message.encode('utf-8'))) as fobj:
        tarinfo = tarfile.TarInfo(filename)
        tarinfo.size = len(fobj.getvalue())
        tarinfo.mtime = time.time()
        tf.addfile(tarinfo, fileobj=fobj)
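
Applied to the function from the question, the same idea might look like the sketch below (Python 3 only). It is an illustration under assumptions, not the answerer's code: the `files` mapping stands in for the question's `get_content()` loop, and `read_member` plus the demo file names are made up to show that the members can be read back.

```python
import io
import os
import tarfile
import tempfile
import time


def save_files_to_tar(tarname, files):
    """Write each name -> text entry of `files` as a member of `tarname`."""
    with tarfile.open(tarname, mode="w") as archive:
        for name, text in files.items():
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = int(time.time())
            archive.addfile(info, io.BytesIO(data))


def read_member(tarname, name):
    """Return the decoded content of one member (hypothetical helper)."""
    with tarfile.open(tarname, mode="r") as archive:
        return archive.extractfile(name).read().decode("utf-8")


# Demo: only the archive itself touches the disk, never the members.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.tar")
    save_files_to_tar(path, {"a.txt": "first", "b.txt": "second"})
    with tarfile.open(path) as archive:
        names = archive.getnames()
    restored = read_member(path, "b.txt")
```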
Cas