0

I have written a Python package hwrt which comes with a model.tar file. Neither the file itself nor the contents should bother users (they can take a look at it, it's not a secret - but they don't need it).

I need to work with files in that archive. To do so, I extract the contents and work with them:

import tarfile
tar = tarfile.open('model.tar in python package')
tar.extractall()
tar.close()
[... work with those files ...]

This works. However, I see two problems with this approach:

  1. The model.tar get extracted every time.
  2. The contained files get placed in the current working directory which might not be expected by the user.

To solve problem (2), it seems to me that creating a temporary folder in tempfile.gettempdir() would be the cleanest solution. However, I am not sure if that is the place where this should be done.

Are there other archive formats where I can access the contents of the archive directly? Is there a directory where I can place files for my Python package (for all users) after it was installed?

Martin Thoma
  • 124,992
  • 159
  • 614
  • 958
  • Please use the `with` statement for opening files. – Veedrac Dec 05 '14 at 13:14
  • `tempfile.gettempdir` seems fine to me, but why can't you just extract the tarfile in the place where the tarfile *is*? – Veedrac Dec 05 '14 at 13:15
  • ``model.tar`` contents are really temporary? As far as I understand you will access such contents quite often. Am I right? – Fabio Menegazzo Dec 05 '14 at 13:26
  • @Veedrac: I am not sure where Python packages get placed and if the program has the permission to write there. If that is the case, then this would be the answer to my question. – Martin Thoma Dec 05 '14 at 13:30
  • @Menegazzo: Yes, you are correct. `model.tar` does not change. I might access the contents several hundred (or thousands) times per program execution, but at least once. However, I read the contents only once (their size is about ~10MB). – Martin Thoma Dec 05 '14 at 13:32

1 Answers1

0

If the file is only ~10MB, just read it into RAM. Your later thousands of accesses will thank you.

If you want to only unpack the first time, you have several options:

  • Unpack at install time into the source directory

  • Unpack once into a local .cache directory

If you can bear unpacking once per run, use tempfile.gettempdir.

Veedrac
  • 58,273
  • 15
  • 112
  • 169
  • > "If the file is only ~10MB, just read it into RAM. Your later thousands of accesses will thank you." That is what I do. – Martin Thoma Dec 05 '14 at 13:54
  • > "Unpack at install time into the source directory" Is it guaranteed that this is writable? – Martin Thoma Dec 05 '14 at 13:54
  • 1
    @moose This won't allow you to change the directory at runtime but you will be able to write the directory at install time because by definition you already have access to it. – Veedrac Dec 05 '14 at 14:06
  • Where should code be placed that gets executed at install time? – Martin Thoma Dec 05 '14 at 14:09
  • @moose http://stackoverflow.com/questions/1732619/packaging-resources-with-setuptools-distribute, https://pythonhosted.org/setuptools/setuptools.html#including-data-files – Veedrac Dec 05 '14 at 14:32