33

I've written a Python package that includes a bsddb database of pre-computed values for one of the more time-consuming computations. For simplicity, my setup script installs the database file in the same directory as the code which accesses the database (on Unix, something like /usr/lib/python2.5/site-packages/mypackage/).

How do I store the final location of the database file so my code can access it? Right now, I'm using a hack based on the __file__ variable in the module which accesses the database:

dbname = os.path.join(os.path.dirname(__file__), "database.dat")

It works, but it seems... hackish. Is there a better way to do this? I'd like to have the setup script just grab the final installation location from the distutils module and stuff it into a "dbconfig.py" file that gets installed alongside the code that accesses the database.

merwok
Paul
  • Minimal runnable published working example at: https://stackoverflow.com/questions/3596979/manifest-in-ignored-on-python-setup-py-install-no-data-files-installed/60735402#60735402 – Ciro Santilli OurBigBook.com Mar 18 '20 at 07:53

4 Answers

36

Try using pkg_resources, which is part of setuptools (and available on all of the pythons I have access to right now):

>>> import pkg_resources
>>> pkg_resources.resource_filename(__name__, "foo.config")
'foo.config'
>>> pkg_resources.resource_filename('tempfile', "foo.config")
'/usr/lib/python2.4/foo.config'

There's more discussion about using pkg_resources to get resources on the eggs page and the pkg_resources page.

Also note, where possible it's probably advisable to use pkg_resources.resource_stream or pkg_resources.resource_string because if the package is part of an egg, resource_filename will copy the file to a temporary directory.

xgMz
Aaron Maenpaa
  • 3
    This makes your package depend on `setuptools`, which is not part of the standard library. – anatoly techtonik Jan 17 '15 at 10:22
  • 5
    @techtonik Sadly, the two standard ways to access resources are using distutils or setuptools. There is no decent way to access resources in a module using only the standard library. This is partly because it is hard to build a widely distributable module without external libraries. – john_science Sep 10 '15 at 15:44
  • 4
    This is now the right answer, since setuptools comes with pip and recent Pythons have ensurepip. – Bite code Feb 12 '16 at 10:09
  • 4
    Not anymore. Since Python 3.7, the standard way is to use the standard library's `importlib.resources` module. – ankostis Nov 01 '19 at 09:22
20

Use pkgutil.get_data. It’s the cousin of pkg_resources.resource_stream, but in the standard library, and should work with flat filesystem installs as well as zipped packages and other importers.
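As a minimal sketch of that call, the snippet below builds a throwaway package in a temporary directory and reads a data file from it with pkgutil.get_data. The names demo_pkg and database.dat and the file contents are hypothetical stand-ins, chosen only so the example runs anywhere.

```python
import os
import pkgutil
import sys
import tempfile

# Build a throwaway package on disk. "demo_pkg" and "database.dat"
# are hypothetical names used only for this illustration.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "database.dat"), "wb") as f:
    f.write(b"precomputed values")

# Make the package importable, as it would be after installation.
sys.path.insert(0, root)

# get_data imports the package and asks its loader for the file's bytes.
# The resource path is relative to the package directory.
db_bytes = pkgutil.get_data("demo_pkg", "database.dat")
print(len(db_bytes), "bytes read")
```

Note that get_data returns the file's contents as bytes, not a path. A library that needs a real filename to open a database would probably need those bytes written out to a temporary file first.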

merwok
  • 1
    I strongly prefer this over using pkg_resources, since pkgutil is always available, and because pkg_resources seems to trigger unpacking a zipped egg in the $PYTHON_EGG_CACHE directory, which kind of defeats the purpose of having zipped eggs... – Kenneth Hoste Sep 21 '15 at 20:15
3

Using __file__ is probably the way to do it, short of resorting to something more advanced like using setuptools to install the files where they belong.

Note that this approach has a problem: on OSes with a real security framework (UNIXes, etc.), the user running your script might not have permission to access the DB in the system directory where it gets installed.
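One way to cope with this, sketched here under assumptions, is to check the file's permissions before opening it and fall back to read-only mode when the current user cannot write. The helper name db_open_flag is hypothetical, and the "r"/"w" flags follow the convention used by dbm-style open functions; adapt them to whatever your database library expects.

```python
import os
import tempfile


def db_open_flag(path):
    """Return "w" if the file is readable and writable, "r" if read-only.

    Hypothetical helper: raises if the file is missing or unreadable.
    """
    if not os.path.isfile(path):
        raise FileNotFoundError(path)
    if not os.access(path, os.R_OK):
        raise PermissionError("cannot read database file: %s" % path)
    return "w" if os.access(path, os.W_OK) else "r"


# Usage with a temporary file standing in for the installed database.
with tempfile.NamedTemporaryFile(suffix=".dat", delete=False) as tmp:
    tmp.write(b"precomputed values")
flag = db_open_flag(tmp.name)
os.remove(tmp.name)
print(flag)
```

A pre-computed lookup table usually only needs read access, so opening it read-only whenever possible avoids most of the problem.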

dguaraglia
3

Use the importlib.resources module, part of the standard library since Python 3.7, which is more efficient than setuptools' pkg_resources (on earlier Python versions, use the backported importlib_resources library).

Attention: for this to work, the folder where the data file resides must be a regular Python package. That means you must add an __init__.py file to it, if one is not already there.

Then you can access it like this:

try:
    import importlib.resources as importlib_resources
except ImportError:
    # On Python < 3.7, fall back to the backported `importlib_resources`.
    import importlib_resources

# Name of the package that holds the data file. It must be an importable
# package name; a relative name such as '.' is not accepted.
# The imported package object could be passed instead, with something like:
#     from XXX import YYY as data_pkg
data_pkg = 'mypackage'
fname = 'database.dat'

# Read the whole file into memory as bytes.
db_bytes = importlib_resources.read_binary(data_pkg, fname)
# Or, if a file-like stream is needed:
with importlib_resources.open_binary(data_pkg, fname) as db_file:
    ...
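On Python 3.9 and later, importlib.resources also provides files() and as_file(). The sketch below shows them reading the same kind of data file and, where a real filesystem path is needed, obtaining one. The package name demo_data_pkg and the file contents are hypothetical, created in a temporary directory only so the example is self-contained.

```python
import importlib.resources
import os
import sys
import tempfile

# Build a throwaway package on disk. "demo_data_pkg" is a hypothetical
# name standing in for the package that ships the data file.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_data_pkg")
os.mkdir(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "database.dat"), "wb") as f:
    f.write(b"precomputed values")
sys.path.insert(0, root)

# files() returns a path-like handle to the package's resources.
resource = importlib.resources.files("demo_data_pkg").joinpath("database.dat")
db_bytes = resource.read_bytes()

# as_file() yields a real filesystem path, extracting the resource to a
# temporary file first if the package is not stored as plain files.
with importlib.resources.as_file(resource) as db_path:
    size_on_disk = os.path.getsize(db_path)

print(size_on_disk, "bytes on disk")
```

The as_file() form is probably the closest fit for a database library that insists on a filename rather than a byte string or stream.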
ankostis