
The question is an attempt to get exact instructions on how to do that. There have been a few attempts before, which don't seem to be full solutions:

  • solution to move the file inside the package
  • solution to read as zip
  • accessing meta info via get_distribution

The task at hand is to read information about the egg the program is running from. As I understand it, there are a few ways:

  1. Hard-code the location of the egg and treat it as a zip archive. This works, but it is not flexible enough: the path has to be edited and the program recompiled whenever the file is moved to another location.

  2. Use ResourceManager().resource_filename(__name__, filename). This seems limited in that I cannot access a file that is inside the egg but outside the package. Paths like "../../EGG-INFO/PKG-INFO" in filename don't work and raise a KeyError, so this is no good either.

  3. Use dist = pkg_resources.get_distribution("dist_name") and then query the dist object for information. However, I cannot work out from the docs how I should specify my distribution name; it can't find it.

So I'm looking for the correct way to use pkg_resources.get_distribution, and it would be nice to finally have a full solution for reading any file from inside the egg.

Thanks!

Eugene Sajine

2 Answers


Setuptools/distribute/pkg_resources is designed to be a sort of transparent overlay to standard Python distutils, which are pretty limited and don't allow a good way of distributing code.

Eggs are just a way of putting together a bunch of Python files, data files, and metadata, somewhat similar to Java JARs. However, Python packages can be installed from source even without an egg (which is a concept that does not exist in the standard distribution).

So there are two scenarios here. In the first, you're a programmer who is trying to use some file inside a library. In that case, in order to read any file from your distribution, you don't need its full path; you just need an open file object with its content, right? So you should do something like this:

```python
from pkg_resources import resource_stream, Requirement

# Open a file from the "restez" distribution as a readable stream.
resource_stream(Requirement.parse("restez==0.3.2"), "restez/httpconn.py")
```

That will return an open, readable file object for the file you requested from your package distribution. If it's a zipped egg, it will be extracted automatically.

Please note that the resource path must include the package directory (restez), because the distribution name may be different from the package name (e.g. the Twisted distribution provides the twisted package). Requirements parsing uses this syntax: http://setuptools.readthedocs.io/en/latest/pkg_resources.html#requirements-parsing

This should suffice - you shouldn't need to know the path of the egg once you know how to fetch files from inside the egg.

If you really want the full path and you're sure your egg is uncompressed, use resource_filename instead of resource_stream.
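As a rough end-to-end sketch of the approach above, the snippet below builds a throwaway zipped egg and reads a file from its root through a Requirement. The distribution name `demodist`, the helper names, and the README text are all made up for illustration. Registering the egg with `working_set.add_entry` is only needed because the demo egg is created at run time; an egg that is already on `sys.path` at startup should not need it. The import is guarded because `pkg_resources` ships with setuptools and may not be installed.

```python
import os
import tempfile
import zipfile

try:
    import pkg_resources
except ImportError:  # setuptools (which provides pkg_resources) is missing
    pkg_resources = None


def build_demo_egg(directory):
    """Create a tiny zipped egg for the hypothetical 'demodist' 0.1."""
    egg_path = os.path.join(directory, "demodist-0.1.egg")
    with zipfile.ZipFile(egg_path, "w") as archive:
        # Minimal metadata so pkg_resources recognises the archive as an egg.
        archive.writestr(
            "EGG-INFO/PKG-INFO",
            "Metadata-Version: 1.0\nName: demodist\nVersion: 0.1\n",
        )
        archive.writestr("demodist/__init__.py", "")
        # A file in the egg root, outside the package directory.
        archive.writestr("README.txt", "hello from the egg root\n")
    return egg_path


def read_from_distribution(dist_name, resource_name):
    """Return a resource's bytes, or None when pkg_resources is unavailable."""
    if pkg_resources is None:
        return None
    # The version pin is optional; a bare distribution name is enough here.
    requirement = pkg_resources.Requirement.parse(dist_name)
    return pkg_resources.resource_string(requirement, resource_name)


egg_path = build_demo_egg(tempfile.mkdtemp())
version = None
if pkg_resources is not None:
    # Demo only: make the freshly built egg known to the working set.
    pkg_resources.working_set.add_entry(egg_path)
    version = pkg_resources.get_distribution("demodist").version

readme = read_from_distribution("demodist", "README.txt")
print(version, readme)
```

The same distribution name is what `get_distribution` expects, which is why the sketch can look up the version with it as well.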

Otherwise, in the second scenario, you're building a "packaging tool" and need to access the contents of your package, be it an egg or anything else. Then you'll have to do that yourself by hand, just like pkg_resources does (pkg_resources source). There's no precise API for "querying an egg's content" because there's no use case for that. If you're a programmer just using a library, use pkg_resources just like I suggested. If you're building a packaging tool, you should know where to put your hands, and that's it.
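For the "by hand" route, a minimal sketch using only the standard library could look like the following. It assumes the egg is a zipped egg whose metadata lives in `EGG-INFO/PKG-INFO`, which uses an RFC 822 style header format and can therefore be parsed with `email.parser`. The helper names and the demo egg are hypothetical.

```python
import email.parser
import os
import tempfile
import zipfile


def read_egg_member(egg_path, member):
    """Return the raw bytes of one file stored inside a zipped egg."""
    with zipfile.ZipFile(egg_path) as archive:
        return archive.read(member)


def read_pkg_info(egg_path):
    """Parse EGG-INFO/PKG-INFO into a message object with header access."""
    raw = read_egg_member(egg_path, "EGG-INFO/PKG-INFO").decode("utf-8")
    return email.parser.Parser().parsestr(raw)


# Build a throwaway egg so the sketch runs on its own.
demo_dir = tempfile.mkdtemp()
demo_egg = os.path.join(demo_dir, "demodist-0.1.egg")
with zipfile.ZipFile(demo_egg, "w") as archive:
    archive.writestr(
        "EGG-INFO/PKG-INFO",
        "Metadata-Version: 1.0\nName: demodist\nVersion: 0.1\n",
    )
    archive.writestr("README.txt", "hello from the egg root\n")

info = read_pkg_info(demo_egg)
print(info["Name"], info["Version"])
print(read_egg_member(demo_egg, "README.txt").decode("utf-8"))
```

This only works for zipped eggs; an unpacked egg directory or a plain source install would need ordinary file access instead, which is the packaging dependence the answer warns about.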

Alan Franzoni
  • What is the rule of composing the name like "restez==0.3.2"? If i have an egg my_program_0.9.egg should i say "my_program==0.9"?? What if i don't have a version number in the file name? – Eugene Sajine Oct 29 '12 at 18:27
  • @EugeneSajine http://packages.python.org/distribute/pkg_resources.html#requirements-parsing. The version number is not in the filename, it's in the distribution metadata - what you've specified in setup.py. – Alan Franzoni Oct 29 '12 at 22:53
  • Are you saying that this way the file of interest does not have to be inside the package? Because p2 in my question suggests a similar approach but has exactly this limitation. Sorry, I cannot check myself right now – Eugene Sajine Oct 30 '12 at 15:33
  • @EugeneSajine if you want to access PKG-INFO data, use the `pkginfo` library. I think you're making too many assumptions on implementation details of eggs - their system is designed to be even more transparent than I said to the programmer, you should *not* need to know whether a distribution is within a zipped egg, an uncompressed egg, or manually installed via bdist; an egg is just a way to deliver such files together, but the programmer is not supposed to precisely know *how* a library he's using is installed - that may change at any time. – Alan Franzoni Oct 31 '12 at 07:27
  • Seems like my question is probably not clear enough. All I want to do is for example to read a file that is inside an egg but not inside the package. For example inside the egg I have README.txt in the root of the egg. Not in package, not in PKG-INFO. How to read it from inside the same egg? Any approach similar to reading from java jars? – Eugene Sajine Oct 31 '12 at 15:32
  • @EugeneSajine `resource_stream(Requirement.parse("restez==0.3.2"), "README.txt")` but this is packaging-dependent. It might not work with non-eggs. – Alan Franzoni Nov 01 '12 at 12:52
  • 2 notes: 1. I can't import resource_stream directly, but rather use ResourceManager().resource_stream() 2. It seems to be not necessary to specify the version number in the Requirement, it works without that. I appreciate your time and effort, thanks a lot! – Eugene Sajine Nov 01 '12 at 19:22
  • Great explanation. However, it doesn't really work if you are using dependency library which requires a full path to a resource. In that case the only thing you can do is to use `resource_filename` which doesn't work if the egg is zipped – Emiliano Mar 11 '14 at 17:13

The zipimporter used to load a module can be accessed using the __loader__ attribute on the module, so accessing a file within the egg should be as simple as:

```python
__loader__.get_data('path/within/the/egg')
```
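A fuller sketch of this idea, using only the standard library: it builds a small zip archive named like an egg, puts it on `sys.path`, imports a package from it, and then asks that package's loader for a file stored in the archive root. The package name `demo_pkg`, the archive name, and the helper names are made up for illustration, and the approach assumes the module really was loaded by a zipimporter.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def build_demo_archive(directory):
    """Create a zip archive holding a package plus a README in its root."""
    egg_path = os.path.join(directory, "demo_pkg-0.1.egg")
    with zipfile.ZipFile(egg_path, "w") as archive:
        archive.writestr("demo_pkg/__init__.py", "")
        archive.writestr("README.txt", "hello from the egg root\n")
    return egg_path


def read_from_own_archive(module, path_in_archive):
    """Read a file from the archive that `module` was imported from."""
    # For modules imported from a zip file the loader is a zipimporter,
    # whose get_data() accepts a path relative to the archive root.
    return module.__loader__.get_data(path_in_archive)


archive_path = build_demo_archive(tempfile.mkdtemp())
sys.path.insert(0, archive_path)
importlib.invalidate_caches()

demo_pkg = importlib.import_module("demo_pkg")
loader_readme = read_from_own_archive(demo_pkg, "README.txt")
print(loader_readme.decode("utf-8"))
```

Inside the package itself, the equivalent call would be the bare `__loader__.get_data('README.txt')` shown above. If the code was not imported from a zip file, the loader may treat the path differently, so this is best seen as a zipped-egg-only technique.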
mata
  • The usage here is not clear to me, could you please elaborate and maybe provide a fuller example? Plus it seems Alan has a good point about an egg not necessarily being a zip – Eugene Sajine Oct 29 '12 at 18:29
  • yeah, it's true, this isn't really about accessing files in eggs in general, but in zipped eggs, which are really just zipfiles you can put into your pythonpath. This is independent of setuptools/pkg_resources, which just offers a different way of working with eggs, but isn't necessarily needed to do so. – mata Oct 29 '12 at 19:06