7

Suppose I have the dict of a module (via vars(mod), or mod.__dict__, or globals()), e.g.:

import mod

d = vars(mod)

Given the dict d, how can I get back the module mod? I.e. I want to write a function get_mod_from_dict(d), which returns the module if the dict belongs to a module, or None:

>>> get_mod_from_dict(d)
<module 'mod'>

If get_mod_from_dict returns a module, the following must hold:

mod = get_mod_from_dict(d)
assert mod is None or mod.__dict__ is d

I actually can implement it like this:

import sys

def get_mod_from_dict(d):
    mods = {id(mod.__dict__): mod for (modname, mod) in sys.modules.items()
                                  if mod and modname != "__main__"}
    return mods.get(id(d), None)

However, iterating through all of sys.modules seems inefficient to me.

Is there a better way?


Why do I need this?

  • In some cases, you only get access to the dict, e.g. in stack frames. Then, depending on what you want to do (maybe just inspection or debugging), it is helpful to get back the module.

  • I wrote an extension to Pickler which can pickle methods, functions, etc. Some of these hold references to the module, or to the module dict. Wherever I encounter a dict which belongs to a module during pickling, I don't want to pickle the dict, but instead a reference to the module.
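To illustrate the second point, here is a minimal sketch (json is just an arbitrary standard-library module picked for the example, not something from my actual code): a plain function carries its module's dict in `__globals__`, but no direct reference to the module object itself.

```python
import json

# A function defined at module level keeps a reference to the dict of the
# module it was defined in (as __globals__), not to the module object.
func_globals = json.dumps.__globals__

print(func_globals is vars(json))   # compares identity with the module dict
print(func_globals["__name__"])     # the name stored inside that dict
```

So when walking such objects, the dict is what you get first, and the module has to be recovered from it.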

Albert

4 Answers

5

Every module has a __name__ attribute that uniquely identifies the module in the import system:

>>> import os
>>> os.__name__
'os'
>>> vars(os)['__name__']
'os'

Imported modules are also cached in sys.modules, which is a dict mapping module names to module instances. You can simply look up the module's name there:

import sys

def get_mod_from_dict(module_dict):
    module_name = module_dict['__name__']
    return sys.modules.get(module_name)

Some people have expressed concern that this might not work for (sub-)modules in packages, but it does:

>>> import urllib.request
>>> get_mod_from_dict(vars(urllib.request))
<module 'urllib.request' from '/usr/lib/python3.7/urllib/request.py'>

There is a very minor caveat, though: this only works for modules that have been properly imported and cached by the import machinery. If a module has been imported with tricks like those in "How to import a module given the full path?", it might not be cached in sys.modules, and your function might then unexpectedly return None.
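If the guarantee `mod.__dict__ is d` from the question matters, the lookup can be hardened a little. The following is only a sketch under a few assumptions: the name get_mod_from_dict_checked is made up for illustration, and the input is assumed to be a plain dict. It returns None when the dict has no usable `__name__`, or when the module cached under that name does not actually own the dict.

```python
import sys
import types


def get_mod_from_dict_checked(module_dict):
    """Hypothetical variant: name lookup plus an identity check."""
    name = module_dict.get("__name__")
    if not isinstance(name, str):
        # Not a module dict (or one whose __name__ was removed).
        return None
    mod = sys.modules.get(name)
    # Only accept the cached entry if it really owns this exact dict.
    if isinstance(mod, types.ModuleType) and vars(mod) is module_dict:
        return mod
    return None


import os

print(get_mod_from_dict_checked(vars(os)) is os)
# An unrelated dict that merely carries the same name is rejected.
print(get_mod_from_dict_checked({"__name__": "os"}))
```

The extra identity check costs almost nothing, and it keeps the assert from the question valid even for dicts that merely look like module dicts.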

Aran-Fey
  • You mean "not cached in `sys.modules`", right? In this case, is there no good way to get the module at all? – Albert May 17 '19 at 08:03
  • @Albert Just curious, in what case do you want to apply this? – knh190 May 17 '19 at 08:05
  • @Albert Oops, typo. But no, I don't think there's a good way to find a module that's not cached in `sys.modules`. The dict doesn't hold a reference to its module, so I *think* the only way to do it would be to iterate over *all* objects in memory with [`gc.get_objects`](https://docs.python.org/3/library/gc.html#gc.get_objects) and check if any of them are the module with your dict. Definitely not a clean or pretty solution. – Aran-Fey May 17 '19 at 08:06
  • @knh190: I added some example usages to my question. – Albert May 17 '19 at 08:11
  • Yes, I also thought about `gc`. Actually `gc.get_referrers` is probably better. I posted that as another answer. – Albert May 17 '19 at 08:40
4

You can use importlib.import_module to import a module given its name. Example for numpy:


In [77]: import numpy
    ...: import importlib

In [78]: d = vars(numpy)

In [79]: np = importlib.import_module(d['__name__'])

In [80]: np.array([1,2,3])
Out[80]: array([1, 2, 3])
Devesh Kumar Singh
  • I can't believe I did not think about this! Since the module's dict holds the module's name, I think this is definitely the right way of doing this. – olinox14 May 16 '19 at 15:38
  • I don't understand how this is related to my question. I don't have the name, I have the dict. – Albert May 16 '19 at 15:51
  • So, you also suggest to use `d['__name__']`, as in the comments. Why do you need `importlib` at all here? Would `sys.modules[d['__name__']]` not be a better solution? Also, I must have `np.__dict__ is d`. That is my question. With `importlib`, if the module is not in `sys.modules`, it would probably import it again, which is not what I want. I never want to import it again. I want to get the existing instance, if there is one, such that `mod.__dict__ is d`, if such a `mod` exists. – Albert May 17 '19 at 06:45
  • I think the question is very exact. I even provided an example implementation. I want exactly such a function `get_mod_from_dict`. Exactly like in my question. `d` there is a dict. I must have that `mod.__dict__ is d`. I don't quite understand what is not clear about that. (Btw, the reverse function is trivial: `get_dict_from_mod(mod): return mod.__dict__`. So, try that with any module.) – Albert May 17 '19 at 07:08
  • When I say, I must have that `mod.__dict__ is d`, that means, after `mod = get_mod_from_dict(d)`, the following assert will never fail: `assert mod is None or mod.__dict__ is d` – Albert May 17 '19 at 07:10
  • @Albert you should edit the additional details, such as the example implementation, into the question itself. – Paritosh Singh May 17 '19 at 07:24
  • @ParitoshSingh: But the example implementation is already there, since the very beginning? – Albert May 17 '19 at 07:45
  • Sorry, I wasn't clear. You added quite a lot of things in the comments for clarity, including the "reverse function". All of those belong in the question, including that you do not wish to import anything again. @Albert Also, you have your answer already; clearly you've pieced it together from this answer. I do not quite understand why you want someone else to give you the answer now. Why don't you just self-answer if you're not completely satisfied? A key lookup on sys.modules seems to solve your goals just fine. – Paritosh Singh May 17 '19 at 07:47
  • @ParitoshSingh This is getting a bit meta now :P I thought it's nicer not to self-answer on SO if others are giving the answer (partly, or via comments). In these comments I was trying to point out why I think `importlib` is not a good solution for my question. Also, the reverse function was already in the question from the very beginning, btw (`vars(mod)`). I still don't see what is missing in the question, or what is unclear. – Albert May 17 '19 at 07:56
  • Also, btw, yes, `sys.modules[d['__name__']]` so far seems the best solution. But I don't really know if this always works correctly (this was mentioned in the comment already). So that is why I hoped someone could write an answer providing details about that (i.e. pointing out any shortcomings). I guess that is why it's also only a comment and not a full answer. – Albert May 17 '19 at 07:59
  • Thanks anyway, this was still very helpful! And I guess it also helped to provide the final answer. – Albert May 17 '19 at 08:05
1

For completeness, another solution, via the gc module:

import gc
import types

def get_mod_from_dict_3(d):
  """
  :param dict[str] d:
  :rtype: types.ModuleType|None
  """
  objects = gc.get_referrers(d)
  for obj in objects:
    if isinstance(obj, types.ModuleType) and vars(obj) is d:
      return obj
  return None

Using gc might be Python interpreter dependent, though. Not all Python interpreters might have a GC. And even if they have one, I'm not sure it is guaranteed that the module has a reference to its dict (although, very likely, it does; I cannot really think of a good reason why it would not).

So, I think the other solution via sys.modules[d['__name__']] is probably better.

That said, I checked CPython and PyPy, and this solution works in both. It is also more generic: without the check for ModuleType, it works for any arbitrary object.

Still, thinking about different Python interpreters, I could imagine one where vars(mod) never returns the same dict but instead creates a new dict on the fly. Then such a function could not be implemented at all. Not sure.
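The two approaches can also be combined. The following is only a sketch (the name get_mod_from_dict_combined is made up for illustration, and it assumes an interpreter where gc.get_referrers reports module objects, as I observed on CPython): try the cheap sys.modules lookup first, and fall back to gc only when that fails, e.g. for module objects that were never registered in sys.modules.

```python
import gc
import sys
import types


def get_mod_from_dict_combined(d):
    """Hypothetical helper: fast sys.modules lookup, then a gc fallback."""
    # Fast path: look the module up by the name stored in its dict.
    name = d.get("__name__")
    mod = sys.modules.get(name) if isinstance(name, str) else None
    if isinstance(mod, types.ModuleType) and vars(mod) is d:
        return mod
    # Slow path: scan the objects that refer to d. This relies on the
    # interpreter's gc tracking module objects (an assumption).
    for obj in gc.get_referrers(d):
        if isinstance(obj, types.ModuleType) and vars(obj) is d:
            return obj
    return None


# A module object that was never registered in sys.modules.
scratch = types.ModuleType("scratch_mod_not_in_sys_modules")
print(get_mod_from_dict_combined(vars(scratch)) is scratch)
print(get_mod_from_dict_combined(vars(gc)) is gc)
```

The slow path only runs for dicts that the name lookup cannot resolve, so the common case stays a single dict lookup.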

I collected all the given solutions, and some testing code, here.

Albert
0

You could slightly improve your solution by using a generator expression:

import sys

def get_mod_from_dict_2(d):
    return next((mod for modname, mod in sys.modules.items() if mod and modname != "__main__" and mod.__dict__ is d), None)

But that won't help you to avoid the use of sys.modules...

Update: As said in the answer of @Devesh Kumar Singh, you could use the importlib module to retrieve an already imported module by name (or import it if it hasn't been imported already). The module's dictionary holds the module's name and file as long as it is not the '__main__' module. From there, you can do:

import importlib
import some_module

d = vars(some_module)
print(d['__name__']) # >> 'some_module'

m = importlib.import_module(d['__name__'])
print(m)   # >> <module 'some_module' from '/path/to/some_module.py'>
marc_s
olinox14
  • How does that improve over my solution? – Albert May 16 '19 at 15:57
  • Not much, but it avoids storing the whole dict for getting only one var. Plus, the iteration stops whenever the var is found. – olinox14 May 17 '19 at 07:39
  • I can just look up `sys.modules` to get an already imported module by name. `importlib` has the disadvantage that it might import something new, which is not what I want. About your solution, I think writing an explicit `for` loop would be more readable, but maybe that's a matter of taste. But this does not really answer my question, because I was asking if there is a better way which does not need to iterate through `sys.modules`. (It's not so relevant how you iterate through it, whether you create this extra dict, etc.) – Albert May 17 '19 at 07:51
  • OK about the importlib, but I disagree on the subject of the iteration. Using `next` allows the iteration to stop as soon as the condition is fulfilled and to return the value. By instantiating a dict and then getting the value, you read the whole `sys.modules` dictionary once, then get the value from it. That's not much, but the `next` solution is still slightly more performant. – olinox14 May 17 '19 at 08:10
  • I meant relevant w.r.t. the question. Of course you are right. Still, I think writing this via a `for` loop is more readable. – Albert May 17 '19 at 08:12