As far as I understand, an ipython cluster manages a set of persistent namespaces (one per engine). As a result, if a module that is imported by an engine engine_i is modified, killing the main interpreter is not sufficient for that change to be reflected in the namespace of engine_i.
Here's a toy example that illustrates this:
    # main.py
    from ipyparallel import Client
    from TC import test_class  # TC is defined in the next code block

    if __name__ == "__main__":
        cl = Client()
        cl[:].execute("import TC")
        lv = cl.load_balanced_view()
        lv.block = True
        tc = test_class()
        res = lv.map(tc, [12, 45])
        print(res)
with the TC module only consisting of
    # TC.py
    class test_class:
        def __call__(self, y):
            return -1
Here, consider the following execution:

    $ ipcluster start -n <any_number_of_engines> --daemonize
    $ python3 main.py
    [-1, -1]
    $ # open some editor and modify test_class.__call__ so that it returns -2 instead of -1
    $ python3 main.py  # output is as expected, still [-1, -1] instead of [-2, -2]
    [-1, -1]
This is expected as the engines have their own persistent namespaces, and a trivial solution to make sure that changes to TC are included in the engines is simply to kill (e.g. via $ipcluster stop) and restart them again before running the script.
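As far as I can tell, the same behaviour can be reproduced without ipyparallel in any single long-lived interpreter, because an imported module stays cached in sys.modules until it is explicitly reloaded. Here is a minimal sketch of that, assuming a throwaway module called tc_demo (a hypothetical name, written to a temporary directory purely for illustration):

```python
import importlib
import os
import sys
import tempfile

# Assumption for this demo only: skip .pyc files so a stale bytecode cache
# cannot mask the edit made to the source file below.
sys.dont_write_bytecode = True


def write_module(directory, value):
    """(Re)write the hypothetical module tc_demo so that f() returns `value`."""
    with open(os.path.join(directory, "tc_demo.py"), "w") as fh:
        fh.write(f"def f():\n    return {value}\n")


demo_dir = tempfile.mkdtemp()
sys.path.insert(0, demo_dir)

write_module(demo_dir, -1)
import tc_demo
first = tc_demo.f()        # value from the original source

write_module(demo_dir, -2)  # "edit" the module on disk
import tc_demo             # no-op: tc_demo is already in sys.modules
second = tc_demo.f()       # still the old value

importlib.invalidate_caches()
importlib.reload(tc_demo)  # re-executes the module source
third = tc_demo.f()        # now the new value

print(first, second, third)
```

A second plain import does nothing once the module is cached; only an explicit reload (or a fresh process) picks up the edit. On an engine that cache lives for as long as the engine process does, which would explain why restarting works.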
However, killing/restarting engines quickly becomes tedious when you need to modify a module frequently. So far, I've found a few potential solutions, but none of them are really useful:
1. If the modification is made to a module directly imported into the engine's namespace, like TC above:

       cl[:].execute("from imp import reload; import TC; reload(TC)")

   However, this is very limited as it is not recursive (e.g. if TC.test_class.__call__ itself imports another_module and we modify another_module, then this solution won't work).

2. Because of the problem with the previous solution, I tried ipython's deepreload in combination with %autoreload:

       from IPython import get_ipython
       ipython = get_ipython()
       ipython.magic("%reload_ext autoreload")
       ipython.magic("%autoreload 2")
       cl[:].execute("import builtins;from IPython.lib import deepreload;builtins.reload=deepreload.reload;import TC;reload(TC)")

   This doesn't seem to work at all, for reasons that I haven't understood so far.

3. The magic %reset from ipython is supposed to (per the documentation) clear the namespace, but it didn't work on the engine namespaces, including in the toy example given above.

4. I tried to adapt the first answer given here to clean up the engine namespaces. However, it doesn't seem to help with re-importing modified modules.
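To make the "not recursive" limitation of a plain reload concrete, here is a local sketch that does not involve ipyparallel at all. The module names outer_demo and inner_demo and the helper purge_modules are hypothetical, made up for this illustration. It also shows one blunt alternative, dropping the affected entries from sys.modules so that the next import re-executes them; I have not verified that this is sufficient on real engines, and objects created before the purge keep referring to the old code.

```python
import importlib
import os
import sys
import tempfile

# Assumption for this demo only: keep stale .pyc files out of the picture.
sys.dont_write_bytecode = True


def write(directory, name, source):
    """Write a hypothetical demo module `name`.py into `directory`."""
    with open(os.path.join(directory, name + ".py"), "w") as fh:
        fh.write(source)


def purge_modules(names):
    """Drop modules from sys.modules so the next import re-executes them."""
    for name in names:
        sys.modules.pop(name, None)


demo_dir = tempfile.mkdtemp()
sys.path.insert(0, demo_dir)

write(demo_dir, "inner_demo", "VALUE = -1\n")
write(
    demo_dir,
    "outer_demo",
    "import inner_demo\n\ndef f():\n    return inner_demo.VALUE\n",
)

import outer_demo
before = outer_demo.f()

# Modify only the nested dependency, then reload only the outer module.
write(demo_dir, "inner_demo", "VALUE = -2\n")
importlib.reload(outer_demo)
shallow = outer_demo.f()   # inner_demo is still the cached, old module

# Drop both modules from the cache and import again from scratch.
purge_modules(["outer_demo", "inner_demo"])
importlib.invalidate_caches()
import outer_demo
purged = outer_demo.f()    # inner_demo was re-executed from disk

print(before, shallow, purged)
```

Reloading outer_demo re-runs its import statement, but that import is satisfied from sys.modules, so the edited inner_demo is never re-read. The purge only helps if every module in the dependency chain is listed, which is the part that becomes awkward for a real package.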
It seems to me that the most reliable solution is therefore to just kill/restart the engines each time. It looks like this can't even be done from the script, as cl.shutdown(restart=True) throws NotImplementedError. Is everyone working with ipyparallel constantly restarting their clusters manually, or is there something obvious that I'm missing?