
In the file ai2_kit/domain.py:

def fun(ctx):
    def in_add(a, b):
        print(a + b)

    ctx.executor.run_python_fn(in_add)(1, 2)   # this works
    ctx.executor.run_python_fn(out_add)(1, 2)  # this fails with: ModuleNotFoundError: No module named 'ai2_kit'

def out_add(a, b):
    print(a + b)

The method run_python_fn is defined in ai2_kit/executor.py. The basic idea is to use python -c to execute a Python script on the remote machine.

    def run_python_script(self, script: str, python_cmd=None):
        # Fall back to plain `python` when no interpreter is given.
        python_cmd = python_cmd or 'python'
        return self.connector.run('{} -c {}'.format(python_cmd, shlex.quote(script)))

    def run_python_fn(self, fn: T, python_cmd=None) -> T:
        def remote_fn(*args, **kwargs):
            # Serialize the whole call (function plus arguments) as base64 text.
            dumped_fn = base64.b64encode(cloudpickle.dumps(lambda: fn(*args, **kwargs), protocol=pickle.DEFAULT_PROTOCOL))
            script = '''import base64,pickle; pickle.loads(base64.b64decode({}))()'''.format(repr(dumped_fn))
            return self.run_python_script(script=script, python_cmd=python_cmd)
        return remote_fn
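
The `python -c` round trip can be tried locally without a remote machine. The following is a minimal sketch, not the ai2_kit implementation: it assumes the standard-library `pickle` in place of `cloudpickle`, `subprocess` in place of the connector, and a hypothetical helper named `run_in_fresh_python`.

```python
import base64
import functools
import operator
import pickle
import subprocess
import sys


def run_in_fresh_python(call):
    """Pickle a zero-argument callable, run it in a new interpreter, return stdout.

    Hypothetical helper: a local stand-in for sending the script over a connector.
    """
    payload = base64.b64encode(pickle.dumps(call))
    # repr() of a bytes object is a literal such as b'gASV...', valid as Python source.
    script = "import base64, pickle; print(pickle.loads(base64.b64decode({}))())".format(repr(payload))
    # An argument list bypasses the shell, so shlex.quote is not needed here.
    result = subprocess.run(
        [sys.executable, "-c", script], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


# operator.add is importable in any interpreter, so pickling it by reference is safe.
output = run_in_fresh_python(functools.partial(operator.add, 1, 2))
print(output)
```

The callable here only references `operator.add`, which every Python installation can import. A function that lives in a module missing on the other side would fail at the `pickle.loads` step instead.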

I have no idea why it tries to import ai2_kit when I use a function defined outside the current function; out_add doesn't have any external dependencies. Is there a way to work around this problem? Thank you! Both the local and the remote Python are v3.9.

link89
  • `cloudpickle`, by default, only stores actual functions (rather than merely recording their module name and reimporting, as `pickle` does) in cases where it thinks this extra effort is actually needed: interactively-defined functions, and functions defined in the `__main__` module. Your imported function looks like something that the normal `pickle` strategy would work with, so that's what gets used. You can call `cloudpickle.register_pickle_by_value(module)` to override this behavior. – jasonharper Jan 31 '23 at 04:51
  • I didn't get the idea. What's the difference between `in_add` and `out_add`? They are in the same file, and the only difference is that one is defined inside another function (a closure) and the other is defined at module level. – link89 Jan 31 '23 at 05:01
  • `in_add` is something that normal `pickle` couldn't handle, due to not having a globally-defined name; `cloudpickle` recognizes that fact, and encodes the actual function definition instead of just the name. `out_add` is something that normal `pickle` could handle just fine (assuming that the same module is available to the program doing the unpickling), so `cloudpickle` doesn't think it needs to do anything special with it. – jasonharper Jan 31 '23 at 05:10
  • @jasonharper I see. Thank you for the explanation. Is it possible to tell `cloudpickle` to encode `out_add` the way it does `in_add`? For example, with a decorator? – link89 Jan 31 '23 at 05:18
  • I answered that in my first comment. It doesn't appear possible to override the behavior for a specific function, only for an entire module. – jasonharper Jan 31 '23 at 05:21

1 Answer


I figured out a solution based on @jasonharper's explanation:

in_add is something that normal pickle couldn't handle, due to not having a globally-defined name; cloudpickle recognizes that fact, and encodes the actual function definition instead of just the name. out_add is something that normal pickle could handle just fine (assuming that the same module is available to the program doing the unpickling), so cloudpickle doesn't think it needs to do anything special with it.
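
The difference can be seen with the standard-library `pickle` alone. This is a small sketch, assuming the code is run as an ordinary script; `make_in_add` is a hypothetical wrapper added only so that `in_add` is a nested function.

```python
import pickle


def out_add(a, b):
    return a + b


def make_in_add():
    def in_add(a, b):
        return a + b
    return in_add


# A module-level function is pickled as a reference: module name plus qualified name.
payload = pickle.dumps(out_add)
stored_by_name = out_add.__qualname__.encode() in payload

# A nested function has no importable name, so plain pickle refuses it.
try:
    pickle.dumps(make_in_add())
    nested_picklable = True
except (pickle.PicklingError, AttributeError):
    nested_picklable = False

print(stored_by_name, nested_picklable)
```

Because the payload for `out_add` holds only names, unpickling has to import the defining module, which is where the `ModuleNotFoundError` on the remote side comes from.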

Just define out_add (and any other function meant to run remotely) inside a wrapper function, for example:

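def __export_remote_functions():
    # Functions defined inside another function have no importable name,
    # so cloudpickle serializes their code instead of a module reference.
    def add(a, b):
        return a + b

    def prod(a, b):
        return a * b

    return (add, prod)

(add, prod) = __export_remote_functions()

executor.run_python_fn(add)(1, 2)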

And now everything is fine.

Functions that are going to be executed remotely should be defined this way, and they must not depend directly on any functions or classes defined in the local module (installed packages are OK, as long as they are also installed in the remote Python).

A good practice is to create a dedicated remote package in your project, define all functions and classes in that package this way, and avoid importing anything from outside the remote package.
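
To check whether a function was really defined this way, one option is to look at its `__qualname__`. The sketch below is an assumption-laden illustration, not cloudpickle's actual logic: `is_module_level` is a hypothetical helper, and the wrapper uses a single leading underscore purely for readability.

```python
def _export_remote_functions():
    def add(a, b):
        return a + b

    def prod(a, b):
        return a * b

    return add, prod


add, prod = _export_remote_functions()


def is_module_level(fn):
    """Hypothetical check: True when fn can be found again by its qualified name."""
    return "<locals>" not in fn.__qualname__


print(add.__qualname__)
print(is_module_level(add), is_module_level(is_module_level))
```

A function whose qualified name contains `<locals>` cannot be imported by name, which is the situation in which `cloudpickle` falls back to encoding the function definition itself.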

link89