
I need to run a Python script in the system-wide 32-bit Python to generate / collect data through a third-party application. But I want to process the data on the GPU via numba, so that part has to run in a 64-bit Python environment.

I have set up a 64-bit Python virtualenv and tested some simple numba code that ran fine in there. So how should I write the code so that the parent process invokes a child process (multiprocessing or subprocess, I assume) that switches to the 64-bit virtualenv and does the calculation using numba? More specifically:

  1. Should I use multiprocessing or subprocess to implement the parent (32-bit Python) and child (64-bit Python) process mechanism?
  2. How should I pass a large amount of data between the parent and child processes?

Possible code sample:

import os
import multiprocessing

def func_32():
    # data collection
    # using 3rd party API
    return data

def func_64(data, output):
    # switch to 64 bit virtual env
    # using virtualenvwrapper-win
    os.system('workon env64bit')
    # numba data process
    # results stored in output
    return None

def main():
    data = func_32()
    # I think I only need one process since it will be in GPU not CPU
    p = multiprocessing.Process(target=func_64, args=(data, output))
    p.start()
    return output

anything I'm missing in the sample code?

graffaner
    Generally speaking, a 32-bit Python process could launch a 64-bit Python process by explicitly passing the path to the 64-bit Python interpreter executable to `subprocess()` (and passing the name of script file to run as an argument). I don't think `multiprocessing.Process` would work in such a mixed environment since there's no way to specify an interpreter path to it. I don't know how using `virtualenv` would affect things, but it seems like it might work. You can get the path to the 64-bit interpreter from `sys.executable` (when running with that version). – martineau Oct 21 '18 at 20:09
  • @martineau I have tried `subprocess()` but had hard time finding a good way to pass data from `func_32()` to `func_64()`. Seems like data have to go through `stdin` and `stdout`. I ended up using `multiprocessing`, after I found there is a `multiprocessing.set_executable(r"C:\Python64\Python.exe")` – graffaner Oct 24 '18 at 14:12
  • Perhaps you can use named pipes as shown in [this answer](https://stackoverflow.com/a/28840955/355230) to a question about them. – martineau Oct 24 '18 at 14:28
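The subprocess route from the comments can avoid the stdin/stdout difficulty by pickling the data across the pipes instead of passing text. A minimal sketch, with two assumptions: `sys.executable` stands in for the real 64-bit interpreter path (r'C:\Python64\python.exe' in the question's setup), and doubling each element is a placeholder for the numba processing. The `sys.stdin.buffer` / `sys.stdout.buffer` spelling is Python 3; under 2.7 the streams would be used directly.

```python
import pickle
import subprocess
import sys

# The child program: reads pickled input from stdin, writes a pickled
# result to stdout.  In the real setup this would live in its own
# script file executed by the 64-bit interpreter.
CHILD = r"""
import pickle, sys
data = pickle.load(sys.stdin.buffer)
result = [x * 2 for x in data]          # stand-in for the numba work
pickle.dump(result, sys.stdout.buffer)
"""

def run_in_64bit(data, interpreter=sys.executable):
    # interpreter would be the 64-bit python.exe path in the question's
    # setup; sys.executable is used here so the sketch is self-contained.
    proc = subprocess.Popen([interpreter, "-c", CHILD],
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE)
    out, _ = proc.communicate(pickle.dumps(data))
    return pickle.loads(out)

print(run_in_64bit([1, 2, 3]))  # [2, 4, 6]
```

Because `communicate()` buffers the whole payload in memory on both ends, this is workable for moderately large data but not for streaming.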

1 Answer


I saw the question Spawn multiprocessing.Process under different python executable with own path, and found my answer with a little twist, given my Python versions (2.7.5 32-bit, 2.7.15 64-bit).

import multiprocessing

def func_32():
    # data collection
    # using 3rd party API
    return data

def func_64(data, output):
    # switch to 64 bit Python
    # not directly calling virtualenv
    # to switch env
    import sys
    print sys.executable
    # will print C:\Python64\Python.exe
    # numba data process
    # results stored in output
    return None

def main():
    data = func_32()
    multiprocessing.set_executable(r'C:\Python64\python.exe')
    p = multiprocessing.Process(target=func_64, args=(data, output))
    p.start()
    p.join()
    return output

But in order to use the 64-bit Python packages in that virtual environment, I ended up copying the code, mostly from activate_this.py (which resides in the virtualenv's Scripts folder), to change the Python search path, etc. See my answer in Spawn multiprocessing.Process under different python executable with own path.
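The activation mechanism referred to here is virtualenv's own idiom: exec the env's activate_this.py with `__file__` pointing at it (`execfile()` under 2.7), which rewrites `sys.path` and related variables for the running interpreter. A minimal sketch; since the real 64-bit virtualenv only exists on the original machine, a throwaway stand-in script that just sets a marker environment variable is generated here to demonstrate the mechanism:

```python
import os
import tempfile
import textwrap

# Stand-in for the virtualenv's Scripts\activate_this.py; the real file
# adjusts sys.path, this throwaway one only sets a marker variable.
with tempfile.NamedTemporaryFile('w', suffix='.py', delete=False) as f:
    f.write(textwrap.dedent("""
        import os
        os.environ['ENV64_ACTIVATED'] = '1'
    """))
    activate_this = f.name

# virtualenv's documented activation idiom (execfile() under Python 2.7):
exec(open(activate_this).read(), dict(__file__=activate_this))

print(os.environ['ENV64_ACTIVATED'])  # 1
os.remove(activate_this)
```

In the real setup, `activate_this` would be the path to the 64-bit env's activate_this.py, and the exec call would run inside `func_64` before importing numba.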

To me, using multiprocessing is the more convenient way of passing data between the parent and child processes, especially a large amount of data.
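Fleshed out, that could look like the sketch below, using a Queue to return results (assumptions: `sum()` stands in for the numba processing, and the `set_executable` call is commented out so the sketch runs under whatever interpreter executes it). Note that for large results the parent should `queue.get()` before `p.join()`, since `join()` can deadlock while the child blocks writing to a full pipe:

```python
import multiprocessing

def func_64(data, queue):
    # placeholder for the numba GPU processing
    result = sum(data)
    queue.put(result)

def main():
    data = list(range(1000))          # stand-in for func_32()
    # In the real setup, point the child at the 64-bit interpreter:
    # multiprocessing.set_executable(r'C:\Python64\python.exe')
    queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=func_64, args=(data, queue))
    p.start()
    result = queue.get()              # fetch before join() to avoid a
                                      # deadlock on a full pipe buffer
    p.join()
    return result

if __name__ == '__main__':
    print(main())
```

Queue items are pickled under the hood, so anything sent this way must be picklable; for very large arrays, shared memory (e.g. `multiprocessing.Array`) avoids the copy.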

graffaner