
I'm developing a C# app that runs on Windows. Some of its processing is written in Python and called via pythonnet (Python for .NET). This processing is calculation-heavy, so I want to run it in parallel.

The tasks are CPU-bound and can be handled independently.

As far as I know, there are two possible ways to do this:

  1. Launch multiple Python runtimes
    The first way is to launch multiple Python interpreters, but it seems infeasible, because pythonnet apparently can manage only one interpreter, which is initialized by the static method PythonEngine.Initialize().
    From the Python.NET documentation:

    Important Note for embedders: Python is not free-threaded and uses a global interpreter lock to allow multi-threaded applications to interact safely with the Python interpreter. Much more information about this is available in the Python C-API documentation on the www.python.org Website.
    When embedding Python in a managed application, you have to manage the GIL in just the same way you would when embedding Python in a C or C++ application.
    Before interacting with any of the objects or APIs provided by the Python.Runtime namespace, calling code must have acquired the Python global interpreter lock by calling the PythonEngine.AcquireLock method. The only exception to this rule is the PythonEngine.Initialize method, which may be called at startup without having acquired the GIL.

  2. Use the multiprocessing package in Python
    The other way is to use the multiprocessing package. According to the Python documentation, the following statement is necessary on Windows to ensure that only a finite number of processes is spawned:
    if __name__ == "__main__":
    However, because the Python code is embedded in .NET, it is treated as part of a module rather than as the main script.
    For example, the following code runs, but it spawns processes infinitely.

// C#
static void Main(string[] args)
{
    using (Py.GIL())
    {
        PythonEngine.Exec(
            "print(__name__)\n" +            // output is "builtins"
            "if __name__ == 'builtins':\n" +
            "    import test_package\n" +    // imports the Python code below
            "    test_package.async_test()\n"
        );
    }
}
# Python
import concurrent.futures

def heavy_calc(x):
    for i in range(int(1e7) * x):
        i*2

def async_test():
    # multiprocessing
    with concurrent.futures.ProcessPoolExecutor(max_workers=8) as executor:
        futures = [executor.submit(heavy_calc,x) for x in range(10)]
        (done, notdone) = concurrent.futures.wait(futures)
        for future in futures:
            print(future.result())
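
One direction that may be worth trying for option 2, sketched under assumptions the post does not confirm: on Windows, multiprocessing starts each worker by launching sys.executable, and in an embedded interpreter that may point at the host .exe rather than python.exe, which would be consistent with the endless relaunching. The standard-library call multiprocessing.set_executable() lets an embedder name the interpreter to launch instead. The helper name use_real_python and its default path are hypothetical.

```python
import multiprocessing
import os
import sys


def use_real_python(python_exe=None):
    """Tell multiprocessing which interpreter to launch for worker processes.

    Assumption: when Python is embedded (e.g. via pythonnet), sys.executable
    may be the host application, so spawned workers would restart the host.
    """
    if python_exe is None:
        # Hypothetical default: python.exe inside the installed runtime.
        python_exe = os.path.join(sys.exec_prefix, "python.exe")
    multiprocessing.set_executable(python_exe)
    return python_exe
```

If this helper lived in test_package, the embedded snippet would call test_package.use_real_python() once before test_package.async_test(). Whether that alone stops the relaunching under pythonnet is untested here.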

Is there a good way to solve this problem? Any comments would be appreciated. Thanks in advance.

sfb
  • What about using ThreadPoolExecutor instead of ProcessPool one? – Evk Dec 29 '17 at 06:07
  • Just to have it mentioned at least once - is it a possibility to translate your Python stuff into C# and be happy with easy-to-use thread pools etc.? :) "spawns processes infinitely." Do these processes at least do their jobs? Do they get sh** done? If yes, your only problem is that endless spawn of processes. I'm asking because I did not fully understand what is going wrong (besides the infinite process spawn) – Tobias Theel Dec 29 '17 at 14:17
  • @Evk As I mentioned in the post, the calculation is CPU-bound, so ProcessPoolExecutor is suitable because ThreadPoolExecutor cannot make it faster due to Python's GIL. Please see https://stackoverflow.com/questions/1226584/multiprocess-or-threading-in-python – sfb Dec 30 '17 at 06:54
  • @TobiasTheel The Python code includes a lot of advanced calculation such as machine learning, so translating it may take a long time. That's true, my only problem is the "endless spawn of processes". If you run the code in the post, the processes perhaps do their jobs at least once, but they keep spawning infinitely, so my Windows machine froze. According to the [Python documentation](https://docs.python.org/3/library/multiprocessing.html#multiprocessing-programming), we have to protect the "entry point" with if __name__ == "__main__":. I want to know how to do that in an embedding application. – sfb Dec 30 '17 at 07:23
  • I don't fully understand. How do the two code blocks you posted go together? Does the upper one call the lower one? If so: why do you make the main executable in C# and not in python and import pythondotnet and call your C# code from within python? – hansaplast Dec 30 '17 at 21:08
  • @hansaplast Sorry for the confusion. C# calls Python. The C# code above is a console app, but what I'm actually developing is a GUI app, such as a Forms app, that contains heavy calculation written in Python. – sfb Dec 31 '17 at 03:27
  • Why not have a console script in Python, and just spawn multiple command-line processes from the C# app? Stop thinking of launching Python from C#, and instead just run commands from C#. Set up a process pool and just execute `python my_script.py arg1 arg2 arg3...` as the command. If you want to get fancy, use `click` from Python and install the script as a console script target. http://python-packaging.readthedocs.io/en/latest/command-line-scripts.html#the-console-scripts-entry-point – Brian Pendleton Jan 02 '18 at 16:26
  • @BrianPendleton The arguments I want to pass to Python are relatively large, since they are a kind of big data. So if we use the command line, we have to pass the data through a file or a database. From the point of view of access speed, I think that approach is not preferable. But it is worth considering if there's no way to pass them through memory. – sfb Jan 05 '18 at 03:14
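
If the command-line route from the comments is tried, large inputs do not necessarily have to go through a file or database: the parent process can write them to the child's standard input and read the result from standard output. Below is a minimal sketch of such a worker, assuming JSON is an acceptable wire format. The script name my_script.py comes from the comment above; process and the --scale option are hypothetical stand-ins for the real calculation and its parameters.

```python
import argparse
import json
import sys


def process(values, scale):
    """Placeholder for the real CPU-heavy calculation (hypothetical)."""
    return [v * scale for v in values]


def main(argv=None, stdin=sys.stdin, stdout=sys.stdout):
    """Read a JSON list from stdin, process it, write a JSON list to stdout."""
    parser = argparse.ArgumentParser(description="Worker fed through stdin.")
    parser.add_argument("--scale", type=float, default=1.0)
    args = parser.parse_args(argv)

    values = json.load(stdin)  # the large payload arrives through the pipe
    json.dump(process(values, args.scale), stdout)
    return 0
```

Saved as my_script.py with a call to sys.exit(main()) under the usual if __name__ == "__main__": guard, it could be started several times in parallel from C#, each instance with its own redirected input and output streams. Serialization cost for very large data is not measured here.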

1 Answer


For each Python call:

  1. Create an AppDomain.
  2. Create a task in the AppDomain that runs the Python code asynchronously.

Since they are separate AppDomains, the static methods will be independent.

Creating and using an AppDomain is heavy, so I wouldn't do it if the number of calls is extremely large, but it sounds like you might have only a small number of processes to run asynchronously.

Ctznkane525
  • I didn't know about AppDomain until now, thanks. It seems it could work, but I'm not sure how many processes count as extremely large (of course, I know that it depends on many factors). I'll check whether it goes well. – sfb Jan 05 '18 at 07:41
  • Sure. It will take some time to try it, so please wait for a while. This usage of AppDomain looks different from the intended one, but it's worth trying. – sfb Jan 05 '18 at 16:24
  • @sfb Have you been able to solve your issue with the use of separate AppDomain(s)? I am actually facing the exact same issue from my WPF app. – Amaury Levé Jun 30 '21 at 14:48