I only recently started learning Python and needed to build a phylogenetic tree. However, because of the tree's large size and the GIL, the computation took extremely long.

As an experiment, I wrote a small test class. When I tried to run one of its methods on multiple processes, the method was simply executed twice in full, once per process, instead of the work being shared.

import time
import multiprocessing

class Test:

    def sum_data(self):

        total = 0

        for i in range(1, 10000):
            for j in range(1, 10000):
                total += i + j

        print(f"The result is {total}")


def inter_funct():
    obj = Test()    # named obj to avoid shadowing the built-in `object`
    obj.sum_data()

if __name__ == '__main__':
    starttime = time.time()
    processes = []
    for i in range(0, 2):
        p = multiprocessing.Process(target=inter_funct)
        processes.append(p)
        p.start()

    for process in processes:
        process.join()

    print('That took {} seconds'.format(time.time() - starttime))

Is it possible to combine the efforts of 2 processes on 1 object, rather than creating 2 processes and 2 objects?

Nock

1 Answer

Generally, objects aren't shared between processes. Each process has its own virtual memory and can't access that of the other process.

If you want to exchange data between processes, you need to use the multiprocessing classes for data exchange. I'll be using a Queue in the code below.

However, sharing the object isn't the real problem here. The problem is that each process does all of the work, so the work is done twice, and that wouldn't change even if you had only one object.

You need to design for parallelization. Each of the two processes should do only half of the work, and at the end you combine both partial results into a single result.

In order to do that, you need to be able to pass arguments to the sum_data method, so that each process can do a different half of the work.

import time
import multiprocessing

class Test:
    def sum_data(self, begin, end):    # allow for partial computation
        total = 0
        for i in range(begin, end):
            for j in range(1, 10000):
                total += i + j
        return total                   # don't print here: this is only a partial sum


def inter_funct(begin, end, q):
    obj = Test()                      # named obj to avoid shadowing the built-in `object`
    q.put(obj.sum_data(begin, end))   # Communicate the result back to main


if __name__ == '__main__':
    starttime = time.time()
    q = multiprocessing.Queue()
    processes = []
    p = multiprocessing.Process(target=inter_funct, args=(1, 5001, q))
    processes.append(p)
    p.start()
    p = multiprocessing.Process(target=inter_funct, args=(5001, 10000, q))
    processes.append(p)
    p.start()

    total = q.get() + q.get()                # Combine the parts
    print("The total is " + str(total))

    for process in processes:
        process.join()

    print('That took {} seconds'.format(time.time() - starttime))

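Make sure you read everything from the queue before joining the processes. A child process that has put items on a queue may not terminate until those items have been consumed, so joining first can leave your program in a deadlock.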
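The same split extends to more than two workers. Below is a minimal sketch, assuming you are free to use `multiprocessing.Pool` instead of managing `Process` and `Queue` by hand. The names `partial_sum`, `split_range` and `workers` are made up for this illustration, and the loop bound `N` is reduced from 10000 so the sketch finishes quickly.

```python
import multiprocessing

N = 2000  # assumption: smaller than the original 10000 to keep the sketch fast


def partial_sum(bounds):
    """Sum i + j for i in [begin, end) and j in [1, N)."""
    begin, end = bounds
    total = 0
    for i in range(begin, end):
        for j in range(1, N):
            total += i + j
    return total


def split_range(start, stop, parts):
    """Split [start, stop) into `parts` contiguous (begin, end) chunks."""
    size, extra = divmod(stop - start, parts)
    chunks = []
    begin = start
    for k in range(parts):
        # The first `extra` chunks get one additional element each.
        end = begin + size + (1 if k < extra else 0)
        chunks.append((begin, end))
        begin = end
    return chunks


if __name__ == '__main__':
    workers = 4  # assumption: pick a value that suits your machine
    with multiprocessing.Pool(processes=workers) as pool:
        # Each worker computes one chunk; map returns the partial sums in order.
        parts = pool.map(partial_sum, split_range(1, N, workers))
    print("The total is " + str(sum(parts)))
```

`Pool.map` takes care of sending the arguments to the workers and collecting the return values, so there is no queue to drain and no explicit `join` to order correctly.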

Thomas Weller
  • Thanks! But I would like to clarify a bit further. If I create an object that contains, say, a phylogenetic tree calculator, then I won't be able to use several processes to run that object's computation? After all, computing the tree is integral and indivisible, since the leaves (end nodes) are compared with each other. Right? – Nock May 20 '23 at 10:18
  • You really should be executing `total = q.get() + q.get()` *before* joining the child processes. You might get away with not doing it here since each child process is only putting one item on the queue and therefore it is not likely that it is blocked waiting for the main process to get the items. But *in general* child processes that have issued `put` calls to a queue cannot terminate (and therefore be joined) until some other thread has retrieved these items. – Booboo May 20 '23 at 13:16
  • @Nock: if the tree has 2 branches at the root, one process could count the left side, the other process could count the right side and the main method adds both branches plus the root node. – Thomas Weller May 20 '23 at 15:21
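The left/right split described in the last comment could look roughly like the sketch below. It assumes a plain binary tree; the `Node` class and `count_nodes` function are hypothetical and not taken from any phylogenetics library. Note that each subtree is pickled and copied into its worker process, so the workers do not share the original object.

```python
import multiprocessing


class Node:
    """Hypothetical binary tree node, used only for this illustration."""

    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right


def count_nodes(node):
    """Count the nodes in the subtree rooted at `node` (0 for an empty subtree)."""
    if node is None:
        return 0
    return 1 + count_nodes(node.left) + count_nodes(node.right)


if __name__ == '__main__':
    # Small example tree: 3 nodes on the left side, 2 on the right side.
    root = Node(Node(Node(), Node()), Node(None, Node()))
    with multiprocessing.Pool(processes=2) as pool:
        # One process counts the left branch, the other the right branch.
        left, right = pool.map(count_nodes, [root.left, root.right])
    # The main process adds both branches plus the root node.
    print("Node count:", left + right + 1)
```

Whether a real tree computation can be divided this way depends on the algorithm. Counting works because the two branches are independent; pairwise comparisons between leaves would need a different split.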