I'm learning how to use multiprocessing and started with simple tasks:

import multiprocessing as mp
import time

starttime = time.time()

Result_1 = []
Result_2 = []
Result_3 = []

def Calculation_1():
    numbers = list(range(0, 10000000))
    for num in numbers:
        Result_1.append(num ** 0.5)

def Calculation_2():
    numbers = list(range(0, 10000000))
    for num in numbers:
        Result_2.append(num ** 2)

def Calculation_3():
    numbers = list(range(0, 10000000))
    for num in numbers:
        Result_3.append(num ** 3)

if __name__ == "__main__":

    p1 = mp.Process(target = Calculation_1)
    p2 = mp.Process(target = Calculation_2)
    p3 = mp.Process(target = Calculation_3)
    
    p1.start()
    p2.start()
    p3.start()
    
    p1.join()
    p2.join()
    p3.join()
    
    endtime = time.time()
    print("Time =", "{:.2f}".format((endtime - starttime) * (10 ** 3)), "ms")

The goal is to run all three functions simultaneously instead of sequentially. However, all my result lists are empty after the processes finish. How do I get this right?

Thank you very much.

vnc89
  • I have added another answer, since the one you approved does not meet your stated goal of calculating the functions simultaneously. I'm running Python 3.10 on Windows 10. The results may differ on another OS; I would be interested if you or anyone else has information about that. – Paul Cornelius Sep 30 '22 at 01:36

2 Answers

Use threading.Thread instead of multiprocessing.Process; it will work fine.

from threading import Thread
import time

starttime = time.time()

Result_1 = []
Result_2 = []
Result_3 = []

def Calculation_1():
    numbers = list(range(0, 10000000))
    for num in numbers:
        Result_1.append(num ** 0.5)

def Calculation_2():
    numbers = list(range(0, 10000000))
    for num in numbers:
        Result_2.append(num ** 2)

def Calculation_3():
    numbers = list(range(0, 10000000))
    for num in numbers:
        Result_3.append(num ** 3)

if __name__ == "__main__":

    p1 = Thread(target = Calculation_1)
    p2 = Thread(target = Calculation_2)
    p3 = Thread(target = Calculation_3)
    
    p1.start()
    p2.start()
    p3.start()
    
    p1.join()
    p2.join()
    p3.join()
    
    endtime = time.time()
    print("Time =", "{:.2f}".format((endtime - starttime) * (10 ** 3)), "ms")


codester_09
  • Thank you very much, it works now. May I ask also how threading differs from multiprocessing in this case? – vnc89 Sep 29 '22 at 09:03
  • @vnc89 You can find this question answer here: [Multiprocessing vs Threading Python](https://stackoverflow.com/questions/3044580/multiprocessing-vs-threading-python) – codester_09 Sep 29 '22 at 09:27
  • Thank you, that clarifies a lot. One last thing I'm still stuck with is why the above code for multiprocessing does not work whereas it does with threading. – vnc89 Sep 29 '22 at 10:01
  • ok, I'll study more about this. Thank you very much for your help. – vnc89 Sep 29 '22 at 10:43

Your program doesn't work because each Process in Python occupies its own memory space. You have 4 Processes: the main one, and three other ones that you create by calling Process() 3 times. All four of them have their own set of global variables, which means that all 4 Processes have global variables named "Result_1", "Result_2" and "Result_3". So each Process works with its own version of these objects, and they are not the same object. This is far from obvious when you just read the source code, and it definitely takes a while to wrap your head around this concept.
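
A minimal sketch of this behavior, assuming nothing beyond the standard library. The names results, fill_results and run_child_and_count are made up for illustration and do not appear in the question:

```python
import multiprocessing as mp

results = []  # module-level list; every process gets its own copy


def fill_results():
    # Runs in the child process and appends to the child's copy only.
    for n in range(5):
        results.append(n ** 2)
    print("child sees", len(results), "items")


def run_child_and_count():
    # Start one child, wait for it, then report the length of the
    # parent's list, which the child never touched.
    p = mp.Process(target=fill_results)
    p.start()
    p.join()
    return len(results)


if __name__ == "__main__":
    print("parent sees", run_child_and_count(), "items")
```

The child reports 5 items while the parent reports 0, which is the same effect that leaves Result_1, Result_2 and Result_3 empty in the question.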

When Process p1 modifies Result_1, it modifies its own instance of that list. It's a different object than the one used by your main Process, even though at the source code level they both have the same name. When you look at the contents of Result_1 in your main Process, it is empty. That's because your main Process doesn't know what Process p1 did. Sharing data between Processes is not a trivial problem. The Python standard library has some tools for this, but they must be used carefully.
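
One of those standard library tools is multiprocessing.Queue. The following is only a sketch under the assumption that the result is small enough to copy between processes; square_roots and collect_square_roots are hypothetical names:

```python
import multiprocessing as mp


def square_roots(count, queue):
    # Runs in the child: build the list locally, then send it to the parent.
    queue.put([n ** 0.5 for n in range(count)])


def collect_square_roots(count):
    queue = mp.Queue()
    p = mp.Process(target=square_roots, args=(count, queue))
    p.start()
    # Read the result before join(): a child that still has queued data
    # to flush may not exit, so joining first can deadlock on large payloads.
    result = queue.get()
    p.join()
    return result


if __name__ == "__main__":
    print(collect_square_roots(5))
```

The list is pickled in the child and rebuilt in the parent, so with 10 million items the transfer itself takes noticeable time and memory.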

Multithreading is different. Threads share a memory space, so the solution presented by codester_09 works. There is only one list named Result_1. When the secondary thread modifies it, the main thread can access the modified data immediately. However, that solution does not accomplish your stated goal of calculating all three functions simultaneously. In standard CPython, the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time, so for CPU-bound work like this, threading only creates the illusion of multitasking by switching rapidly from one thread to another. You can easily verify this by adding the following 5 lines to codester_09's listing:

t0 = time.time()
Calculation_1()
Calculation_2()
Calculation_3()
print(time.time() - t0)

This will run the three calculations sequentially, one after the other, and takes just as long as the threaded version on my machine (Win10).

The following program utilizes shared memory arrays, part of the multiprocessing module. Three such arrays are created and passed to the secondary Processes. I inserted a print statement to prove that the arrays are updated.

The program's overall execution time is less than half that of the sequential version, which meets your stated goal. The speedup is not threefold, as you might expect, due to some system-level complexities of using shared memory (I think). But it's distinctly faster than the threaded version and it works.

import multiprocessing as mp
import time

LENGTH = 10000000

def Calculation_1(x):
    for n in range(LENGTH):
        x[n] = n ** 0.5
    print("C1 finished")

def Calculation_2(x):
    for n in range(LENGTH):
        x[n] = n ** 2
    print("C2 finished")

def Calculation_3(x):
    for n in range(LENGTH):
        x[n] = n ** 3
    print("C3 finished")
    
def main():
    starttime = time.time()
    x1 = mp.Array("d", LENGTH, lock=False)
    x2 = mp.Array("d", LENGTH, lock=False)
    x3 = mp.Array("d", LENGTH, lock=False)

    p1 = mp.Process(target = Calculation_1, args=(x1,))
    p2 = mp.Process(target = Calculation_2, args=(x2,))
    p3 = mp.Process(target = Calculation_3, args=(x3,))

    p1.start()
    p2.start()
    p3.start()

    p1.join()
    p2.join()
    p3.join()

    for x in (x1, x2, x3):
        print(x[1], x[-1], len(x))

    endtime = time.time()
    print("Time =", "{:.2f}".format((endtime - starttime) * (10 ** 3)), "ms")

if __name__ == "__main__":
    main()
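
As an alternative sketch, not the approach used in the program above, multiprocessing.Pool can run the three functions in separate worker processes and hand their return values back to the parent. The function names roots, squares, cubes and run_all are made up for illustration:

```python
import multiprocessing as mp


def roots(count):
    return [n ** 0.5 for n in range(count)]


def squares(count):
    return [n ** 2 for n in range(count)]


def cubes(count):
    return [n ** 3 for n in range(count)]


def run_all(count):
    # Submit the three jobs to three worker processes, then collect each
    # return value while the pool is still open.
    with mp.Pool(processes=3) as pool:
        pending = [pool.apply_async(f, (count,)) for f in (roots, squares, cubes)]
        return [job.get() for job in pending]


if __name__ == "__main__":
    r1, r2, r3 = run_all(1000)
    print(r1[-1], r2[-1], r3[-1])
```

Each result list is pickled and copied back to the parent, so for very large outputs the shared arrays above may still be the cheaper choice; this has not been measured here.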
Paul Cornelius
  • After reading codester_09's and your answer, I now have a better understanding regarding multiprocessing and threading. The key part seems to be sharing information between processes. Thank you so much for your help. – vnc89 Sep 30 '22 at 03:22