
I want to learn multiprocessing/threading in Python. I have written a simple piece of code, shown below. How can I speed it up with multiprocessing/threading?

The code is very simple.

from numpy import zeros

my_numbers = {}
for i in range(my_number1):  # my_number1, my_number2: predefined integers
    my_numbers[i] = zeros(my_number2, dtype=int)

Now I just want to add 1 to each number in the lists:

for i in my_numbers:
    my_numbers[i] += 1

How can I use multiprocessing/threading to speed up the for loop?

p.s. 1: `my_numbers = ones(my_number2, dtype=int)` is not what I want; I am trying to speed up the calculation by multiprocessing the for loop.

p.s. 2: I have 12 CPUs and 32 GB of RAM.

Cœur
m.i.cosacak
  • `my_number2` doesn't change inside the loop. Why not calculate `z = zeros(my_number2, dtype=int) + 1` once and then `my_numbers = {i: z for i in range(my_number1)}`? – Patrick Haugh Jan 09 '18 at 02:38
  • I am interested in learning multithreading on for loops. I have a for loop that takes each item and performs some analysis on it. This is just a simple example for me to understand the logic of multithreading on for loops. – m.i.cosacak Jan 09 '18 at 02:41
  • In that case: https://stackoverflow.com/questions/6832554/python-multiprocessing-how-do-i-share-a-dict-among-multiple-processes – Patrick Haugh Jan 09 '18 at 02:42
  • To understand the logic of multithreading and multiprocessing, I suggest you read Python's documentation and online tutorials first. Your example is very simple, and there's much more to it than would surface from this particular problem. Also, the choice of modules and the performance will depend on your actual problem, and a simple problem like this may not be representative. – atru Jan 09 '18 at 02:43
  • I have watched several tutorials and posts, but I did not get the logic. That is why I made my question simple, so I can understand it. Let's say I have a function `def add_one(mylist): return [x+1 for x in mylist]`. In the for loop, the function will get a list and add one to each element. How do I do this with multiprocessing? That is my question. I hope it is clear now. – m.i.cosacak Jan 09 '18 at 02:52
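
Building on the `add_one` example in the last comment, here is a minimal sketch of how such a function could be handed to multiprocessing.Pool (the input data and the choice of 4 workers are illustrative, not from the original post):

from multiprocessing import Pool

def add_one(mylist):
    # The per-item work from the comment above: add 1 to every element.
    return [x + 1 for x in mylist]

if __name__ == '__main__':
    # Illustrative input: eight small lists to be processed in parallel.
    lists = [[1, 2, 3], [4, 5, 6]] * 4

    with Pool(processes=4) as pool:
        # Pool.map sends one list per task to the 4 worker processes.
        results = pool.map(add_one, lists)
    print(results)

For lists this small, the overhead of pickling and sending data between processes will usually outweigh the work itself; a pool only pays off when each call to add_one does substantial computation.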

1 Answer


Here is what I wanted to do, but it is slower than the plain loop.

from multiprocessing import Process, Manager
from numpy import array
import time

def f(d, i):
    # Each worker adds its key to one entry of the shared dict.
    d[i] += i

if __name__ == '__main__':
    manager = Manager()
    t = time.time()
    d = manager.dict()
    for i in range(100):
        d[i] = array([0, 0])
    k = list(d.keys())
    # Work through the keys in batches of four processes at a time.
    while len(k) >= 4:
        procs = [Process(target=f, args=(d, key)) for key in k[:4]]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        k = k[4:]
    # Two or three leftover keys each get their own process.
    if len(k) > 1:
        procs = [Process(target=f, args=(d, key)) for key in k]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
    # A single leftover key is updated directly in the main process.
    elif len(k) == 1:
        d[k[0]] += k[0]
    print(time.time() - t)
    print(d)

    # Plain single-process version for comparison.
    tt = time.time()
    d = {}
    for i in range(100):
        d[i] = array([0, 0])
    for i in d:
        d[i] += i
    print(time.time() - tt)
    print(d)

I would appreciate any suggestions for improvement. Thanks.
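
One possible improvement, sketched under the assumption that the real work is heavier than adding 1: reuse a fixed set of workers via multiprocessing.Pool and return results instead of writing every update through a Manager dict, since creating a fresh Process per key and round-tripping through the manager proxy is likely where most of the time goes. The helper name bump and the pool size of 4 are illustrative.

from multiprocessing import Pool
from numpy import array
import time

def bump(item):
    # item is a (key, array) pair; do the per-key work and return the new pair.
    key, value = item
    return key, value + key

if __name__ == '__main__':
    d = {i: array([0, 0]) for i in range(100)}

    t = time.time()
    with Pool(processes=4) as pool:
        # Four worker processes are reused for all 100 items,
        # so no Process is created per key and no Manager dict is needed.
        d = dict(pool.map(bump, d.items()))
    print(time.time() - t)
    print(d)

Even so, for work as light as a single addition, the cost of pickling the arrays and shipping them between processes will likely still make this slower than the plain loop; parallelism only helps once the per-key computation dominates that overhead.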

m.i.cosacak