3

I'm writing my first multiprocessing program in Python.

I want to create a list of values to be processed, and have 8 processes (the number of CPU cores) consume and process that list.

I wrote the following Python code:

__author__ = 'Rui Martins'

from multiprocessing import cpu_count, Process, Lock, Value

def proc(lock, number_of_active_processes, valor):
    lock.acquire()
    number_of_active_processes.value+=1
    print "Active processes:", number_of_active_processes.value
    lock.release()
    # DO SOMETHING ...
    for i in range(1, 100):
        valor=valor**2
    # (...)
    lock.acquire()
    number_of_active_processes.value-=1
    lock.release()

if __name__ == '__main__':
    proc_number=cpu_count()
    number_of_active_processes=Value('i', 0)
    lock = Lock()
    values=[11, 24, 13, 40, 15, 26, 27, 8, 19, 10, 11, 12, 13]
    values_processed=0

    processes=[]
    for i in range(proc_number):
        processes+=[Process()]
    while values_processed<len(values):
        while number_of_active_processes.value < proc_number and values_processed<len(values):
            for i in range(proc_number):
                if not processes[i].is_alive() and values_processed<len(values):
                    processes[i] = Process(target=proc, args=(lock, number_of_active_processes, values[values_processed]))
                    values_processed+=1
                    processes[i].start()

            while number_of_active_processes.value == proc_number:
                # BUG: always number_of_active_processes.value == 8 :(
                print "Active processes:", number_of_active_processes.value

    print ""
    print "Active processes at END:", number_of_active_processes.value

And I have the following problems:

  • The program never stops
  • I run out of RAM
Rui Martins
  • Daniel Sanchez, I think that multiprocessing is different from threading, and the GIL is not an issue with multiprocessing. See: http://stackoverflow.com/questions/3044580/multiprocessing-vs-threading-python?rq=1 – Rui Martins Dec 31 '15 at 16:04
  • yes, I thought about it just after posting that stupidity of mine, sorry :/ – Netwave Dec 31 '15 at 16:15

2 Answers

1

Simplifying your code to the following:

from multiprocessing import cpu_count, Process, Lock, Value

def proc(lock, number_of_active_processes, valor):
    lock.acquire()
    number_of_active_processes.value += 1
    print("Active processes:", number_of_active_processes.value)
    lock.release()
    # DO SOMETHING ...
    for i in range(1, 100):
        print(valor)
        valor = valor **2
    # (...)
    lock.acquire()
    number_of_active_processes.value -= 1
    lock.release()


if __name__ == '__main__':
    proc_number = cpu_count()
    number_of_active_processes = Value('i', 0)

    lock = Lock()
    values = [11, 24, 13, 40, 15, 26, 27, 8, 19, 10, 11, 12, 13]
    values_processed = 0

    processes = [Process() for _ in range(proc_number)]
    while values_processed < len(values)-1:
        for p in processes:
            if not p.is_alive():
                p = Process(target=proc,
                            args=(lock, number_of_active_processes, values[values_processed]))
                values_processed += 1
                p.start()

If you run it as above, with the added print(valor), you can see exactly what is happening: the number of digits in valor roughly doubles on every iteration, until you run out of memory. You are not stuck in the while loop; you are stuck in the for loop.

This is the output by the time the 12th process is running, after replacing the print with print(len(str(valor))). It was captured after a fraction of a second, and it just keeps on going:

2
3
6
11
21
.........
59185
70726
68249
73004
77077
83805
93806
92732
90454
104993
118370
136498
131073

Compare that with changing your loop to the following:

for i in range(1, 100):
    print(valor)
    valor = valor *2

The last number printed is:

 6021340351084089657109340225536

With your own code it looks as if you are stuck in the while loop, but what is actually happening is that valor keeps growing in the for loop, reaching numbers with as many digits as:

167609
180908
185464
187612
209986
236740
209986

And so on.
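The difference between the two loops can be checked without exhausting memory. The following is a minimal sketch, not part of the original answer: the helper names squaring_bit_lengths and doubling_last_printed are hypothetical, and the squaring loop is cut down to a few steps so it stays small. It uses bit_length() rather than len(str(...)), since recent Python versions limit the size of int-to-str conversions by default.

```python
def squaring_bit_lengths(valor, steps):
    """Return the bit length of valor after each of `steps` repeated squarings."""
    lengths = []
    for _ in range(steps):
        valor = valor ** 2
        lengths.append(valor.bit_length())
    return lengths


def doubling_last_printed(valor):
    """Return the last value that print(valor) would show in the doubling loop."""
    last = None
    for _ in range(1, 100):      # 99 iterations, as in the question
        last = valor             # the print happens before the doubling
        valor = valor * 2
    return last


if __name__ == "__main__":
    # Each squaring roughly doubles the size of the number.
    print(squaring_bit_lengths(19, 10))
    # Doubling only adds one bit per iteration, so the result stays small.
    print(doubling_last_printed(19))
```

For the input 19 the doubling loop ends on 19 * 2**98, which is the 31-digit number quoted above, while ten squarings of the same input already produce a number of several thousand bits.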

vabada
Padraic Cunningham
  • Hi, thanks, but I don't think the code can be changed that way, because your version uses more than 8 processes: when you do "p = Process(...", you create and start a new process, but it does not replace "p" in the original list of processes, so "if not p.is_alive()" always returns True. EXAMPLE: lista = range(15) print lista for l in lista: print l l=0 print lista – Rui Martins Jan 02 '16 at 17:15
  • Sorry, I don't speak English very well. In my code I run only 8 processes at the same time, while your code runs 13 (the size of "values"). Your code is correct and works, but I need each process to be as fast as possible, so I run only 8 processes, each with one value, and every time a process finishes I start a new one until all values have been processed. Thanks for the help :) You really have the best answer – Rui Martins Jan 02 '16 at 19:11
  • @RuiMartins, `processes = [Process() for _ in range(proc_number)]` and just looping over the processes list is the exact same as the indexing logic in your code, you can use enumerate to index when you want to `processes[i] = Process` http://pastebin.com/ksQbvMET – Padraic Cunningham Jan 02 '16 at 19:25
  • OK, it works now with enumerate ;) Thank you for your time :) – Rui Martins Jan 02 '16 at 19:42
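The enumerate approach mentioned in these comments might look roughly like the sketch below. The linked pastebin is not reproduced here, so this is an illustration under assumptions rather than the code that was actually posted: the function name run_in_slots, the None placeholders, the short sleep, and the doubling workload are all hypothetical choices.

```python
import time
from multiprocessing import Process, cpu_count


def proc(valor):
    # Placeholder workload (an assumption): doubling keeps the numbers small,
    # unlike the repeated squaring in the question.
    for _ in range(1, 100):
        valor = valor * 2
    return valor


def run_in_slots(values, slot_count):
    """Run proc on every value with at most slot_count processes alive at once."""
    slots = [None] * slot_count
    next_index = 0
    while next_index < len(values):
        for i, p in enumerate(slots):
            if next_index >= len(values):
                break
            if p is None or not p.is_alive():
                if p is not None:
                    p.join()  # reap the finished process
                # Assign by index so the list itself is updated;
                # rebinding the loop variable p would leave slots unchanged.
                slots[i] = Process(target=proc, args=(values[next_index],))
                slots[i].start()
                next_index += 1
        time.sleep(0.01)  # avoid a hot busy-wait while all slots are busy
    for p in slots:
        if p is not None:
            p.join()
    return next_index


if __name__ == "__main__":
    values = [11, 24, 13, 40, 15, 26, 27, 8, 19, 10, 11, 12, 13]
    print("Values processed:", run_in_slots(values, cpu_count()))
```

The key point is the assignment to slots[i]: it replaces the finished process in the list, so is_alive() is later checked on the process that was actually started.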
0

The problem is not your multiprocessing code. It's the power operator (**) in the for loop:

for i in range(1, 100):
    valor=valor**2

Since range(1, 100) runs 99 times, the final result would be pow(val, 2**99). That number is far too big: calculating it would cost too much time and memory, so in the end you run out of memory.

4 GB = 4 * pow(2, 10) * pow(2, 10) * pow(2, 10) * 8 bits = 2**35 bits

And for your smallest number, 8:

pow(8, 2**99) = pow(2**3, 2**99) = pow(2, 3*pow(2, 99))
bits needed to store it = about 3*pow(2, 99)
3*pow(2, 99) bits / 2**35 bits = 3*pow(2, 99-35) = 3*pow(2, 64)

So it needs about 3*pow(2, 64) times 4 GB of memory.
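This estimate can be reproduced with exponent arithmetic alone, without ever building the huge number. The sketch below is an illustration, not part of the original answer: the names BITS_IN_4GB and bits_needed are hypothetical, and it counts 99 squarings because range(1, 100) iterates 99 times.

```python
BITS_IN_4GB = 4 * 2**30 * 8   # 4 GB expressed in bits, equal to 2**35


def bits_needed(exponent_of_two, squarings):
    """Bit length of 2**exponent_of_two after `squarings` repeated squarings.

    The value becomes 2**(exponent_of_two * 2**squarings), and a power of two
    2**k occupies k + 1 bits.
    """
    return exponent_of_two * 2**squarings + 1


if __name__ == "__main__":
    # Sanity check against the real computation for small squaring counts.
    for n in range(8):
        assert (8 ** (2**n)).bit_length() == bits_needed(3, n)

    # 8 == 2**3, squared 99 times.
    needed = bits_needed(3, 99)
    print("multiples of 4 GB needed:", needed // BITS_IN_4GB)
```

The printed ratio is 3 * 2**64, far beyond what any machine could hold.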

oxnz