
I don't believe this is a duplicate of this question, because that asker's problem appeared to be caused by using multiprocessing.Pool, which I am not doing.

This program:

import multiprocessing
import time

def task_a(procrange,result):
    "Naively identify prime numbers in an iterator of integers. Procrange may not contain negative numbers, 0, or 1. Result should be a multiprocessing.queue."

    for i in procrange: #For every number in our given iterator...
        for t in range(2, (i//2)+1): #Take every number up to half of it...
            if (i % t == 0): #And see if that number goes evenly into it.
                break   #If it does, it ain't prime.
        else:
            #print(i)
            result.put(i) #If the loop never broke, it's prime.




if __name__ == '__main__':
    #We seem to get the best times with 4 processes, which makes some sense since my machine has 4 cores (apparently hyperthreading doesn't do shit)
    #Time taken more or less halves for every process up to 4, then very slowly climbs back up again as overhead eclipses the benefit from concurrency
    processcount=4
    procs=[]
    #Will search up to this number.
    searchto=11000
    step=searchto//processcount
    results=multiprocessing.Queue(searchto)
    for t in range(processcount):
        procrange=range(step * t, step * (t+1) )
        print("Process",t,"will search from",step*t,"to",step*(t+1))
        procs.append(
                     multiprocessing.Process(target=task_a, name="Thread "+str(t),args=(procrange,results))
                     )
    starttime=time.time()
    for theproc in procs:
        theproc.start()
    print("Processing has begun.")

    for theproc in procs:
        theproc.join()
        print(theproc.name,"has terminated and joined.")
    print("Processing finished!")
    timetook=time.time()-starttime

    print("Compiling results...")

    resultlist=[]
    try:
        while True:
            resultlist.append(results.get(False))
    except multiprocessing.queues.Empty:
        pass

    print(resultlist)
    print("Took",timetook,"seconds to find",len(resultlist),"primes from 0 to",searchto,"with",processcount,"concurrent executions.")

... works perfectly, giving the result:

Process 0 will search from 0 to 2750
Process 1 will search from 2750 to 5500
Process 2 will search from 5500 to 8250
Process 3 will search from 8250 to 11000
Processing has begun.
Thread 0 has terminated and joined.
Thread 1 has terminated and joined.
Thread 2 has terminated and joined.
Thread 3 has terminated and joined.
Processing finished!
Compiling results...
[Many Primes]
Took 0.3321540355682373 seconds to find 1337** primes from 0 to 11000 with 4 concurrent executions.

However, if searchto is increased by even 500...

Processing has begun.
Thread 0 has terminated and joined.
Thread 1 has terminated and joined.
Thread 2 has terminated and joined.

... and the rest is silence. Process Hacker shows the Python processes consuming 12% CPU each, petering out one by one... and never terminating. They just hang until I kill them manually.

Why?

** Clearly, either God or Guido has a cruel sense of humor.

Schilcote

1 Answer


It seems the problem is in result.put(i): when I commented that line out, the script ran to completion. So I suggest you not use multiprocessing.Queue to save the results. Instead, you can use a database: MySQL, MongoDB, etc. Note: you cannot use SQLite, because with SQLite only one process can be making changes to the database at any moment in time (from the docs).
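
For what it's worth, the multiprocessing docs (under "Joining processes that use queues") describe exactly this hang: a child process that has put items on a multiprocessing.Queue will not terminate until its feeder thread has flushed all buffered items into the underlying pipe, so calling join() before the queue has been drained can deadlock once the pipe's buffer fills. That would explain why the script works up to a certain searchto and hangs just past it. If you'd rather keep the Queue than switch to a database, here is a minimal sketch of the drain-before-join pattern; it reuses the question's names, and the searchto value here is arbitrary:

import multiprocessing
import queue
import time

def task_a(procrange, result):
    "Naively identify prime numbers in procrange and put them on result, a multiprocessing.Queue."
    for i in procrange:
        for t in range(2, (i // 2) + 1):
            if i % t == 0:
                break
        else:
            result.put(i)

if __name__ == '__main__':
    processcount = 4
    searchto = 50000  #Comfortably past the size that hung before
    step = searchto // processcount
    results = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=task_a,
                                     args=(range(step * t, step * (t + 1)), results))
             for t in range(processcount)]
    starttime = time.time()
    for p in procs:
        p.start()

    #Drain the queue BEFORE joining. A child that has put items on a
    #multiprocessing.Queue waits at exit until its feeder thread has
    #flushed everything into the pipe, so join()-ing first deadlocks
    #once the pipe's buffer fills up.
    resultlist = []
    while any(p.is_alive() for p in procs) or not results.empty():
        try:
            resultlist.append(results.get(timeout=0.1))
        except queue.Empty:
            pass
    for p in procs:
        p.join()  #Safe now: nothing is left buffered in the children

    print("Took", time.time() - starttime, "seconds to find",
          len(resultlist), "primes from 0 to", searchto)

The only structural change from the question's code is that the parent keeps pulling results while the children run and only join()s after the queue is empty, so no child is ever stuck waiting to flush its buffer.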

NorthCat