I had code that ran successfully but took too long, so I decided to try to parallelize it.

Here is a simplified version of the code:

import multiprocessing as mp
import os
import sys
import time

import numpy as np


output = mp.Queue()

def calcSum(Nstart,Nstop,output):
    pid = os.getpid()
    density = 0  # running total; must be initialized before the += below

    for s in range(Nstart, Nstop + 1):  # Nstop is inclusive, as in the output below
        file_name = 'model' + str(s) + '.pdb'

        file = 'modelMap' + str(pid) + '.dat'

        #does something with the contents of the pdb file
        #creates another file by using some other library:
        someVar.someFunc(file_name=file)

        #uses a function to read the file
        density += readFile(file)

        os.remove(file)

        print pid,s

    output.put(density)

if __name__ == '__main__':
    t0 = time.time()  # start time for the elapsed-time report at the end
    snapshots = int(sys.argv[1])
    cpuNum = int(sys.argv[2])

    rangeSet = np.zeros((cpuNum)) + snapshots//cpuNum
    for i in range(snapshots%cpuNum):
        rangeSet[i] +=1

    processes = []
    for c in range(cpuNum):
        na,nb = (np.sum(rangeSet[:c])+1, np.sum(rangeSet[:c+1]))
        processes.append(mp.Process(target=calcSum,args=(int(na),int(nb),output)))

    for p in processes:
        p.start()

    print 'now i''m here' 

    results = [output.get() for p in processes]

    print 'now i''m there' 

    for p in processes:
        p.join()

    print 'think i''l stay around'
    t1 = time.time()
    print len(results)
    print (t1-t0)

I run this code with the command python run.py 10 4.
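
For reference, here is what the splitting logic in the main block computes for these arguments. This is a small standalone sketch in Python 3 syntax that uses plain integers instead of NumPy; the helper name split_ranges is only for illustration and is not part of my script. It returns the (na, nb) pair passed to each calcSum process.

```python
def split_ranges(snapshots, cpu_num):
    """Return the (na, nb) pair for each worker, mirroring the rangeSet logic."""
    base, extra = divmod(snapshots, cpu_num)
    # The first `extra` workers get one additional snapshot each.
    sizes = [base + 1 if i < extra else base for i in range(cpu_num)]
    pairs = []
    for c in range(cpu_num):
        na = sum(sizes[:c]) + 1      # same as np.sum(rangeSet[:c]) + 1
        nb = sum(sizes[:c + 1])      # same as np.sum(rangeSet[:c+1])
        pairs.append((na, nb))
    return pairs


if __name__ == '__main__':
    print(split_ranges(10, 4))  # [(1, 3), (4, 6), (7, 8), (9, 10)]
```

So for 10 snapshots on 4 processes, the first two processes are handed three snapshots each and the last two are handed two each.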

The code prints pid and s successfully from the outer loop in calcSum, and I can see two CPUs at 100% while it runs. Eventually pid 5 and pid 10 are printed, the CPU usage drops to zero, and then nothing happens. None of the subsequent print statements execute, and the script still appears to be running in the terminal. My guess is that the processes never exit. Is that the case, and how can I fix it?

Here's the complete output:

$ python run.py 10 4
now im here
9600
9601
9602
9603
9602 7
9603 9
9601 4
9600 1
now im there
9602 8
9600 2
9601 5
9603 10
9600 3
9601 6

At that point I have to terminate the script with Ctrl+C.

A few other notes:

  • if I comment os.remove(file) out, I can see the created files in the directory
  • unfortunately, I cannot bypass the part in which a file is created and then read, within calcSum

EDIT: At first, swapping output.get() and p.join() worked, but after some other edits to the code it no longer does. I have updated the code above.
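
To make the ordering in question concrete, here is a stripped-down sketch in Python 3 syntax. The worker function is a hypothetical stand-in for calcSum that only sums integers, and run is an illustrative helper; neither touches files or the other library. It drains the queue with one get() per process and joins only afterwards, which is the order the multiprocessing documentation recommends, since joining a process that still has items buffered for a queue can deadlock.

```python
import multiprocessing as mp


def worker(start, stop, output):
    # Hypothetical stand-in for calcSum: sum the integers in [start, stop).
    output.put(sum(range(start, stop)))


def run(chunks):
    output = mp.Queue()
    processes = [mp.Process(target=worker, args=(a, b, output))
                 for a, b in chunks]
    for p in processes:
        p.start()
    # Drain the queue first: exactly one get() per started process.
    results = [output.get() for _ in processes]
    # Join only after every result has been retrieved.
    for p in processes:
        p.join()
    return results


if __name__ == '__main__':
    # Results arrive in completion order, so sort before printing.
    print(sorted(run([(1, 4), (4, 7), (7, 9), (9, 11)])))
```

This toy version only shows the ordering; it does not reproduce the file creation and reading that happens inside my real calcSum.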

sodiumnitrate