
I have some simple code from a tutorial like this:

from multiprocessing import Process, Lock
import os

def f(i):
    print 'hello world', i
    print 'parent process:', os.getppid()
    print 'process id:', os.getpid(), "\n\n"

if __name__ == '__main__':
    lock = Lock()

    for num in range(10):
        p = Process(target=f, args=(num,))
        p.start()
    p.join()

How can I tell if this is utilising both of my cores? Currently I'm running Ubuntu 11.04 w/ 3 GB RAM and Intel Core 2 Duo @ 2.2GHz.

The project I'm learning this for is going to be moved to a huge machine in somebody's office, with much more horsepower than I currently have at my disposal. Specifically, the processor will have at least 4 cores, and I want my algorithm to automatically detect and utilise all available cores. Also, that system will potentially be something other than Linux, so are there any common pitfalls I have to watch for when moving the multiprocessing module between operating systems?

Oh yeah, also, the output of the script looks something like this:

hello world 0
parent process: 29362
process id: 29363 


hello world 1
parent process: 29362
process id: 29364 


hello world 2
parent process: 29362
process id: 29365 

and so on...

So from what I know so far, the PPIDs are all the same because the script, when run, is the parent process that spawns the child processes, each of which is a separate process. So does multiprocessing automatically detect and use multiple cores, or do I have to tell it where to look? Also, from what I read while searching for a copy of this question, I shouldn't spawn more processes than there are cores, because the extra processes eat up system resources that would otherwise be used for the computations.

Thanks in advance for your help, my thesis loves you.

user1173922

3 Answers


Here's a handy little command I use to monitor my cores from the command line:

watch -d "mpstat -P ALL 1 1 | head -n 12"

Note that the mpstat command must be available on your system, which you can get on Ubuntu by installing the sysstat package.

sudo apt-get install sysstat

If you want to detect the number of available cores from Python, you can do so with the multiprocessing.cpu_count() function. On Intel CPUs with Hyper-Threading, this number will be double the number of physical cores. Launching as many processes as you have available cores will usually scale to fully occupy all the cores on your machine, as long as the processes have enough work to do and don't get bogged down with communication. Linux's process scheduler will take it from there.
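
For example, here is a minimal sketch (not from this answer, just an illustration) that asks multiprocessing how many cores are available and starts one worker per core:

from multiprocessing import Process, cpu_count
import os

def work(i):
    # stand-in for a CPU-bound task
    print 'worker', i, 'pid', os.getpid()

if __name__ == '__main__':
    n = cpu_count()  # may report logical (hyper-threaded) cores rather than physical ones
    workers = [Process(target=work, args=(i,)) for i in range(n)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()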

Brendan Wood

A few things about your code sample. You create a lock but never actually use it. And you only join on the last process you started. Right now the processes probably finish so quickly that you won't see an issue, but if any of the earlier processes took longer than the last one, the parent could carry on before they are done.
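
As a rough sketch (keeping the same idea as the question's f), you could collect the Process objects in a list and join every one after they have all been started:

from multiprocessing import Process
import os

def f(i):
    # same idea as the worker in the question
    print 'hello world', i, 'pid:', os.getpid()

if __name__ == '__main__':
    processes = []
    for num in range(10):
        p = Process(target=f, args=(num,))
        p.start()
        processes.append(p)
    # join every child, not just the last one created
    for p in processes:
        p.join()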

Regarding making sure each process ends up on a different core: unfortunately, you can't. That decision is made by the operating system's scheduler. You are simply writing code that uses multiple processes so that the system can schedule them in parallel; some may end up on the same core.

As for pitfalls: it may be that your actual code doesn't really require multiple processes and would benefit more from threading. Also, you have to be very careful about how you share memory with multiprocessing. There is a lot more overhead involved in interprocess communication than in inter-thread communication, so it's usually reserved for cases where threading simply will not get you what you need.

jdi
  • Hmmm... received wisdom in the Python world is generally that multiprocessing is to be preferred to threading (at least for "real" stuff less trivial than the OP's code) due to the GIL lock. (Although on looking for something to cite as evidence, and finding http://stackoverflow.com/questions/1289813/python-multiprocessing-vs-threading-for-cpu-bound-work-on-windows-and-linux , I see it's not quite that simple!) – timday Apr 05 '12 at 23:42
  • @timday: I believe multiprocessing is preferred if your application truly is heavily CPU bound, but like you found out, it also varies between platforms. Also, my personal opinion is that some newer Python programmers want to go right for the multiprocessing module because it's the buzzword, and never really attempt to use threading. Multiprocessing isn't a generally preferred approach over threading in every case. It's only applicable... when applicable. – jdi Apr 05 '12 at 23:47
  • Oh yeah, the lock is there because I was editing the code to see the difference between locking (which the original example did) and joining (which I put in there). I can see that I'll need some more work on this (and hopefully some advice), but that's the reasoning behind why it's there. About the joining you mentioned in P.1: will it cause trouble if I indent join by one tab so it's in line with the process.start()? Will join automatically not be called until the process is done? And finally, about multiprocessing vs. threading: I started reading the threading tuts, but – user1173922 Apr 05 '12 at 23:53
  • they said that almost always in Python users should be using multiprocessing. I don't really understand when one vs. the other is applicable. Any hints? – user1173922 Apr 05 '12 at 23:54
  • @user1173922: The locking from your example was initially there to protect the processes from writing to stdout at the same time, not actually to keep your app running until they are all done. If you were to indent that join, it would block once after each process starts, until it finishes, meaning they would run one after the next, not in parallel. Like I was saying above, mp is beneficial when your functions are heavy on CPU instead of IO, meaning they are constantly crunching numbers and never wait on the filesystem, network, etc. – jdi Apr 06 '12 at 00:02
  • OK, cool, what I have will almost certainly spend almost all of its time crunching numbers, and only a little time at the end of the algorithm writing log files. So the difference between locking and joining is that locking is only really useful if you're printing stuff to stdout, and joining is necessary if you want to wait on continuing the algorithm until all of the child processes have finished, correct? – user1173922 Apr 06 '12 at 00:18
  • Oh, from another tutorial I'm reading it appears that there is a need to call .join() on each process which is spawned. Is this accurate? So calling .join() on an individual process only waits for that process? Does that mean my code will mess up because I'm naming each process 'p'? – user1173922 Apr 06 '12 at 00:23
  • @user1173922: No, locking isn't specifically for stdout. That's just what your original example used it for. It's used to synchronize access to any resource. For joining, you would have to first start all your processes, and then go through the process list and join on them. That will force your script to wait until every process has completed. (A sketch of both ideas follows these comments.) – jdi Apr 06 '12 at 00:27
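
As a minimal sketch of what jdi describes above (not the original tutorial code; the worker below is just an illustration), the lock is passed to each child and held only while printing, and the parent joins every child it started:

from multiprocessing import Process, Lock
import os

def f(lock, i):
    # hold the lock only while writing, so output from different children doesn't interleave
    with lock:
        print 'hello world', i
        print 'process id:', os.getpid()

if __name__ == '__main__':
    lock = Lock()
    procs = [Process(target=f, args=(lock, num)) for num in range(10)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()  # wait for every child before the parent exits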

If you are on a Unix system, you could try running the 'top' command and looking at how many of your processes are showing up concurrently. Although it is somewhat empirical, many times just looking at the process list will let you see the multiple processes.

Although, looking at your script, I don't see where you are calling multiple processes. You can import multiprocessing.Pool and then map your function to different processors.
http://docs.python.org/library/multiprocessing.html
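
A rough sketch of that Pool approach (the function name square is illustrative, not from the question):

from multiprocessing import Pool

def square(x):
    return x * x  # stand-in for a CPU-bound computation

if __name__ == '__main__':
    pool = Pool()                          # defaults to cpu_count() worker processes
    results = pool.map(square, range(10))  # distributes the calls across the workers
    pool.close()
    pool.join()
    print results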

user574435
  • Thanks, that's the link the code comes from. Something strange though, running 'top' in terminal tells me this: Mem: ~3GB total, ~2.8 GB used. But when I tally the memory used by processes listed, I'm only using like 20% of available memory (I like to keep about 2 dozen tabs running in FF @ any single time). Is top not showing some stuff or something? Do I have a memory leak somewhere (didn't think this was really possible in Python b/c automatic garbage collection)? – user1173922 Apr 05 '12 at 23:28
  • Also, using multiprocessing.map does not run the functions on different processors. It only starts them in different processes. – jdi Apr 05 '12 at 23:30
  • FWIW, if you hit '1' in top, it toggles between a view showing you CPU totals for each core (YMMV, but personally I find it more satisfying to see 8 cores cranking at 99% CPU each than the default view showing 693% CPU usage). – timday Apr 05 '12 at 23:31
  • Oh yeah, also, the system monitor tells me I'm using about 35% of memory, but that's still a big gap from what top tells me. Does top list only the memory that something's currently accessing, leaving things stored in memory which aren't being accessed alone? – user1173922 Apr 05 '12 at 23:32
  • Excellent timday, thanks so much, wish I could upvote comments (oh wait, I CAN!). Still hoping for an answer on the utilization of cores tho – user1173922 Apr 05 '12 at 23:36
  • Just trust your OS to schedule processes. Use top to confirm full utilisation (if it's not full, you're probably IO-bound, not CPU-bound). Better to create too many processes rather than too few; Brendan's answer nails it. – timday Apr 05 '12 at 23:47
  • @timday: I think there is a point where too many processes can have a negative impact on performance, when the overhead starts to outweigh the parallelism. There is always a sweet spot, I think, that depends on what your actual work is doing. – jdi Apr 05 '12 at 23:52