
I'm trying out a code snippet from the standard Python documentation to learn how to use the multiprocessing module. The code is pasted at the end of this message. I'm using Python 2.7.1 on Ubuntu 11.04 on a quad-core machine (which, according to the system monitor, gives me eight cores due to hyper-threading).

Problem: All of the workload seems to be scheduled to just one core, which gets close to 100% utilization, despite the fact that several processes are started. Occasionally all of the workload migrates to another core, but it is never distributed among the cores.

Any ideas why this is so?

Best regards,

Paul

#
# Simple example which uses a pool of workers to carry out some tasks.
#
# Notice that the results will probably not come out of the output
# queue in the same order as the corresponding tasks were
# put on the input queue.  If it is important to get the results back
# in the original order then consider using `Pool.map()` or
# `Pool.imap()` (which will save on the amount of code needed anyway).
#
# Copyright (c) 2006-2008, R Oudkerk
# All rights reserved.
#

import time
import random

from multiprocessing import Process, Queue, current_process, freeze_support

#
# Function run by worker processes
#

def worker(input, output):
    for func, args in iter(input.get, 'STOP'):
        result = calculate(func, args)
        output.put(result)

#
# Function used to calculate result
#

def calculate(func, args):
    result = func(*args)
    return '%s says that %s%s = %s' % \
        (current_process().name, func.__name__, args, result)

#
# Functions referenced by tasks
#

def mul(a, b):
    time.sleep(0.5*random.random())
    return a * b

def plus(a, b):
    time.sleep(0.5*random.random())
    return a + b


def test():
    NUMBER_OF_PROCESSES = 4
    TASKS1 = [(mul, (i, 7)) for i in range(500)]
    TASKS2 = [(plus, (i, 8)) for i in range(250)]

    # Create queues
    task_queue = Queue()
    done_queue = Queue()

    # Submit tasks
    for task in TASKS1:
        task_queue.put(task)

    # Start worker processes
    for i in range(NUMBER_OF_PROCESSES):
        Process(target=worker, args=(task_queue, done_queue)).start()

    # Get and print results
    print 'Unordered results:'
    for i in range(len(TASKS1)):
        print '\t', done_queue.get()

    # Add more tasks using `put()`
    for task in TASKS2:
        task_queue.put(task)

    # Get and print some more results
    for i in range(len(TASKS2)):
        print '\t', done_queue.get()

    # Tell child processes to stop
    for i in range(NUMBER_OF_PROCESSES):
        task_queue.put('STOP')

if __name__ == '__main__':
    freeze_support()
    test()
Paul
    This post might be of assistance to you, http://stackoverflow.com/questions/5784389/using-100-of-all-cores-with-python-multiprocessing – Devraj Aug 01 '11 at 22:34
  • Copied and pasted your code; it maxed out an Intel Pentium D 3.4 GHz on both processor screens. – TelsaBoil Aug 01 '11 at 22:41
    With those sleeps() in there, this will not generate much CPU load at all. – nos Aug 01 '11 at 23:06
  • True. I've removed the sleeps and increased the number of jobs to 10000, but the workload is still never distributed among the cores. If I start 4 processes I get three sleeping processes and one fully utilized. Thanks for helping out, guys, but I couldn't get much out of Devraj's link. I understand that starting up 4 processes is no guarantee that they will be divided among the cores, but the cause of this polarized behavior, where all processes except one are sleeping, is not clear to me. – Paul Aug 01 '11 at 23:23
  • Still, with 10000 jobs, isn't the work here done pretty quickly? When I do some measuring on this program (be sure to remove all your loops that print something, you do **not** want to measure printing to the screen...), it spends an awful lot of time doing the list comprehension and stuffing things onto the task_queue. That takes 100% CPU of one processor for a while. When you actually start the workers, things start to use the other processors. But with only 10000 items that will be done in just a second or two; try 200000. – nos Aug 02 '11 at 08:56
  • @Paul: first of all, create a new Python script and design a function that runs on a single thread/process, that takes 100% CPU when running, and that takes a few seconds to complete. After that, use this function with `multiprocessing`. – Andrea Corbellini Feb 04 '15 at 17:33
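Following nos's and Andrea's suggestions above, a minimal sketch of that kind of experiment (the burn function and the work sizes are purely illustrative, not part of the original post): time a pure-Python, CPU-bound function run serially and then through a multiprocessing.Pool, and watch the system monitor during the second run.

import time
from multiprocessing import Pool, cpu_count

def burn(n):
    # Pure-Python CPU-bound loop: keeps one core at 100% for a while
    total = 0
    for i in xrange(n):
        total += i * i
    return total

if __name__ == '__main__':
    work = [2 * 10 ** 7] * cpu_count()

    start = time.time()
    results = map(burn, work)              # serial baseline, single core
    print 'serial:   %.1f s' % (time.time() - start)

    start = time.time()
    pool = Pool(processes=cpu_count())
    results = pool.map(burn, work)         # one task per core
    pool.close()
    pool.join()
    print 'parallel: %.1f s' % (time.time() - start)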

4 Answers


Try replacing the time.sleep with something that actually requires CPU time and you will see that multiprocessing works just fine! For example:

def mul(a, b):
    for i in xrange(100000):
        j = i**2
    return a * b

def plus(a, b):
    for i in xrange(100000):
        j = i**2
    return a + b
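
In other words, with the original sleep() calls each worker spends almost all of its time blocked rather than executing Python code, so there is barely any CPU work to spread around; once the tasks are CPU-bound like this, the scheduler has a reason to run the four workers on separate cores.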
PierreBdR

Somehow the CPU affinity had been changed. I had this problem with numpy before. I found the solution here: http://bugs.python.org/issue17038#msg180663
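
For reference, the workaround described in that report is to reset the process's CPU affinity after the import that restricts it. A minimal, Linux-only sketch (assuming numpy is the culprit, as in the linked report, and that a 0xff mask covers your logical cores):

import os

import numpy  # in the linked report, importing numpy (via certain BLAS builds) pins the process to one core

# Reset the affinity so the process may run on CPUs 0-7 again; adjust the mask to your machine
os.system("taskset -p 0xff %d" % os.getpid())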

iampat

multiprocessing does not guarantee that you'll use all the cores of a processor; you just get multiple processes, not multi-core processes. How they are spread across cores is handled by the OS and is not deterministic. The question @Devraj posted in the comments has answers to accomplish what you desire.
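
For completeness, those answers boil down to handing CPU-bound tasks to a Pool sized to the number of cores; a minimal sketch along those lines (the busy function and task sizes are just an illustration):

from multiprocessing import Pool, cpu_count

def busy(n):
    # CPU-bound work so each task actually occupies a core
    return sum(i * i for i in xrange(n))

if __name__ == '__main__':
    pool = Pool(processes=cpu_count())          # one worker per logical core
    results = pool.map(busy, [10 ** 7] * 16)    # enough tasks to keep every worker fed
    pool.close()
    pool.join()
    print 'computed %d results' % len(results)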

BrainStorm

I have found a workaround using Parallel Python. I know it is not a solution that uses only the standard Python libraries, but the code is simple and works like a charm.
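
For context, a minimal sketch of what that looks like with the pp module, based on the examples on the page linked in the comment below and assuming the pp package from parallelpython.com is installed (sum_of_squares is just an illustrative job function):

import pp

def sum_of_squares(n):
    # CPU-bound job so that each submitted task keeps a core busy
    return sum(i * i for i in xrange(n))

job_server = pp.Server()   # autodetects the number of cores and starts one worker per core
jobs = [job_server.submit(sum_of_squares, (10 ** 6,)) for _ in range(8)]
for job in jobs:
    print job()            # calling the job object waits for and returns its result
job_server.print_stats()   # shows how the jobs were distributed over the workers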

Gokool
  • I found some good examples of how to set up a SMP machine here http://www.parallelpython.com/content/view/17/31/ – Gokool Jan 24 '14 at 00:47