I am trying, and failing, to run a large loop in parallel. The loop is a method of a class, and inside the loop I call another method of the same class. The code runs, but for some reason there is only one process in the list and the output (see code) is always 'Worker 0'. Either the processes are not being created or they are not running in parallel. The structure is as follows:

main.py

from my_class import MyClass

def main():
    class_object = MyClass()
    class_object.method()

if __name__ == '__main__':
    main()

my_class.py

from multiprocessing import Process

class MyClass(object):
    def __init__(self):
        pass  # do something

    def _method(self, worker_num, n_workers, amount, job, data):
        for i, val in enumerate(job):
            print('Worker %d' % worker_num)
            self.another_method(val, data)

    def another_method(self, val, data):
        pass  # do something to the data

    def method(self):
        # definitions of data, job and job_size go here

        n_workers = 16
        chunk = job_size // n_workers
        resid = job_size - chunk * n_workers

        workers = []
        for worker_num in range(n_workers):
            st = worker_num * chunk
            amount = chunk if worker_num != n_workers - 1 else chunk + resid
            worker = Process(target=self._method, args=[worker_num, n_workers, amount, job[st:st+amount], data])
            worker.start()
            workers.append(worker)

        for worker in workers:
            worker.join()

        return data

I have read some things about child processes requiring main module to be importable, but I have no idea how to do it in my case.

stovfl
ne3x7
  • Apart from the missing `job_size`, `job` and `data`, I ran your example and it worked as expected. The output was _Worker 0_ up to _Worker 15_. It seems you are reinventing the wheel; are you aware of [Process Pools](https://docs.python.org/3/library/multiprocessing.html#module-multiprocessing.pool)? – stovfl Aug 24 '17 at 12:52
  • @stovfl yes, it does work sequentially (though it probably should mess up the order), but I want to use it in parallel on multiple cores to speed up computation. I was not really aware of Process Pools, thank you. – ne3x7 Aug 24 '17 at 13:58
  • I don't think that your creation of the workers is wrong. As you say, one worker is created and running, so your class method is correctly transferred to the worker. However, I usually create all the workers before I start them: one small loop creates the workers and a second loop starts them. – RaJa Aug 24 '17 at 14:59
  • @stovfl First, thanks for your help, I tried printing pids and they actually are different, but still only one core is in use. So the question is, can I use multiple cores with Process objects or do I need to use Pools? – ne3x7 Aug 24 '17 at 19:38

1 Answer

Question: ... but still only one core is in use. So the question is, can I use multiple cores with Process objects

Which process runs on which CPU core is not decided by the Python interpreter; the operating system's scheduler assigns it.
Relevant: on-what-cpu-cores-are-my-python-processes-running
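
One quick check is whether the operating system allows the process to use more than one core at all, by looking at its CPU affinity. This is only a sketch and assumes Linux, where os.sched_getaffinity is available; the helper name allowed_cores is made up for illustration.

```python
import os

def allowed_cores(pid=0):
    """Return the sorted CPU indices a process may be scheduled on (pid 0 = caller)."""
    # os.sched_getaffinity is only available on some platforms (e.g. Linux).
    if not hasattr(os, 'sched_getaffinity'):
        return None
    return sorted(os.sched_getaffinity(pid))

if __name__ == '__main__':
    print('CPUs reported by the OS:', os.cpu_count())
    print('CPUs this process may run on:', allowed_cores())
```

If the second line lists a single core, forked child processes inherit that restriction, so all workers end up on the same core no matter how they are created.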

Extend your def _method(... with the following, to see what actually happens:

Note: getpidcore(pid) is distribution dependent and could FAIL!

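import os

def getpidcore(pid):
    # /proc/<pid>/stat records the CPU the process last ran on; the
    # negative index assumes the field layout of the running kernel.
    with open('/proc/{}/stat'.format(pid), 'rb') as fh:
        core = int(fh.read().split()[-14])
    return core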

class MyClass(object): 
    ...
    def _method(self, worker_num, n_workers, amount, job, data):
        for i, val in enumerate(job):
            core = getpidcore(os.getpid())
            print('core:{} pid:{} Worker({})'.format(core, os.getpid(), (worker_num, n_workers, amount, job)))

Output:

core:1 pid:7623 Worker((0, 16, 1, [1]))
core:1 pid:7625 Worker((2, 16, 1, [3]))
core:0 pid:7624 Worker((1, 16, 1, [2]))
core:1 pid:7626 Worker((3, 16, 1, [4]))
core:1 pid:7628 Worker((5, 16, 1, [6]))
core:0 pid:7627 Worker((4, 16, 1, [5]))

Tested with Python: 3.4.2 on Linux
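
For comparison, the same chunking can be written with a process pool, as suggested in the comments. This is only a sketch under assumptions: split_job and process_chunk are hypothetical stand-ins for the chunking logic and for another_method, and squaring is a placeholder workload. Each worker operates on a copy of its arguments, so the results are returned to the parent rather than written into a shared data object.

```python
from multiprocessing import Pool

def split_job(job, n_workers):
    """Split job into n_workers slices; the last slice takes the remainder."""
    chunk = len(job) // n_workers
    slices = [job[i * chunk:(i + 1) * chunk] for i in range(n_workers - 1)]
    slices.append(job[(n_workers - 1) * chunk:])
    return slices

def process_chunk(chunk):
    """Placeholder for the real per-item work (here: squaring each value)."""
    return [val * val for val in chunk]

def run_parallel(job, n_workers=4):
    # Children receive copies of their arguments, so collect return values
    # instead of mutating an object that lives in the parent process.
    with Pool(processes=n_workers) as pool:
        results = pool.map(process_chunk, split_job(job, n_workers))
    # Flatten the per-worker result lists back into one list, in order.
    return [item for part in results for item in part]

if __name__ == '__main__':
    print(run_parallel(list(range(10))))
```

pool.map returns the per-chunk results in the order the chunks were submitted, so the flattened list lines up with the original job.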

stovfl
  • I wasn't able to reproduce such behaviour in my case and gave up in favour of mpi4py. Thanks anyway. – ne3x7 Aug 25 '17 at 18:47