
I have a very large python program that takes a few hours to complete, running on one core.

Is there any way that I can split this between all 8 i7 cores on my pc?

The only problem is that each step of the task is dependent on the previous calculation. For example, here is a (very) simplified version of my code:

def code(nums):
    num = 1
    for loop in range(nums):
        num += num * loop
    return num

By then running code() with

counted = code(100000)

Watching Task Manager, I can see the CPU usage sits at around 12%, showing it is only using one core.

Is there a way to make this code run over multiple cores?

Note: The main difficulty I have had is that each step is dependent on the previous result, so I haven't been able to split the code into multiple parts.

Edit:

I am using Python3 on Windows 10

Edit 2:

Please, if possible, can you write the solution for if the code above was my complete code?

Rlz
    Based on the example function you have, you can split up the calculation from 0 to N/2, and from N/2 + 1 to N, (if you want to use 2 processors). Then sum up at the end. There is a multiprocessing package in Python that can get it to run on two cores. For 8 cores, simply split up the calculation into 1/8th of the full chunk. – Spinor8 Aug 31 '17 at 08:10
  • @Spinor8 Is separating it like that the only way to make it run on multiple cores? I'm only asking because it will be **very** difficult to split my program. It's very complicated – Rlz Aug 31 '17 at 08:13
  • 2
    Unfortunately, you're asking for a specific multiprocessing solution for a specific program, but only providing an abstract general example. It is entirely possible that a rewrite could remove some of the interdependence within the program, but without a more detailed look at the code there isn't a lot we can help with. – Jake Conkerton-Darby Aug 31 '17 at 08:17
  • @JakeConkerton-Darby How about if I put it like this: Is there a way to make any program run with multiple cores using the same method? e.g. If I ran it in IPython it would do it with all 8 cores (Something like that, I don't think that works haha) – Rlz Aug 31 '17 at 08:18
  • Not by default, no. The only way (as far as I'm aware) to get Python to run across multiple cores is to break the program down into sub-processes using the `multiprocessing` module. This will allow the parts that you split your program into to each run independently and utilise the multiple cores. – Jake Conkerton-Darby Aug 31 '17 at 08:20
  • @JakeConkerton-Darby So if that's the only way, if in a few minutes no one has found a solution can you please post this as the answer and say that's the only way – Rlz Aug 31 '17 at 08:22
  • Yeah, I agree with @Jake. Without the specifics, it's hard to tell. There are packages such as numba that can speed up your process without rewriting the function. But if you want to fully utilize the 8 cores, you will probably need to split up the function. Sometimes it can involve rewriting the whole calculation. – Spinor8 Aug 31 '17 at 08:23
  • @RulerOfTheWorld That's probably not worth an answer...The problem here is a bit too broad. Yes, in Python, the only way to get CPU-intensive operations to execute on many cores is multiprocessing, and the easiest way to achieve it is to make all your processes run the same operations, each on its own part of the common data. – Right leg Aug 31 '17 at 08:24
  • @Rightleg So lets say that was my complete code. Can you guys write an answer for that? – Rlz Aug 31 '17 at 08:25
  • As to your most recent comment: please read through https://docs.python.org/3.6/library/multiprocessing.html and see if you'd still be stuck creating a solution for your current example code. –  Aug 31 '17 at 08:30
  • @RulerOfTheWorld I've added a full reply which includes a solution for parallelising your proposed code, as well as a step by step (roughly) of the thought process regarding how I parallelised it. – Jake Conkerton-Darby Aug 31 '17 at 10:11
  • @RulerOfTheWorld Have you seen my answer below, and does it help? – Jake Conkerton-Darby Aug 31 '17 at 15:41

2 Answers


Multiprocessing in Python

CPython, the most common implementation of the Python interpreter, has no built-in way to run Python code in parallel across cores without the multiprocessing module. This is because of the Global Interpreter Lock (GIL), which was introduced because CPython's memory management is not thread-safe; as a consequence, even the threading module cannot execute Python bytecode truly in parallel. The multiprocessing module, on the other hand, starts entirely new instances of the Python interpreter, and since these no longer share a GIL, they can truly run in parallel across multiple cores.
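To make the distinction concrete, here is a minimal sketch (not your program; the function name `cpu_task` is made up for illustration) of handing a CPU-bound function to a pool of worker processes, each of which is a separate interpreter with its own GIL:

```python
from multiprocessing import Pool

def cpu_task(n):
    # Deliberately CPU-heavy: sum of squares up to n.
    return sum(i * i for i in range(n))

if __name__ == '__main__':
    inputs = [2_000_000] * 4
    # Serial: one core works through the tasks one at a time.
    serial = [cpu_task(n) for n in inputs]
    # Parallel: each worker process runs on its own core.
    with Pool(processes=4) as pool:
        parallel = pool.map(cpu_task, inputs)
    # Same results, but the parallel version can use several cores at once.
    assert serial == parallel
```

The `if __name__ == '__main__'` guard matters on Windows, where worker processes re-import the main module.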

Therefore, if you wish to run a Python program that takes advantage of multiple cores, you must break your program up manually into sub-tasks that can each run in a separate instance of the interpreter. Be careful, though: the overhead of spawning processes and passing data between them may cost more than is gained by running across cores. This is very project-dependent, so you should measure it for yourself.
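One way to check that trade-off for your own workload is to time a pool on many tiny tasks versus a few large ones; this is only an illustrative sketch (the task functions and sizes here are assumptions, not your code):

```python
import time
from multiprocessing import Pool

def tiny_task(n):
    # Almost no work per call: overhead will dominate.
    return n * n

def big_task(n):
    # Substantial work per call: overhead becomes negligible.
    return sum(i * i for i in range(n))

def timed(fn, *args):
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start

if __name__ == '__main__':
    with Pool(processes=4) as pool:
        # Many tiny tasks: process startup and IPC can outweigh the work.
        t_tiny = timed(pool.map, tiny_task, range(10_000))
        # Few big tasks: the computation dwarfs the overhead.
        t_big = timed(pool.map, big_task, [2_000_000] * 4)
    print(f"tiny tasks: {t_tiny:.3f}s, big tasks: {t_big:.3f}s")
```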

Sources:

Multicore Python: A tough, worthy, and reachable goal

Multi-Core and Distributed Programming in Python

Multiprocessing vs Threading Python

Possible Solution to your problem

Now bear with me for a while here, as there are a couple of mathematical steps, but I do believe that there is a way for your code to be parallelised.

First take example outputs for different values of nums, to see what the output of the code would be:

nums | 1 | 2 | 3 |  4 |   5 |   6 |
-----------------------------------
out  | 1 | 2 | 6 | 24 | 120 | 720 |

Now, due to how your code is set up, you can see that code(nums) = code(nums - 1) + code(nums - 1) * (nums - 1), so if we can find a way to get from code(nums - 1) to code(nums) without actually needing the value of code(nums - 1), then we can parallelise the code.
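You can verify that recurrence directly against the original function from the question:

```python
# The original function from the question.
def code(nums):
    num = 1
    for loop in range(nums):
        num += num * loop
    return num

# Check code(nums) == code(nums - 1) + code(nums - 1) * (nums - 1)
# for a range of small inputs.
for nums in range(2, 10):
    prev = code(nums - 1)
    assert code(nums) == prev + prev * (nums - 1)
```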

If we now consider the difference of going from code(nums - 1) to code(nums) in the above table we get the following table of their differences:

nums | 1 | 2 | 3 |  4 |  5 |   6 |
-----------------------------------
diff |   | 1 | 4 | 18 | 96 | 600 |

Now these differences might not seem to form a pattern, but with a bit of a logical leap you can see that the difference between code(nums) and code(nums - 1) is (nums - 1) * (nums - 1)! (where ! is the mathematical factorial function). Thus your code method can equivalently be written as:

from math import factorial
def code(nums):
    num = 1
    for i in range(1, nums):
        num += i * factorial(i)
    return num
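As a sanity check, the two forms agree, and both reduce to a plain factorial (since the sum of i * i! for i = 1..n telescopes to (n + 1)! - 1):

```python
from math import factorial

# Original loop from the question.
def code_original(nums):
    num = 1
    for loop in range(nums):
        num += num * loop
    return num

# Rewritten form: each term depends only on its own index i,
# not on the previous result, which is what makes it parallelisable.
def code_rewritten(nums):
    num = 1
    for i in range(1, nums):
        num += i * factorial(i)
    return num

for n in range(1, 12):
    assert code_original(n) == code_rewritten(n) == factorial(n)
```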

Now we are at a point from which we can parallelise your code. We'll use the multiprocessing module with a pool of 8 processes (to match your 8 cores), map a section of the work onto each core, let each compute its partial sum, total up the results, and finally add 1 to account for the initial value of num.

from math import factorial
from multiprocessing import Pool, freeze_support

NUM_CORES = 8

def foo(lower, upper):
    # Partial sum of i * i! over the half-open range [lower, upper).
    total = 0
    for i in range(lower, upper):
        total += i * factorial(i)
    return total

def code(nums):
    # Build the list of arguments for the workers so that each gets
    # roughly 1/NUM_CORES of the work.
    a, b = divmod(nums, NUM_CORES)
    arguments = [(1, a)]
    for c in range(2, NUM_CORES):
        arguments.append(((c - 1) * a, c * a))
    arguments.append(((NUM_CORES - 1) * a, NUM_CORES * a + b))

    print(arguments)

    with Pool(processes=NUM_CORES) as pool:
        results = pool.starmap(foo, arguments)

    print(1 + sum(results))

if __name__ == '__main__':
    freeze_support()
    code(100)

The __main__ guard at the bottom is required on Windows so that the worker processes, which re-import the main module, do not re-execute code(100) themselves; the freeze_support call is only strictly needed if you later freeze the script into a Windows executable, and is harmless otherwise.
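If you want to convince yourself that the argument-building logic is correct, you can check that the chunks handed to the workers exactly cover range(1, nums) with no gaps or overlaps (this mirrors the logic above, extracted into a standalone helper for testing):

```python
NUM_CORES = 8

def build_arguments(nums):
    # Same chunking as in code() above: [1, a), [a, 2a), ..., [7a, 8a + b).
    a, b = divmod(nums, NUM_CORES)
    arguments = [(1, a)]
    for c in range(2, NUM_CORES):
        arguments.append(((c - 1) * a, c * a))
    arguments.append(((NUM_CORES - 1) * a, NUM_CORES * a + b))
    return arguments

# Every i in 1..nums-1 should appear exactly once across the chunks.
covered = []
for lower, upper in build_arguments(100):
    covered.extend(range(lower, upper))
assert covered == list(range(1, 100))
```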

  • Wow, amazing answer. Very in depth with an example too! Wow! I do have one question, why do you change the code to using factorial, can you not break down the original code and implement it directly into multiprocessing? – Rlz Aug 31 '17 at 16:10
  • Directly, no. As you surmised, your code is directly dependent upon the previous result for its next result, and if that is truly the case it may be that your code cannot be parallelised. What I showed is that the specific example you gave can be parallelised by re-analysing the function and finding that it need not depend on the previous result for its current one. You'll have to do a similar analysis of your actual program to determine if and where it can be broken down into sub-processes, but I hope I demonstrated that even code which appears purely serial can be parallelised. – Jake Conkerton-Darby Aug 31 '17 at 16:15

First, take a pen and paper and write down the calculation process you have in mind. Is there any step that can be parallelised?

If not, it is impossible to use multiple cores.

If yes, carefully check what you need in order to parallelise those steps. Do all the parallelisable steps depend on the same single step, or do they depend on different steps?

If yes... If not...

In short, before you can parallelise your calculation, you first need to build a topological graph of the calculation, in order to find which parts can be parallelised, what they need, and how to do it.
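The topological-graph idea above can be sketched with the standard-library graphlib module (Python 3.9+). The calculation and step names here are a made-up example: steps that become ready in the same batch have no dependencies on each other, so each batch could run in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical calculation: c and d both depend only on a and b,
# so they can run at the same time; e must wait for both.
deps = {
    'a': set(), 'b': set(),
    'c': {'a', 'b'}, 'd': {'a', 'b'},
    'e': {'c', 'd'},
}

ts = TopologicalSorter(deps)
ts.prepare()
batches = []
while ts.is_active():
    ready = list(ts.get_ready())   # steps whose dependencies are all done
    batches.append(sorted(ready))  # everything in one batch is parallelisable
    ts.done(*ready)
print(batches)  # [['a', 'b'], ['c', 'd'], ['e']]
```

Each batch could then be handed to a multiprocessing pool, with a synchronisation point between batches.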

Sraw