Threads do not reduce the run time of two functions when run at once

Question

I have two functions, f1 and f2, each of which increments an integer a specific number of times in a loop.

I call these functions in two ways:

1) One by one: first f1, then f2.
2) In parallel: create a thread t1 to run f1 and a thread t2 to run f2.

As seen in the code below, I have tried both ways.

from threading import Thread
import time
import datetime
from queue import Queue

def f1(a):
    for i in range(1,100000000):
        a+=1
    return a

def f2(a):
    for i in range(1,100000000):
        a+=1
    return a
if __name__ == '__main__':

    que1 = Queue()
    que2 = Queue()

    # t2 = Thread(target=f1(a),name='t2')
    a = 0
    s_t = time.time()
    print('Value of a, before calling function f1: ',a)
    a=f1(a)
    print('Value of a, after calling function f1: ',a)
    a = 0
    print('Value of a, before calling function f2: ',a)
    a=f2(a)
    print('Value of a, after calling function f2: ',a)
    print('Time taken without threads: ',datetime.timedelta(seconds=time.time()-s_t))

    s_t = time.time()
    a = 0
    print('Value of a, before calling function f1 through thread t1: ',a)

    t1 = Thread(target=lambda q, arg1: q.put(f1(arg1)), args=(que1,a),name = 't1')
    print('Value of a, before calling function f2 through thread t2: ',a)

    t2 = Thread(target=lambda q, arg1: q.put(f2(arg1)), args=(que2,a),name = 't2')

    t1.start()
    t2.start()
    t1.join()
    print('Value of a, after calling function f1 through thread t1: ',que1.get())
    t2.join()
    print('Value of a, after calling function f2 through thread t2: ',que2.get())
    print('Time taken with threads: ',datetime.timedelta(seconds=time.time()-s_t))

I expected the threads to finish the jobs faster than calling the functions one after the other, but that is not the case here.

Here's the output:

Value of a, before calling function f1:  0
Value of a, after calling function f1:  99999999
Value of a, before calling function f2:  0
Value of a, after calling function f2:  99999999
Time taken without threads:  0:00:07.623239
Value of a, before calling function f1 through thread t1:  0
Value of a, before calling function f2 through thread t2:  0
Value of a, after calling function f1 through thread t1:  99999999
Value of a, after calling function f2 through thread t2:  99999999
Time taken with threads:  0:00:27.274876

What is going wrong?

In Python, only a single thread can run at a time because of the GIL (Global Interpreter Lock). So running threads for CPU-intensive operations is useless in Python – han solo, Sep 19 '19 at 06:38
You can use `concurrent.futures.ProcessPoolExecutor` as a workaround – han solo, Sep 19 '19 at 06:40

han solo · Accepted Answer · 2019-09-19T08:43:01.223

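In Python (CPython, specifically), only one thread can execute Python bytecode at a time because of the GIL (Global Interpreter Lock). So running threads for CPU-intensive operations is not very useful in Python: the two threads just take turns on the interpreter, and contention for the lock can make the threaded version slower than the sequential one, as in your output. Threads are great for I/O, though, because the GIL is released while a thread waits on a blocking call. I hope that clarifies things :)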
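To illustrate the I/O case, here is a minimal sketch. It uses `time.sleep` as a stand-in for a blocking I/O call (an assumption for illustration; a real program would be waiting on a socket or a file instead), and the names `fake_io`, `run_sequential` and `run_threaded` are hypothetical.

```python
import time
from threading import Thread


def fake_io(delay):
    # time.sleep releases the GIL while waiting, much like a blocking
    # read or network call would (stand-in for real I/O).
    time.sleep(delay)


def run_sequential(delay):
    # Two waits back to back: total is roughly 2 * delay.
    start = time.perf_counter()
    fake_io(delay)
    fake_io(delay)
    return time.perf_counter() - start


def run_threaded(delay):
    # Two waits overlapping in separate threads: total is roughly delay.
    start = time.perf_counter()
    threads = [Thread(target=fake_io, args=(delay,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == '__main__':
    print(f'[Sequential] Took {run_sequential(0.5):.2f} seconds')
    print(f'[Threaded]   Took {run_threaded(0.5):.2f} seconds')
```

Because both threads spend their time waiting rather than executing bytecode, the threaded run should take about half as long as the sequential one. Replace the sleep with a counting loop like f1 and that advantage disappears.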

Assuming Python 3, you could use ProcessPoolExecutor from concurrent.futures, like this:

$ cat cpuintense.py
import time
from concurrent.futures import ProcessPoolExecutor


def f1(a):
    for i in range(1,100000000):
        a+=1
    return a

def f2(a):
    for i in range(1,100000000):
        a+=1
    return a

def run_in_sequence(a):
    start = time.time()
    f1(a)
    f2(a)
    end = time.time()
    print(f'[Sequential] Took {end-start} seconds')

def run_in_parallel(a):
    with ProcessPoolExecutor(max_workers=2) as pool:
        start = time.time()
        fut1 = pool.submit(f1, a)
        fut2 = pool.submit(f2, a)
        for fut in (fut1, fut2):
            print(fut.result())
        end = time.time()
        print(f'[Parallel] Took {end-start} seconds')


if __name__ == '__main__':
    a = 0
    run_in_sequence(a)
    run_in_parallel(a)

Output:

$ python3 cpuintense.py
[Sequential] Took 6.838468790054321 seconds
99999999
99999999
[Parallel] Took 3.488879919052124 seconds

Note: The if __name__ == '__main__' guard is required on Windows. The docs give the reason:

Since Windows lacks os.fork() it has a few extra restrictions:

Safe importing of main module

Make sure that the main module can be safely imported by a new Python interpreter without causing unintended side effects (such a starting a new process).

For example, under Windows running the following module would fail with a RuntimeError:

from multiprocessing import Process

def foo():
    print('hello')

p = Process(target=foo)
p.start()

Instead one should protect the “entry point” of the program by using if __name__ == '__main__': as follows:

from multiprocessing import Process, freeze_support

def foo():
    print('hello')

if __name__ == '__main__':
    freeze_support()
    p = Process(target=foo)
    p.start()

(The freeze_support() line can be omitted if the program will be run normally instead of frozen.)

This allows the newly spawned Python interpreter to safely import the module and then run the module’s foo() function.

Similar restrictions apply if a pool or manager is created in the main module.
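To check how your own platform starts worker processes, a small sketch along these lines may help. `multiprocessing.get_start_method()` and `multiprocessing.get_all_start_methods()` are standard-library calls; the helper name `describe_start_method` is hypothetical. With the `spawn` method (the only one available on Windows), each worker is a fresh interpreter that re-imports the main module, which is why unguarded top-level code gets run again in the children.

```python
import multiprocessing as mp


def describe_start_method():
    # Returns the start method in effect and every method this
    # platform supports ('fork', 'spawn', 'forkserver').
    method = mp.get_start_method()
    available = mp.get_all_start_methods()
    return method, available


if __name__ == '__main__':
    method, available = describe_start_method()
    print(f'Start method in use: {method}')
    print(f'Available on this platform: {available}')
```

If the reported method is `spawn` or `forkserver`, the guard matters on that platform too, not only on Windows.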

edited Sep 19 '19 at 08:43

answered Sep 19 '19 at 06:46

han solo


I get this error when i run your code `concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.` – Santhosh Dhaipule Chandrakanth Sep 19 '19 at 07:01
Same code that you have suggested in your answer, as it is. – Santhosh Dhaipule Chandrakanth Sep 19 '19 at 07:05
@SanthoshDhaipuleChandrakanth That's weird :/. Could you paste the code and the traceback ? – han solo Sep 19 '19 at 07:06
You can check it in my question – Santhosh Dhaipule Chandrakanth Sep 19 '19 at 07:09
I'm running Python 3.7.4 on Windows, – martineau Sep 19 '19 at 07:11
@SanthoshDhaipuleChandrakanth Are you running on windows too ? – han solo Sep 19 '19 at 07:12
Yup ran on `python 3.7.2` and `python 3.6.4` on Windows 10 same error – Santhosh Dhaipule Chandrakanth Sep 19 '19 at 07:15
Okay. I see there's some `guard` issue in `windows` [issue](https://stackoverflow.com/questions/15900366/all-example-concurrent-futures-code-is-failing-with-brokenprocesspool). Could you run with the updated code ? – han solo Sep 19 '19 at 07:16
Something else weird is that before the error occurs, two `[Sequential] Took xxx.xx seconds` messages are printed with slightly different values. – martineau Sep 19 '19 at 07:17
@martineau Oh, interesting. Let me see if i can get any windows to run this code – han solo Sep 19 '19 at 07:18
It will work if you put the `if __name__ == '__main__':` guard in before the last three statements (and indent them). – martineau Sep 19 '19 at 07:20
@martineau. Yeah, already updated. It is weird. One of the interesting things i came across yet :) – han solo Sep 19 '19 at 07:21
It's because of how processes are started on Windows (which is different than on *nixes). – martineau Sep 19 '19 at 07:25
