
While learning Python's threading module I ran a simple test. Interestingly, the threads run sequentially rather than in parallel. Is it possible to modify this test code so the program executes the threads the same way multiprocessing does: in parallel?

import threading

def mySlowFunc(arg):
    print "\nStarting...", arg
    m=0
    for i in range(arg):
        m+=i
    print "\n...Finishing", arg

myList = [35000000, 45000000, 55000000]

for each in myList:
    thread = threading.Thread(target=mySlowFunc, args=(each,) )
    thread.daemon = True
    thread.start()
    thread.join()

print "\n Happy End \n"

REVISED CODE:

This version of the code will initiate 6 threads running in 'parallel'. But even with 6 Python threads, only two of the CPU's hardware threads are actually used (the other 6 physical CPU threads will be idling, doing nothing).

import threading

def mySlowFunc(arg):
    print "\nStarting " + str(arg) + "..."
    m=0
    for i in range(arg):
        m+=i
    print "\n...Finishing " + str(arg)

myList = [35000000, 45000000, 55000000, 25000000, 75000000, 65000000]


for each in myList:
    thread = threading.Thread(target=mySlowFunc, args=(each,) )
    thread.daemon = False
    thread.start()

print "\n Bottom of script reached \n"
alphanumeric
  • As mentioned, the GIL in Python limits thread performance to one thread executing at a time. If you need true parallelism, you will need multiprocessing, or an alternative implementation like Jython. – Max Mar 15 '14 at 02:00
  • If you are using python 2.x, you should use `xrange` instead of `range` for the `mySlowFunc` `for` loop. `range` in python 2.x constructs a list of the specified size. – jrennie Mar 15 '14 at 02:55

2 Answers


From the docs for the join method:

Wait until the thread terminates. This blocks the calling thread until the thread whose join() method is called terminates – either normally or through an unhandled exception – or until the optional timeout occurs.

Just create a list of threads and join them after launching every single one of them.
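A minimal sketch of that pattern (written in Python 3 print syntax, unlike the question's Python 2 code; the function name and smaller inputs here are illustrative):

```python
import threading

def my_slow_func(n):
    # CPU-bound busy work, like the question's example
    total = 0
    for i in range(n):
        total += i
    print("...finished", n)

threads = []
for n in [350000, 450000, 550000]:
    t = threading.Thread(target=my_slow_func, args=(n,))
    t.start()              # launch each thread immediately
    threads.append(t)      # remember it so we can join it later

for t in threads:
    t.join()               # now wait for all of them at once
```

The key difference from the question's loop is that no `join()` happens until every thread has already been started.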

Edit:

The threads are executing in parallel; you can think of Python's threads as running on a computer with a single core. The thing is, Python's threads are best suited to I/O operations (reading/writing a big file, sending data through a socket, that sort of thing). If you want CPU power, you need to use the multiprocessing module.
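A hedged sketch of what that move to multiprocessing could look like (Python 3 syntax; the function name and inputs are illustrative, not from the question):

```python
import multiprocessing

def my_slow_func(n):
    # Same CPU-bound busy work as the threaded version
    total = 0
    for i in range(n):
        total += i
    return total

def run_in_parallel(inputs):
    # Each worker is a separate OS process with its own interpreter
    # and its own GIL, so the loops can really run on different cores.
    with multiprocessing.Pool() as pool:
        return pool.map(my_slow_func, inputs)

if __name__ == "__main__":
    print(run_in_parallel([350000, 450000, 550000]))
```

The `if __name__ == "__main__":` guard matters here: on platforms that spawn worker processes by re-importing the module, it prevents each worker from recursively creating its own pool.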

Augusto Hack
  • Sounds like all we need is to amend the thread.join() statement, which forces each thread to be completed before a new one starts... – alphanumeric Mar 15 '14 at 00:49

If python didn't have the GIL, you ought to be able to see true parallelism by changing your code to only join after you have started all threads:

threads = []
for each in myList:
  t = threading.Thread(target=mySlowFunc, args=(each,) )
  t.daemon = True
  t.start()
  threads.append(t)
for t in threads:
  t.join()

With the above code in python, you should at least be able to see interleaving: thread #2 doing some work before thread #1 has completed. But, you won't see genuine parallelism. See the GIL link for more background.
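Where threads do pay off despite the GIL is blocking I/O. A small sketch (here `time.sleep` stands in for a blocking socket or file read, and the 0.5-second figure is arbitrary):

```python
import threading
import time

def fake_io(seconds):
    # time.sleep releases the GIL while blocked, just as a real
    # socket read or file read would.
    time.sleep(seconds)

start = time.time()
threads = [threading.Thread(target=fake_io, args=(0.5,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start

# The four 0.5 s waits overlap instead of adding up to 2 s.
print("elapsed: %.2f s" % elapsed)
```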

jrennie
  • Thanks! I actually removed the t.join() statement from the code and ran it again. Interestingly, I got 200% CPU utilization with this approach on an 800%-capable machine. It appears that a single physical CPU is used with both of its threads (the reason it shows 200%). – alphanumeric Mar 15 '14 at 01:14
  • jrennie, I don't think you achieve anything by appending each thread to a list variable to be used later to run a t.join() method. You could simply put t.join() inside of 'for each in myList:' scope. Correct me if I am wrong. – alphanumeric Mar 15 '14 at 01:21
  • Down vote - Bad advice about the GIL; the OP does not understand the basic threading model and he should be worried about an implementation detail of CPython – Augusto Hack Mar 15 '14 at 01:22
  • @hack.augusto I respectfully disagree, but I greatly appreciate that you provided an explanation for the downvote. Thank you! – jrennie Mar 15 '14 at 01:27
  • @Sputnix, the `join` method blocks until the thread execution ends, if you join the thread before `start`ing a new one you are in fact running sequential code. (I made a typo above, should read "shouldn't" instead of "should", can't edit the comment anymore) – Augusto Hack Mar 15 '14 at 01:28
  • @jrennie, well, I'm just worried that people new to python get the wrong "python can not do true parallelism" impression, because they do not understand the GIL. But I might be wrong; the OP is wondering why their code does not use all of its cores. People just don't realise that python's threads *do* work in parallel but with a single thread executing at a time. – Augusto Hack Mar 15 '14 at 01:38
  • @hack.a: There is a difference in saying: 'they are unaware of GIL' and 'they do not understand the GIL'. I would prefer the first. Thanks in advance! – alphanumeric Mar 15 '14 at 01:47
  • @hack.augusto Python's threads *sit* in parallel :) Parallelism refers to parallel processing which is defined as "the ability to carry out multiple operations or tasks simultaneously" which is what native python threads can't do. So, I don't see why you're afraid of people getting the impression that python can't do true parallelism. – jrennie Mar 15 '14 at 01:48
  • Would you guys please clarify it for me: why do I see only 2 of 8 physically available CPU threads are used on my machine when a code initiates 6 Python threads at the same time? Why doesn't each Python (software) thread take an available physical core(thread)? – alphanumeric Mar 15 '14 at 01:51
  • @Sputnix Because of the GIL. David Beazley's presentation is well worth watching/reading IMO http://www.dabeaz.com/python/GIL.pdf – jrennie Mar 15 '14 at 01:56
  • Thanks for the link. But once again: why do I see only 2 of 8 physically available CPU threads are used on my machine when a code initiates 6 Python threads at the same time? Why doesn't each Python (software) thread take an available physical core(thread)? – alphanumeric Mar 15 '14 at 02:12
  • @Sputnix I think I now know the answer to that: the default check interval (see slide 17 of the Beazley talk). The default is 100, which means that every 100 "instructions", the interpreter is retaking control. When I sys.setcheckinterval(10000), I see cpu usage drop dramatically to just over 1 full cpu. – jrennie Mar 15 '14 at 02:30