I am running this very simple example on Ubuntu using Python 2.7:
#!/usr/bin/python
import threading

threads = []

def incrementCounter(tid):
    # CPU-bound busy loop: each thread counts to 5,000,000.
    print("Thread " + str(tid) + " is starting")
    counter = 0
    for i in range(5000000):
        counter = counter + 1
    print("Thread " + str(tid) + " is done")

def main():
    # Start 24 threads, one per core.
    for i in range(24):
        t = threading.Thread(target=incrementCounter, args=(i, ))
        t.start()
        threads.append(t)
    # Wait for every thread to finish.
    for t in threads:
        t.join()

if __name__ == '__main__':
    main()
I am running it on a computer with a 24-core processor, and I get no thread parallelism. vmstat, run concurrently with this code, shows the CPU idle 95% of the time. top shows CPU utilization of 150%, which means the program is using the equivalent of about 1.5 cores. When I hit Ctrl-C in the main window, CPU utilization jumps to 2398%, which is almost 100% on each core. However, vmstat shows that this is system CPU usage; user CPU stays in the single digits. The program takes forever to complete.
When I replace threads with processes, as in:
        p = Process(target=incrementCounter, args=(i, ))
        p.start()
        processes.append(p)
(with all the other changes you would expect), I get full parallelism and the program completes almost instantly.
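For reference, here is a minimal sketch of what the full process-based variant might look like. It assumes `multiprocessing.Process` from the standard library; the names `increment_counter` and `run_processes` and the `iterations` parameter are made up for illustration and are not taken from the code I actually ran.

```python
from multiprocessing import Process


def increment_counter(tid, iterations=5000000):
    # Same CPU-bound busy loop as the threaded worker.
    print("Process " + str(tid) + " is starting")
    counter = 0
    for _ in range(iterations):
        counter += 1
    print("Process " + str(tid) + " is done")
    return counter


def run_processes(num_workers=24, iterations=5000000):
    # Start one OS process per worker, then wait for all of them.
    processes = []
    for i in range(num_workers):
        p = Process(target=increment_counter, args=(i, iterations))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()
    # Exit code 0 means the worker finished without an error.
    return [p.exitcode for p in processes]


if __name__ == '__main__':
    run_processes()
```

The `if __name__ == '__main__'` guard matters here: on platforms where child processes re-import the main module, it keeps the children from launching workers of their own.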
It appears that there is no parallelism with threading.Thread. The vmstat output also shows a tremendous number of interrupts and context switches, which suggests that the threads are being run sequentially.
Does anybody have a clue what's going on?
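One rough way to check whether the threads overlap at all is to time the same CPU-bound loop run back to back and then spread across threads. The sketch below is only an illustration: `busy_loop`, `time_sequential`, and `time_threaded` are hypothetical helper names, and the task count and iteration count are arbitrary.

```python
import threading
import time


def busy_loop(iterations):
    # Pure-Python CPU-bound work, same shape as the counter loop.
    counter = 0
    for _ in range(iterations):
        counter += 1
    return counter


def time_sequential(num_tasks, iterations):
    # Run the loop num_tasks times, one after another, in this thread.
    start = time.time()
    for _ in range(num_tasks):
        busy_loop(iterations)
    return time.time() - start


def time_threaded(num_tasks, iterations):
    # Run the loop num_tasks times, each in its own thread.
    threads = [threading.Thread(target=busy_loop, args=(iterations,))
               for _ in range(num_tasks)]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - start


if __name__ == '__main__':
    seq = time_sequential(4, 1000000)
    thr = time_threaded(4, 1000000)
    print("sequential: %.2fs, threaded: %.2fs" % (seq, thr))
```

If the threaded time is no better than the sequential time, the threads are not executing the loop in parallel.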