I wrote this program to properly learn how to use multi-threading. I want to implement something similar to this in my own program:
import numpy as np
import time
import os
import math
import random
from threading import Thread

def powExp(x, r):
    # raise 100 to the power of every element in row r, in place
    for c in range(x.shape[1]):
        x[r][c] = math.pow(100, x[r][c])

def main():
    print()
    rows = 100
    cols = 100
    x = np.random.random((rows, cols))
    y = x.copy()

    # multithreaded version: one thread per row
    start = time.time()
    threads = []
    for r in range(x.shape[0]):
        t = Thread(target=powExp, args=(x, r))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    end = time.time()
    print("Multithreaded calculation took {n} seconds!".format(n=end - start))

    # single-threaded version: plain nested loops over the copy
    start = time.time()
    for r in range(y.shape[0]):
        for c in range(y.shape[1]):
            y[r][c] = math.pow(100, y[r][c])
    end = time.time()
    print("Singlethreaded calculation took {n} seconds!".format(n=end - start))
    print()

    randRow = random.randint(0, rows - 1)
    randCol = random.randint(0, cols - 1)
    print("Checking random indices in x and y:")
    print("x[{rR}][{rC}]: = {n}".format(rR=randRow, rC=randCol, n=x[randRow][randCol]))
    print("y[{rR}][{rC}]: = {n}".format(rR=randRow, rC=randCol, n=y[randRow][randCol]))
    print()

    # verify that both versions produced identical results
    for r in range(x.shape[0]):
        for c in range(x.shape[1]):
            if x[r][c] != y[r][c]:
                print("ERROR NO WORK WAS DONE")
                print("x[{r}][{c}]: {n} == y[{r}][{c}]: {ny}".format(
                    r=r,
                    c=c,
                    n=x[r][c],
                    ny=y[r][c]
                ))
                quit()
    assert np.array_equal(x, y)
if __name__ == "__main__":
    main()
As you can see from the code, the goal here is to parallelize the operation math.pow(100, x[r][c]) by creating a thread for every row. However, this code is extremely slow, a lot slower than the single-threaded version.
Output:
Multithreaded calculation took 0.026447772979736328 seconds!
Singlethreaded calculation took 0.006798267364501953 seconds!
Checking random indices in x and y:
x[58][58]: = 9.792315687115973
y[58][58]: = 9.792315687115973
I searched through Stack Overflow and found some info about the GIL forcing Python bytecode to be executed on a single core only. However, I'm not sure that this is in fact what is limiting my parallelization. I tried rearranging the parallelized for-loop to use pools instead of threads. Nothing seems to be working.
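For reference, this is roughly the kind of pool-based rearrangement I mean (a minimal sketch using multiprocessing.Pool; the helper pow_row and the per-row split are only illustrative, not part of the program above). Each worker process gets a copy of one row, so the GIL is not shared, but the rows have to be pickled back and forth:

import math
import numpy as np
from multiprocessing import Pool

def pow_row(row):
    # same per-element math.pow loop as powExp, but on a copy of one row
    return [math.pow(100, v) for v in row]

if __name__ == "__main__":  # required for the spawn start method (e.g. on Windows)
    x = np.random.random((100, 100))
    with Pool() as pool:
        # each worker receives one row and returns the transformed row
        result = np.array(pool.map(pow_row, x))

Even written this way, process start-up and pickling overhead can easily dominate for an array this small.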
Python code performance decreases with threading
EDIT: This thread discusses the same issue. Is it completely impossible to increase performance using multi-threading in Python because of the GIL? Is the GIL causing my slowdown?
EDIT 2 (2017-01-18): From what I can gather after searching online for quite a bit, it seems like Python is really bad at parallelism. What I'm trying to do is parallelize a Python function used in a neural network implemented in TensorFlow... it seems like adding a custom op is the way to go.
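For what it's worth, the specific operation in this toy program (elementwise 100 ** x) can also be done in a single vectorized NumPy call, where the loop runs in C and no Python-level threads are involved; a minimal sketch:

import numpy as np

x = np.random.random((100, 100))
# elementwise 100 ** x in one call; no per-element Python loop
x = np.power(100.0, x)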