
I do some computationally expensive tasks in Python and found the threading module for parallelization. I have a function which does the computation and returns an ndarray as result. Now I want to know how I can parallelize my function and get back the calculated arrays from each thread.

The following example is strongly simplified, with lightweight functions and calculations.

import threading
import numpy as np

def calculate_result(input_data):
    a = np.linspace(1.0, 1000.0, num=10000)   # just an example
    result = input_data * a
    return result

input_data = [1, 2, 3, 4]

for i in range(len(input_data)):
    t = threading.Thread(target=calculate_result, args=(input_data[i],))
    t.start()
    # Here I want to receive the return value from the thread

I am looking for a way to get the return value of the function from each thread, because in my task each thread calculates different values.

I found another question (how to get the return value from a thread in python?) where someone has a similar problem (without ndarrays), which is handled with a ThreadPool and async calls...
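
If I understand that approach correctly, it would look roughly like this for my simplified function above (untested sketch, the variable names are my own):

import numpy as np
from multiprocessing.pool import ThreadPool

def calculate_result(value):
    a = np.linspace(1.0, 1000.0, num=10000)   # just an example
    return value * a

inputs = [1, 2, 3, 4]

pool = ThreadPool(processes=len(inputs))
results = pool.map(calculate_result, inputs)   # list of ndarrays, in input order
pool.close()
pool.join()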

-------------------------------------------------------------------------------

Thanks for your answers! With your help, I am now looking for a way to solve my problem with the multiprocessing module. To give you a better understanding of what I do, see the following explanation.

Explanation:

My 'input_data' is an ndarray with 282240 elements of type uint32

In the 'calculation_function()' I use a for loop to calculate a result from every 12 bits and put it into the 'output_data'.

Because this is very slow, I split my input_data into e.g. 4 or 8 parts and calculate each part in the calculation_function().

Now I am looking for a way to parallelize the 4 or 8 function calls.

The order of the data is essential, because the data is an image and each pixel has to be at the correct position. So function call no. 1 calculates the first pixel of the image and the last function call calculates the last one.

The calculations work fine and the image can be completely rebuilt by my algorithm, but I need the parallelization to speed things up for time-critical aspects.

Summary: One input ndarray is divided into 4 or 8 parts. Each part holds 70560 or 35280 uint32 values. From every 12 bits I calculate one pixel, using 4 or 8 function calls. Each function returns one ndarray with 188160 or 94080 pixels. All return values are put together in a row and reshaped into an image.

What already works: The calculations are working and I can reconstruct my image.

Problem: The function calls are done serially, one after another, so each image reconstruction is very slow.

Main goal: Speed up the image reconstruction by parallelizing the function calls.

Code:

def decompress(payload, WIDTH, HEIGHT):
    # INPUTS / OUTPUTS
    n_threads = 4
    img_input = np.fromstring(payload, dtype='uint32')
    img_output = np.zeros((WIDTH * HEIGHT), dtype=np.uint32)
    n_elements_part = len(img_input) // n_threads
    input_part = np.zeros((n_threads, n_elements_part), dtype=np.uint32)
    output_part = np.zeros((n_threads, int(n_elements_part / 3 * 8)), dtype=np.uint32)

    # DEFINE PARTS (here 4 different ones)
    start = np.zeros(n_threads, dtype=int)
    end = np.zeros(n_threads, dtype=int)
    for i in range(n_threads):
        start[i] = i * n_elements_part
        end[i] = (i + 1) * n_elements_part - 1

    # COPY IMAGE DATA
    for idx in range(n_threads):
        input_part[idx, :] = img_input[start[idx]:end[idx] + 1]

    # CALCULATE PARTS (the following call is the one that should be parallelized)
    for idx in range(n_threads):
        output_part[idx, :] = decompress_part2(input_part[idx], output_part[idx])

    # COPY PARTS INTO THE IMAGE
    img_output[0     : 188160] = output_part[0, :]
    img_output[188160: 376320] = output_part[1, :]
    img_output[376320: 564480] = output_part[2, :]
    img_output[564480: 752640] = output_part[3, :]

    # RESHAPE IMAGE
    img_output = np.reshape(img_output, (HEIGHT, WIDTH))

    return img_output

Please don't mind my beginner programming style :) I am just looking for a way to parallelize the function calls with the multiprocessing module and get back the returned ndarrays.

Thank you so much for your help !

HKC72

2 Answers


You can use a process pool from the multiprocessing module:

from multiprocessing.dummy import Pool

def test(a):
    return a

p = Pool(3)
a = p.starmap(test, zip([1, 2, 3]))
print(a)
p.close()
p.join()
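
Applied to the decompress_part2() calls from the question, the same pattern might look roughly like this (just a sketch, assuming decompress_part2 takes one input part plus its output buffer and returns the calculated part):

from multiprocessing.dummy import Pool

def decompress_parallel(input_part, output_part, n_threads):
    # one argument tuple per part, exactly as in the serial loop from the question
    args = [(input_part[idx], output_part[idx]) for idx in range(n_threads)]
    with Pool(n_threads) as p:
        results = p.starmap(decompress_part2, args)
    # starmap returns the results in order, so every pixel stays at its position
    for idx, res in enumerate(results):
        output_part[idx, :] = res
    return output_part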
kar
  • What does the 3 stand for in the Pool function? Number of processes? – HKC72 Nov 02 '17 at 15:02
  • Yes, that's correct. Check this: https://docs.python.org/2/library/multiprocessing.html. Kindly upvote if you found my answer useful. Thanks :) – kar Nov 02 '17 at 15:04
-------------------------------------------------------------------------------

kar's answer works; however, keep in mind that he's using the .dummy module, which might be limited by the GIL. Here's more info on it: multiprocessing.dummy in Python is not utilising 100% cpu
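
For CPU-bound work like the bit unpacking in the question, a real process pool (plain multiprocessing.Pool instead of .dummy) sidesteps the GIL. A minimal sketch, assuming the per-part worker is defined at module level so it can be pickled (the worker here is only a placeholder, not the question's real calculation):

import numpy as np
from multiprocessing import Pool

def process_part(part):
    # stand-in for the real per-part calculation
    return part * 2

if __name__ == '__main__':
    img_input = np.arange(16, dtype=np.uint32)
    parts = np.array_split(img_input, 4)          # split the input into 4 parts
    with Pool(processes=4) as pool:
        results = pool.map(process_part, parts)   # each worker returns an ndarray
    img_output = np.concatenate(results)          # results come back in input order
    print(img_output)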