
I have the code below. I have already optimised the algorithm to make it as fast as possible, but it is still too slow. So I was thinking about using multiprocessing (I have no experience with this kind of stuff). I tried some things with pool and threading, but either it was slower than before or it didn't work at all. So I was wondering how I should do this so that it works and is faster. And are there options other than multithreading to make this kind of code faster?

import itertools

def calc(indices, data):
    matrix = [[0] * len(indices) for i in range(len(indices))]
    for i_a, i_b in itertools.combinations(indices, 2):
        a_res, b_res = algorithm(data[i_a], data[i_b])
        matrix[i_b][i_a] = a_res
        matrix[i_a][i_b] = b_res
    return matrix


def algorithm(a, b):
    # Very slow and complex
    ...
Loïc Noest

2 Answers


Building upon Simon's answer, here is an example applying a multiprocessing pool to a version of your problem. Your mileage will vary depending on how many cores you have on your machine but I hope that this will be a helpful demonstration of how you could structure a solution to your problem:

import itertools
import multiprocessing as mp
import time

def calc_mp(indices, data):
    # compute the index pairs once, so the work list and the unpacking stay in sync
    pairs = list(itertools.combinations(indices, 2))

    # organize all the inputs, then map them over a pool of worker processes;
    # the context manager closes the pool when the work is done
    args = [(data[i_a], data[i_b]) for i_a, i_b in pairs]
    with mp.Pool(mp.cpu_count()) as pool:
        results = pool.starmap(algorithm, args)

    # unpack the results into the matrix
    matrix = [[0] * len(indices) for _ in range(len(indices))]
    for (i_a, i_b), (a_res, b_res) in zip(pairs, results):
        matrix[i_b][i_a] = a_res
        matrix[i_a][i_b] = b_res

    return matrix

def calc_single(indices, data):
    # do the simple single process version
    matrix = [[0] * len(indices) for i in range(len(indices))]
    for i_a, i_b in list(itertools.combinations(indices, 2)):
        a_res, b_res = algorithm(data[i_a], data[i_b])
        matrix[i_b][i_a] = a_res
        matrix[i_a][i_b] = b_res

    return matrix

def algorithm(a,b):
    # Very slow and complex
    time.sleep(2)
    return a + b, a - b

if __name__ == "__main__":
    # generate test data;
    indices = range(5)
    data = range(len(indices))

    # test single
    time_start = time.time()
    print(calc_single(indices, data))
    print("Took {}".format(time.time() - time_start))

    # mp
    time_start = time.time()
    print(calc_mp(indices, data))
    print("Took {}".format(time.time() - time_start))

The results, with 8 cores, are

[[0, -1, -2, -3, -4], [1, 0, -1, -2, -3], [2, 3, 0, -1, -2], [3, 4, 5, 0, -1], [4, 5, 6, 7, 0]]
Took 20.02155065536499
[[0, -1, -2, -3, -4], [1, 0, -1, -2, -3], [2, 3, 0, -1, -2], [3, 4, 5, 0, -1], [4, 5, 6, 7, 0]]
Took 4.073369264602661
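
One tuning note worth adding as an aside (not part of the measurements above): with a job this small the defaults are fine, but when algorithm is cheap relative to the cost of shipping arguments between processes, the per-task overhead can dominate. starmap takes an optional chunksize argument that batches several tasks into one message per worker; a hypothetical adjustment of the call inside calc_mp could look like this:

# batch several (a, b) pairs per worker to cut inter-process overhead
chunk = max(1, len(args) // mp.cpu_count())
results = pool.starmap(algorithm, args, chunksize=chunk)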
Paul
  • Thanks a lot! This really helped me out. I was looking into how to make multiprocessing work, but I just didn't think of filling in the results in the matrix after the whole .starmap() call had completed. – Loïc Noest Jun 01 '18 at 19:29

Your best bet is multiprocessing. You will need to partition your data into chunks and pass each chunk to a process. Threading won't help you here because of Python's Global Interpreter Lock (GIL): only one thread executes Python bytecode at a time, so CPU-bound threads don't run in parallel. Threading is still useful for some use cases, such as when you have several activities going on and some of them might block (on I/O, for example), but not for CPU-parallel workloads.
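To make the chunking idea concrete, here is a minimal sketch (my own illustration, not code from this answer: process_chunk, calc_chunked, and the toy algorithm are placeholder names) of partitioning the pairs and handing one chunk to each process:

import itertools
import multiprocessing as mp

def algorithm(a, b):
    # placeholder for the real, expensive computation
    return a + b, a - b

def process_chunk(chunk):
    # each worker receives one chunk of (i_a, i_b, a, b) tuples and
    # returns the pair indices together with both results
    return [(i_a, i_b, *algorithm(a, b)) for i_a, i_b, a, b in chunk]

def calc_chunked(indices, data, n_chunks=None):
    n_chunks = n_chunks or mp.cpu_count()
    pairs = [(i_a, i_b, data[i_a], data[i_b])
             for i_a, i_b in itertools.combinations(indices, 2)]
    # stride-split the pairs into one roughly equal chunk per process
    chunks = [pairs[i::n_chunks] for i in range(n_chunks)]
    with mp.Pool(n_chunks) as pool:
        chunk_results = pool.map(process_chunk, chunks)
    # write both results back into the full matrix
    matrix = [[0] * len(indices) for _ in range(len(indices))]
    for chunk in chunk_results:
        for i_a, i_b, a_res, b_res in chunk:
            matrix[i_b][i_a] = a_res
            matrix[i_a][i_b] = b_res
    return matrix

if __name__ == "__main__":
    print(calc_chunked(range(5), range(5)))

Compared with mapping one pair per task, this sends one message per chunk, which matters when the individual calls are short.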

Simon Hibbs