1

I want to use concurrent.futures together with numpy to manipulate two scipy.sparse matrices:

matrix_A = scipy.sparse.lil_matrix((1000, 1000), dtype=np.float32) 
matrix_B = scipy.sparse.lil_matrix((500, 1000), dtype=np.float32) 

The algorithm works like this: every row in matrix_B has a one-to-many relationship to rows in matrix_A. For every row_B in matrix_B, I find its corresponding [row_A1, row_A2 ... row_An ] in matrix_A, sum them up and assign the sum to row_B.

def update_values(row):
    indices, values = find_rows_in_matrix_A(row)
    matrix_B[row, indices] = values

The matrices are large (10^7 rows), and I'd like to run this operation in parallel:

with concurrent.futures.ProcessPoolExecutor(max_workers=32) as executor:
     futures = {row : executor.submit(update_values, row) 
                for row in range(matrix_B.shape[0])}

But this doesn't work because changes made by child processes to global variables will be invisible to the parent process (as mentioned in this answer).

Another option would be to return the values from update_values, but that would require merging the values in the parent process, which takes too long for my use case.

Using multiprocessing.Manager.Array could be a solution, but that would create copies of the matrices at every write, and given their size, that's not an option.

Is there any way to make matrix_B writeable from child processes? Or what would be a better approach to this problem?

Community
  • 1
  • 1
hbot
  • 724
  • 8
  • 19
  • Is `matrix_A` stored on disk? Or built when the algorithm starts? What about the list of row relationships? Also, this seems to be CPU-bound work, so 32 workers is likely going to slow you down unless you actually have a 32-core machine. – bnaecker Jan 06 '17 at 01:45
  • How much do you know about the format and math of sparse matrices? Do you know how they differ from numpy arrays? It would be good, I think, to demonstrate this action without the multiprocessing, and may be even without sparsity. Then add the sparsity and multiprocessing one by one. – hpaulj Jan 06 '17 at 02:28
  • @bnaecker I have a 32 core machine. `matrix_A` is built when the algorithm starts, same with the list of row relationships. – hbot Jan 06 '17 at 13:24
  • @hpaulj what do you mean by `demonstrate this action`? When developing the algorithm, I initially single thread/process with numpy array, the algorithm works as expected. – hbot Jan 06 '17 at 13:27

0 Answers0