I have some code that needs to work through a huge set of matrices (all 5 by 10 binary matrices in reduced row echelon form, with no zero rows) and either accept or reject each matrix depending on whether it satisfies some conditions. Since there are a lot of matrices to get through, I'm trying to use multiprocessing to speed things up. Here's roughly what my code currently looks like:

import multiprocessing as mp
import numpy as np

def check_valid(matrix):
    # Perform some checks and things
    if all_checks_passed:
        return matrix.copy()
    return None

subgroups = []
with mp.Pool() as pool:
    subgroups_iter = pool.imap(
        check_valid,
        get_rref_matrices(5),
        chunksize=1000
    )
    for item in subgroups_iter:
        if item is not None:
            subgroups.append(item)

get_rref_matrices is a generator function that recursively finds all the rref matrices (I'm not sure whether this function is causing any issues). The full code for this function is here, if that's of interest.
When I run the program, it seems to be very slow (hardly any faster than a single process) and the CPU usage is only about 10%. I've previously run code that has maxed out my CPU, so I'm stumped as to why this code isn't running faster.
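
One thing that might help narrow this down: pool.imap consumes the generator in the parent process, so if producing the matrices costs more than checking them, the workers would mostly sit idle waiting for input. Below is a minimal sketch of how the two costs could be timed separately on a sample. The names fake_rref_matrices, fake_check_valid and time_stages are hypothetical stand-ins made up for illustration; they are not the real get_rref_matrices or check_valid, which would be substituted in.

```python
import itertools
import time

import numpy as np


def fake_rref_matrices(n_rows, n_cols=10):
    """Hypothetical stand-in for get_rref_matrices: yields binary matrices."""
    for bits in itertools.product((0, 1), repeat=n_cols):
        # Repeat one binary row n_rows times; NOT real rref enumeration.
        yield np.tile(np.array(bits, dtype=np.uint8), (n_rows, 1))


def fake_check_valid(matrix):
    """Hypothetical stand-in for check_valid: an arbitrary cheap test."""
    return matrix.copy() if matrix.sum() % 3 == 0 else None


def time_stages(matrices, check, limit=1000):
    """Time generation and checking separately over the first `limit` items."""
    start = time.perf_counter()
    sample = list(itertools.islice(matrices, limit))
    gen_seconds = time.perf_counter() - start

    start = time.perf_counter()
    results = [check(m) for m in sample]
    check_seconds = time.perf_counter() - start

    accepted = sum(r is not None for r in results)
    return len(sample), accepted, gen_seconds, check_seconds


if __name__ == "__main__":
    n, accepted, gen_s, check_s = time_stages(
        fake_rref_matrices(5), fake_check_valid
    )
    print(f"{n} matrices sampled, {accepted} accepted")
    print(f"generation: {gen_s:.4f}s, checking: {check_s:.4f}s")
```

If generation dominates on the real functions, that would be consistent with the low CPU usage; if checking dominates, the cause is more likely elsewhere, for example in the cost of sending matrices to and from the workers.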