
My question is related to Block mean of numpy 2D array and Block mean of 2D numpy array (in both dimensions) (in fact mine is just a more general case). I will explain it with a simple example.

Let's assume we have a 6x6 2D array:

array([[7, 1, 6, 6, 4, 2],
       [8, 5, 5, 6, 3, 5],
       [3, 1, 7, 1, 3, 4],
       [6, 8, 3, 2, 3, 3],
       [8, 6, 7, 1, 1, 3],
       [8, 5, 4, 5, 1, 4]])

Now each row (and column) of this matrix is assigned to one of three communities (communities can be of different sizes), e.g. array([0, 0, 1, 1, 1, 2]) would represent this assignment. I need to split the matrix according to this assignment and calculate the mean over each block (slice). This produces a 3x3 matrix of block means. For example, the block (or slice) for community pair (0, 0) is a 2x2 array:

array([[7, 1],
       [8, 5]])

that has a mean of 5.25. The block for community pair (0, 1) is a 2x3 array:

array([[6, 6, 4],
       [5, 6, 3]])

with mean 5, and so on.

The resulting array of block means should look like this:

array([[5.25      , 5.        , 3.5       ],
       [5.33333333, 3.11111111, 3.33333333],
       [6.5       , 3.33333333, 4.        ]])
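
For concreteness, one way to extract such a block is with boolean masks (a minimal sketch; the names a and comms are only for illustration, the question does not fix them):

import numpy as np

a = np.array([[7, 1, 6, 6, 4, 2],
              [8, 5, 5, 6, 3, 5],
              [3, 1, 7, 1, 3, 4],
              [6, 8, 3, 2, 3, 3],
              [8, 6, 7, 1, 1, 3],
              [8, 5, 4, 5, 1, 4]])
comms = np.array([0, 0, 1, 1, 1, 2])

# np.ix_ turns the two boolean masks into an open mesh that selects
# the rows and columns belonging to community pair (0, 0)
block = a[np.ix_(comms == 0, comms == 0)]
print(block.mean())  # 5.25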

My question is how to calculate this efficiently. For now I am using for loops (sketched below): for each pair of communities I take the corresponding slice, calculate the mean over that slice and store it in a separate matrix. However, I need to perform this operation many times and it takes a lot of time.

I cannot really use (or I don't know how to use) approaches based on reshape, since they assume equal block sizes.
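
For reference, a minimal sketch of that loop approach, reusing a and comms from the snippet above:

unique_comms = np.unique(comms)
block_means = np.zeros((unique_comms.size, unique_comms.size))
for i, ci in enumerate(unique_comms):
    for j, cj in enumerate(unique_comms):
        # slice out the (ci, cj) block and average it
        block_means[i, j] = a[np.ix_(comms == ci, comms == cj)].mean()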

kbonna
  • You should show the loop version. It's a lot easier to test improvements against a working solution. And less work for us if you provide that working solution. I suspect the best we can do is tweak the solution around the edges. As you say `numpy` solutions that depend on a regular pattern don't apply. – hpaulj Mar 22 '21 at 15:28

3 Answers


If you are open to other packages, Pandas has a convenient groupby function: labelling every element of the raveled array with its (row community, column community) pair via a MultiIndex, then grouping by both levels and taking the mean gives exactly the block means:

import pandas as pd

# `a` is the input matrix, `pairs` the community assignment array
out = (pd.Series(a.ravel(),
                 index=pd.MultiIndex.from_product((pairs, pairs)))
         .groupby(level=(0, 1)).mean()
         .unstack().to_numpy()
      )

Output:

array([[5.25      , 5.        , 3.5       ],
       [5.33333333, 3.11111111, 3.33333333],
       [6.5       , 3.33333333, 4.        ]])
Quang Hoang

The best I can suggest is to limit the number of loops. I will assume here that the 6x6 2D array is arr and that the community assignment is coms = np.array([0, 0, 1, 1, 1, 2]).

I would first compute slices per community:

# The one-element tuple binds x to the index array of community k;
# the slice then runs from its first index to one past its last.
dcoms = {k: slice(min(x), 1 + max(x)) for k in np.unique(coms)
         for x in (np.where(coms==k)[0],)}

This needs only one loop over coms.

Then I can directly compute the resulting ndarray with 2 loops over dcoms:

resul = np.array([[arr[dcoms[i],dcoms[j]].mean() for j in dcoms] for i in dcoms])

It gives the expected result:

array([[5.25      , 5.        , 3.5       ],
       [5.33333333, 3.11111111, 3.33333333],
       [6.5       , 3.33333333, 4.        ]])
Serge Ballesta

Thanks, guys. I performed some benchmark tests to compare these solutions.

The setup is the following:

import numpy as np
import pandas as pd

np.random.seed(0)

mat = (np.random.random((500, 500)) - 0.5) * 100

# Create 5 communities of size 50 and 10 communities of size 25
comms = np.concatenate((np.repeat(np.arange(5), 50), np.repeat(np.arange(5, 15), 25)))

My original solution using for loops:

unique_comms = np.unique(comms)
block = np.zeros((len(unique_comms), len(unique_comms)))
for i in unique_comms:
    for j in unique_comms:
        block[i, j] = np.mean(mat[comms == i][:, comms == j])

took 7.66 ms ± 43.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each).

Pandas solution:

out = (pd.Series(mat.ravel(), 
                  index = pd.MultiIndex.from_product((comms, comms)))
         .groupby(level=(0,1)).mean()
         .unstack().to_numpy())

took 22.1 ms ± 221 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

whereas slice solution:

dcoms = {k: slice(min(x), 1 + max(x)) for k in np.unique(comms)
         for x in (np.where(comms==k)[0],)}
resul = np.array([[mat[dcoms[i],dcoms[j]].mean() for j in dcoms] for i in dcoms])

took 2.1 ms ± 61.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each).

Unfortunately the Pandas solution is much slower, whereas the slice solution is roughly 3-4x faster than the regular for-loop solution. One drawback of the slice solution is that (I believe) it does not work when the community vector is shuffled (i.e. np.array([1, 0, 1, 0, 2, 1]) instead of np.array([0, 0, 1, 1, 1, 2])).
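
A possible workaround (a sketch under that assumption, not benchmarked) is to sort rows and columns by community first, so that every community becomes contiguous again, and then apply the slice solution to the reordered matrix:

order = np.argsort(comms, kind='stable')
mat_sorted = mat[np.ix_(order, order)]  # reorder rows and columns consistently
comms_sorted = comms[order]             # community labels are now contiguous
# build dcoms and resul from mat_sorted and comms_sorted as above

Since the mean of a block does not depend on the order of its elements, this should give the same block means as an already-sorted assignment.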

kbonna