I have an image mask stored as a 2D numpy array where the values indicate the presence of objects that have been segmented in the image (0 = no object, 1..n = object 1 through n). I want to get a single coordinate for each object representing the center of the object. It doesn't have to be a perfectly accurate centroid or center of gravity. I'm just taking the mean of the x and y indices of all cells in the array that contain each object. I'm wondering if there's a faster way to do this than my current method:

for obj in np.unique(mask):
    if obj == 0:
        continue
    x, y = np.mean(np.where(mask == obj), axis=1)

Here is a reproducible example:

import numpy as np
mask = np.array([
    [0,0,0,0,0,2,0,0,0,0],
    [0,1,1,0,2,2,2,0,0,0],
    [0,0,1,0,2,2,2,0,0,0],
    [0,0,0,0,0,0,0,0,0,0],
    [0,3,3,3,0,0,4,0,0,0],
    [0,0,0,0,0,4,4,4,0,0],
    [0,0,0,0,0,0,4,0,0,0],
])

points = []
for obj in np.unique(mask):
    if obj == 0:
        continue
    points.append(np.mean(np.where(mask == obj), axis=1))
print(points)

This outputs:

[array([1.33333333, 1.66666667]),
 array([1.28571429, 5.        ]),
 array([4., 2.]),
 array([5., 6.])]
Colin

4 Answers

I came up with another way to do it that seems to be about 3x faster:

import numpy as np
mask = np.array([
    [0,0,0,0,0,2,0,0,0,0],
    [0,1,1,0,2,2,2,0,0,0],
    [0,0,1,0,2,2,2,0,0,0],
    [0,0,0,0,0,0,0,0,0,0],
    [0,3,3,3,0,0,4,0,0,0],
    [0,0,0,0,0,4,4,4,0,0],
    [0,0,0,0,0,0,4,0,0,0],
])

flat = mask.flatten()
# index of the first occurrence of each unique value in the sorted flat array
split = np.unique(np.sort(flat), return_index=True)[1]
points = []
# argsort groups the flat indices of equal values together; split them into
# one chunk per value, skipping the empty leading chunk and the background (0)
for inds in np.split(flat.argsort(), split)[2:]:
    points.append(np.array(np.unravel_index(inds, mask.shape)).mean(axis=1))
print(points)

I wonder if the for loop can be replaced with a single NumPy operation, which would likely be even faster.

Colin

You can adapt this answer (give its author an upvote too if it works for you) and use sparse matrices instead of NumPy arrays. However, this only proves quicker for large arrays, and the speed boost grows the larger your array is:

import numpy as np, time
from scipy.sparse import csr_matrix

def compute_M(data):
    # one sparse row per label; the stored values are the flat indices
    # of the cells carrying that label
    cols = np.arange(data.size)
    return csr_matrix((cols, (np.ravel(data), cols)),
                      shape=(data.max() + 1, data.size))

def get_indices_sparse(data, M):
    # M = compute_M(data) is precomputed outside so only the lookup is timed;
    # skip the background row (R == 0) and labels that were overwritten away
    return [np.mean(np.unravel_index(row.data, data.shape), 1)
            for R, row in enumerate(M) if R > 0 and row.nnz > 0]

def gen_random_mask(C, n, m):
    # draw C random rectangles labelled 1..C (0 stays background)
    mask = np.zeros([n, m], int)
    for i in range(C):
        x = np.random.randint(n)
        y = np.random.randint(m)
        mask[x:x+np.random.randint(n-x), y:y+np.random.randint(m-y)] = i + 1
    return mask

N = 100
C = 4
for S in [10, 100, 1000, 10000]:
    mask = gen_random_mask(C, S, S)
    print('Time for size {:d}x{:d}:'.format(S, S))
    s = time.time()
    for _ in range(N):
        points = []
        for obj in np.unique(mask):
            if obj == 0:
                continue
            points.append(np.mean(np.where(mask == obj), axis=1))
    points_np = np.array(points)
    print('NP: {:f}'.format((time.time() - s)/N))
    mask_s = compute_M(mask)  # construction time is excluded from the sparse timing
    s = time.time()
    for _ in range(N):
        points = get_indices_sparse(mask, mask_s)
    print('Sparse: {:f}'.format((time.time() - s)/N))
    np.testing.assert_equal(points, points_np)

This results in the following timings:

Time for size 10x10:
NP: 0.000066
Sparse: 0.000226
Time for size 100x100:
NP: 0.000207
Sparse: 0.000253
Time for size 1000x1000:
NP: 0.018662
Sparse: 0.004472
Time for size 10000x10000:
NP: 2.545973
Sparse: 0.501061
jhso

The problem likely comes from np.where(mask == obj), which iterates over the whole mask array over and over. This is a problem when there are a lot of objects. You can solve it efficiently using a group-by strategy. However, NumPy does not yet provide such an operation. You can implement it with a sort followed by a split, but a sort is generally not efficient. An alternative method is to ask np.unique to return the inverse indices, so that you can then accumulate values per object (like a reduce-by-key where the reduction operator is addition and the keys are the object integers). The mean can be obtained with a simple division in the end.

objects, inverts, counts = np.unique(mask, return_inverse=True, return_counts=True)
inverts = inverts.ravel()  # flatten: some NumPy versions (>= 2.0) return the inverse with the input's shape

# Reduction by object: accumulate the coordinate sums per label
x = np.zeros(len(objects))
y = np.zeros(len(objects))
xPos = np.repeat(np.arange(mask.shape[0]), mask.shape[1])  # row index of every cell
yPos = np.tile(np.arange(mask.shape[1]), reps=mask.shape[0])  # column index of every cell
np.add.at(x, inverts, xPos)
np.add.at(y, inverts, yPos)

# Compute the final mean from the sums
x /= counts
y /= counts

# Discard the first item (when obj == 0)
x = x[1:]
y = y[1:]

If you need something faster, you could use Numba and perform the reduction manually (and possibly in parallel).

EDIT: if you really need a list as output, you can use points = list(np.stack([x, y]).T), but using lists instead of NumPy arrays is rather slow (and not memory efficient either).

Jérôme Richard

Because the mask values number the segments, they can be used directly as indices into NumPy arrays. Combined with Cython, this can achieve a strong speed-up.

In Jupyter start with loading Cython:

%load_ext Cython

then use the %%cython cell magic and a single pass over the whole array to calculate the means:

%%cython -a
import cython
import numpy as np
cimport numpy as np

@cython.boundscheck(False) # turn off bounds-checking for entire function
@cython.wraparound(False)  # turn off negative index wrapping for entire function
def calc_xy_mean4(int[:,:] mask, int number_of_maskvalues):

    # np.intc matches the C int of the memoryviews on all platforms
    cdef int[:] sum_x = np.zeros(number_of_maskvalues, dtype=np.intc)
    cdef int[:] sum_y = np.zeros(number_of_maskvalues, dtype=np.intc)
    n = np.zeros(number_of_maskvalues, dtype=np.intc)
    cdef int[:] n_mv = n
    mean_x = np.zeros(number_of_maskvalues, dtype='float')
    mean_y = np.zeros(number_of_maskvalues, dtype='float')
    cdef double[:] mean_x_mv = mean_x
    cdef double[:] mean_y_mv = mean_y

    cdef int x_max = mask.shape[0]
    cdef int y_max = mask.shape[1]

    cdef int segment_index
    cdef int x
    cdef int y

    # single pass: accumulate coordinate sums and cell counts per segment
    for x in range(x_max):
        for y in range(y_max):
            segment_index = mask[x,y]
            n_mv[segment_index] += 1
            sum_x[segment_index] += x
            sum_y[segment_index] += y

    # turn the integer sums into float means (the cast avoids integer division)
    for segment_index in range(number_of_maskvalues):
        mean_x_mv[segment_index] = <double>sum_x[segment_index] / n_mv[segment_index]
        mean_y_mv[segment_index] = <double>sum_y[segment_index] / n_mv[segment_index]

    return mean_x, mean_y, n

and call it with the %timeit magic:

mask = np.array([
    [0,0,0,0,0,2,0,0,0,0],
    [0,1,1,0,2,2,2,0,0,0],
    [0,0,1,0,2,2,2,0,0,0],
    [0,0,0,0,0,0,0,0,0,0],
    [0,3,3,3,0,0,4,0,0,0],
    [0,0,0,0,0,4,4,4,0,0],
    [0,0,0,0,0,0,4,0,0,0],
], dtype=np.intc)  # dtype must match the C int of the int[:,:] memoryview

%timeit calc_xy_mean4(mask, 5)

On my machine, this Cython solution is 9 times faster than the original code.

6.32 µs ± 18.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

and if we run the same instruction without the timeit magic:

calc_xy_mean4(mask, 5)

we obtain as output:

(array([3.07692308, 1.33333333, 1.28571429, 4.        , 5.        ]),
 array([4.59615385, 1.66666667, 5.        , 2.        , 6.        ]),
 array([52,  3,  7,  3,  5]))

Note that the first entry of each array belongs to mask value 0, i.e. the background, so the object centroids are the entries from index 1 onward.