
I'm currently trying to do an IoU (intersection over union) analysis of a 3D image at different time points. These images contain around 1500 objects (cells), each with a unique id, in an image of roughly 2000x2000x2000 voxels.

I found NumPy's logical_and and logical_or functions, which take one value at a time from each image, so I wrote a very basic double for loop to feed every combination of ids into the IoU computation. The code looks like this:

for i in [x for x in image_1_ids if x != 0]:
    for j in [y for y in image_2_ids if y != 0]:
        intersection = np.logical_and(image_1 == i, image_2 == j)
        union = np.logical_or(image_1 == i, image_2 == j)

        iou = np.sum(intersection) / np.sum(union)

        df.at[i, j] = iou

This code takes forever to run because the values are fed in one pair at a time, which makes it basically 1500x1500 combinations. Is there a more efficient way to do this in NumPy?

Donnerhorn
  • well for starters, how about not pointlessly iterating over `image_2_ids` for as many non-zero elements as there are in `image_1_ids`, and also `image_1 == i` is pointlessly repeated many, many times. – juanpa.arrivillaga Oct 31 '22 at 17:16
  • Anyway, can you describe what an "iou analysis" is for those of us who do not know? – juanpa.arrivillaga Oct 31 '22 at 17:19
  • Do you really need the cartesian product here? You could go with just every pair, which will be the same algorithmic complexity but lower constant factors – juanpa.arrivillaga Oct 31 '22 at 17:23
  • Ok so you are working on 3D grids of >=8 GiB and each of the 1500x1500=2_250_000 iterations travels the whole grids at least 4 times... Not surprising this is slow. Numpy is certainly not a good tool for such a huge computation, nor a Python *interpreter* actually (which does not optimize repeated expressions). Besides, the algorithm is the biggest concern. You certainly need a segmentation approach but this is hard to know without information on the code that is not reproducible/working/complete. Please add more information and provide a minimal reproducible example. – Jérôme Richard Oct 31 '22 at 18:57
  • @juanpa.arrivillaga thank you for your answer. The ids in image 1 and 2 can be different. That's the reason why I selected two for loops. Image 1 has all the ids inside of it and I want numpy to take only the given id when doing the analysis; that's why I selected "image_1 == i". I don't think this is the best method for it. However, currently I couldn't think of another method to feed it into these functions. – Donnerhorn Oct 31 '22 at 22:36
  • @Donnerhorn you aren't understanding what I'm saying. You **needlessly** repeat iterations, for example, your inner loop does `[y for y in image_2_ids if y != 0]` **for every iteration of the outer loop**, but that only ever needs to happen **once**. Similarly, you do `image_1 == i` (which loops over the whole data) **twice** for every iteration, i.e. `len(image_1_ids)*len(image_2_ids)` times, when it only needs to happen **once** – juanpa.arrivillaga Oct 31 '22 at 22:37
  • So while algorithmically you are still not going to improve much, it should certainly cut down on constant factors to stop doing so much needless, repeated work – juanpa.arrivillaga Oct 31 '22 at 22:39
  • @juanpa.arrivillaga iou analysis or also called Jaccard similarity index is a comparison method which takes the ratio of overlapping volume (intersection) and the total volume (union). This enables a method for comparison of two images. It is not the best method by any means. But, the cells move very very slowly between each frame, so it is mostly the same images with very little difference, which makes this method more than enough for this. – Donnerhorn Oct 31 '22 at 22:39
  • @juanpa.arrivillaga For the 3rd message: The numpy functions give the 3d image with "True" and "False" markings. So the division is the sum of all the true values. For the last answer: I didn't understand the suggestions. How could I improve the repeated work? – Donnerhorn Oct 31 '22 at 22:42
  • @JérômeRichard So you say I should do this in 2d with segments? Unfortunately it has to be done in Python. – Donnerhorn Oct 31 '22 at 22:44
  • @juanpa.arrivillaga sorry, I saw your second-to-last message just now. I have to compare every id in image 1 with every id in image 2. The result will be a 1500x1500 matrix with all the IoU data. So I don't really understand where the repeated iteration is? – Donnerhorn Oct 31 '22 at 22:49
  • Take a step back. Do you understand that when you do this: `for j in [y for y in image_2_ids if y != 0]:` inside of a loop, it is *repeated for every iteration*, just like anything inside of a loop? But `[y for y in image_2_ids if y != 0]` **only needs to be computed once**. The same principle applies to `image_1 == i` etc – juanpa.arrivillaga Oct 31 '22 at 22:57
  • Actually, I think there is a much bigger issue than performance. If `i != j` then `np.logical_and(image_1 == i, image_2 == j)` will be full of `False` since the sets should be disjoint (one thing cannot be equal to two different things at the same time). Thus `np.sum(intersection)` will be 0 and so will `iou`. If `i == j`, then `intersection == union` and so `iou` will be one. In the end, your algorithm just compares `i` and `j` and does not even need to read the 8 GiB `image_1`, `image_2`! This makes me think **the algorithm is certainly wrong**! – Jérôme Richard Nov 01 '22 at 11:34
  • @JérômeRichard Thank you for your answer. The ids are given through another software and are somewhat consistent but not equal. The same value does not mean the same object. And since these are cells, cells can divide as well, leading to more objects as time passes. So the same id does not mean iou 0. There are many 0's though. If no intersection between the objects exists, the intersection would be 0. But we need to know this as well, since it will be a matrix with lots of 0s and some values between 0.3-0.9 (from which the same object or a division can be guessed) – Donnerhorn Nov 01 '22 at 22:42
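
The loop hoisting suggested in the comments above can be sketched as follows. The function name and the mask cache are mine, not from the question; caching every `image_2` mask is only feasible when the images are small enough to afford the extra memory (a 2000x2000x2000 volume is not), so treat this purely as an illustration of removing the repeated work:

```python
import numpy as np

def iou_bruteforce_hoisted(image_1, image_2, image_1_ids, image_2_ids):
    """Same brute-force IoU as in the question, with repeated work hoisted."""
    ids_1 = [i for i in image_1_ids if i != 0]   # built once, not per outer iteration
    ids_2 = [j for j in image_2_ids if j != 0]
    # Cache each image_2 mask and its size once (trades memory for speed).
    masks_2 = {j: image_2 == j for j in ids_2}
    sizes_2 = {j: m.sum() for j, m in masks_2.items()}
    result = {}
    for i in ids_1:
        mask_1 = image_1 == i                    # one pass over image_1 per i, not per (i, j)
        size_1 = mask_1.sum()
        for j in ids_2:
            inter = np.logical_and(mask_1, masks_2[j]).sum()
            union = size_1 + sizes_2[j] - inter  # |A u B| = |A| + |B| - |A n B|
            result[i, j] = inter / union if union else 0.0
    return result
```

This is still O(n_ids_1 * n_ids_2) passes over the data for the intersections, so it only lowers the constant factors, as the comments note.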

1 Answer


A faster algorithm is possible by:

  • Exploiting (more) the fact that object ids can be used as indices
  • Calculating the union as union_ij = count_i + count_j - intersection_ij

The algorithm basically works like this:

import numpy as np

n_1 = np.max(image_1) + 1
n_2 = np.max(image_2) + 1

counts_1 = np.zeros(n_1, int)               # number of voxels per label in image_1
counts_2 = np.zeros(n_2, int)               # number of voxels per label in image_2
intersection = np.zeros((n_1, n_2), int)    # co-occurrence counts of label pairs
for i, j in zip(image_1.flat, image_2.flat):
    counts_1[i] += 1
    counts_2[j] += 1
    intersection[i, j] += 1

# |A u B| = |A| + |B| - |A n B|, broadcast over the full (n_1, n_2) table
union = counts_1[:, np.newaxis] + counts_2 - intersection

iou = intersection / union
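
On a tiny made-up 1-D example (labels chosen by me), the counting loop gives the expected per-pair IoU values:

```python
import numpy as np

image_1 = np.array([0, 1, 1, 2, 2, 2])
image_2 = np.array([0, 1, 2, 2, 2, 2])

n_1 = np.max(image_1) + 1
n_2 = np.max(image_2) + 1
counts_1 = np.zeros(n_1, int)
counts_2 = np.zeros(n_2, int)
intersection = np.zeros((n_1, n_2), int)
for i, j in zip(image_1.flat, image_2.flat):
    counts_1[i] += 1
    counts_2[j] += 1
    intersection[i, j] += 1

union = counts_1[:, np.newaxis] + counts_2 - intersection
iou = intersection / union

# Label 2 covers 3 voxels in image_1 and 4 in image_2, overlapping in 3,
# so iou[2, 2] == 3 / 4 == 0.75; likewise iou[1, 1] == 1 / 2 == 0.5.
```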

Once we understand how that works, the slow Python loop can be replaced: the per-label counts become np.bincount calls, and the pair counts become a bincount over linearized (i, j) index pairs built with np.ravel_multi_index. In code:

counts_1 = np.bincount(image_1.ravel('K'))
counts_2 = np.bincount(image_2.ravel('K'))

shape = (len(counts_1), len(counts_2))

# Encode each voxel's (label_1, label_2) pair as one linear index,
# then count pair occurrences in a single pass.
linear_ids = np.ravel_multi_index((image_1, image_2), shape)
intersection = np.bincount(linear_ids.ravel(), minlength=np.prod(shape))
intersection = intersection.reshape(shape)

union = counts_1[:, np.newaxis] + counts_2 - intersection

iou = np.zeros(shape, float)
np.divide(intersection, union, out=iou, where=(union != 0))  # avoid division by zero
iou[0, :] = iou[:, 0] = 0  # zero out the background label

This doesn't take image_1_ids and image_2_ids into account yet. It's a bit unclear what needs to happen there, but with the result from above you can probably do something like this:

iou = iou[image_1_ids[:, np.newaxis], image_2_ids]
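
Wrapped up as a function (the name iou_table is mine) and spot-checked against the brute-force definition from the question on a small random label volume:

```python
import numpy as np

def iou_table(image_1, image_2):
    """Full IoU table for two label images, via bincount as described above."""
    counts_1 = np.bincount(image_1.ravel())
    counts_2 = np.bincount(image_2.ravel())
    shape = (len(counts_1), len(counts_2))
    linear_ids = np.ravel_multi_index((image_1, image_2), shape)
    intersection = np.bincount(linear_ids.ravel(), minlength=np.prod(shape))
    intersection = intersection.reshape(shape)
    union = counts_1[:, np.newaxis] + counts_2 - intersection
    iou = np.zeros(shape, float)
    np.divide(intersection, union, out=iou, where=(union != 0))
    iou[0, :] = iou[:, 0] = 0
    return iou

rng = np.random.default_rng(0)
a = rng.integers(0, 6, size=(8, 8, 8))   # small stand-ins for the 3D images
b = rng.integers(0, 6, size=(8, 8, 8))
table = iou_table(a, b)

# Spot-check one pair against the question's logical_and / logical_or definition.
inter = np.logical_and(a == 3, b == 4).sum()
union = np.logical_or(a == 3, b == 4).sum()
assert np.isclose(table[3, 4], inter / union)
```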
user7138814
  • Thank you for your answer. This made the code much, much faster and I learned valuable information from here as well. However, the ids are important since the matching of the cells will be made in the image as well. So we need to know which ID belongs to which cell. While it did the calculation much, much faster, the ids are lost. For example, in the long calculation the structure of the iou pretty much matches the ids (so: 1 has the highest score with 1, 2 with 2 etc). There are exceptions of course, which are important as well since they can mean, for example, cell division. – Donnerhorn Nov 04 '22 at 10:59
  • Correction: The results are pretty much perfect. The first 2-3 rows/columns do not match, for some reason, with the manual review. But it could find almost everything correctly. The problem with the first rows not matching should be on my side. Thank you very much for your answer :) – Donnerhorn Nov 04 '22 at 16:37
  • @Donnerhorn You're welcome :) I guess the code did change a lot, with the image_1_ids handled only in the end. If not entirely correct then hopefully it can still be adapted to your use case. – user7138814 Nov 04 '22 at 21:39
  • It has very small differences and these were 2 time points out of 200 (each time point is one image). So it will take a long time to calculate anyway, but not forever like before. So, soon we will see if this small error is a deal breaker or not. In any case I think the majority of your code can be used for a correction of this problem, if any is needed. – Donnerhorn Nov 04 '22 at 23:20