
I have a 3D medical image. I use scipy.ndimage.measurements.label to get the connected voxel groups, which is very fast. I also want the number of voxels in each label. I use the following code to count each value (suppose image_3d is the array returned by scipy.ndimage.measurements.label). It takes about 2 minutes.

import numpy as np
from skimage.measure import label
import time

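# Stand-in for the labeled volume: integer labels 0..99
image_3d = np.random.randint(100, size=(512, 512, 1024))
t1 = time.time()
# One full pass over the whole array per label value (100 passes in total)
pixel_count_list = [np.sum((image_3d == i).astype(int)) for i in range(100)]
t2 = time.time()
print("used time: ", t2-t1)
# 117 seconds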

Is there a more efficient way to get these counts?

Jingnan Jia
  • `np.unique` with `return_counts=True` – yatu Oct 10 '20 at 18:10
  • @yatu Excellent!!! I did not know `return_counts` parameter before! – Jingnan Jia Oct 10 '20 at 18:14
  • I'm glad you found a solution to your problem. However, an actual answer/solution should **not** be edited into your question. In general, you should [edit] the question to *clarify* it, but not to include an answer within it. You should create your own answer with the code/solution you used to solve your problem, and then accept it (the system may require a 48 hour delay prior to doing so). When you've solved the problem yourself, [answering your own question is encouraged](/help/self-answer). – double-beep Oct 10 '20 at 18:20
  • Bincount is O(n) while unique sorts, and is therefore O(n log n) – Mad Physicist Oct 11 '20 at 06:07
  • Research before asking is also encouraged. Also, you have a bunch of superfluous imports. – Mad Physicist Oct 11 '20 at 06:08
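
The `bincount` suggestion in the comments above can be sketched as follows. This is a minimal illustration, not code from the thread: it assumes the labels are non-negative integers (which `np.bincount` requires), and it uses a small random array named `labels` as a hypothetical stand-in for the real volume.

```python
import numpy as np

# Small stand-in for a labeled volume (assumption: non-negative integer labels)
rng = np.random.default_rng(0)
labels = rng.integers(0, 100, size=(64, 64, 64))

# np.bincount expects a 1-D array of non-negative integers;
# counts[i] is the number of voxels whose label equals i
counts = np.bincount(labels.ravel())

# Cross-check against np.unique, which returns the sorted labels and their counts
unique_label, count_list = np.unique(labels, return_counts=True)
print(np.array_equal(counts[unique_label], count_list))
```

Because `counts` is indexed directly by label value, labels that never occur simply get a count of zero, whereas `np.unique` omits them.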

1 Answer

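As suggested in the comments, np.unique with return_counts=True returns every label together with its voxel count in a single call:

import numpy as np
import time

# Stand-in for the labeled volume: integer labels 0..99
image_3d = np.random.randint(100, size=(512, 512, 1024))
t1 = time.time()

# unique_label[k] is a label value and count_list[k] is its voxel count
unique_label, count_list = np.unique(image_3d, return_counts=True)
t2 = time.time()
print("used time: ", t2-t1)
# 2 seconds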
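
To show how the two returned arrays line up, here is a small sketch on a hand-made array. The data and the name `voxels_per_label` are hypothetical and only serve the illustration.

```python
import numpy as np

# Tiny hand-made label array (hypothetical data)
labels = np.array([[0, 0, 1],
                   [2, 2, 1],
                   [2, 0, 0]])

unique_label, count_list = np.unique(labels, return_counts=True)

# Pair each label with its count; convert NumPy scalars to plain ints
voxels_per_label = {int(l): int(c) for l, c in zip(unique_label, count_list)}
print(voxels_per_label)  # {0: 4, 1: 2, 2: 3}
```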


Jingnan Jia