We have 3D segmentation masks where every class has its own label / ID. For every class we would like to fill holes in the segmentation.
For an example, the following matrix:
[
[
[ 1, 1, 1, 2, 2, 2 ],
[ 1, 1, 1, 2, 2, 2 ],
[ 1, 1, 1, 2, 2, 2 ],
[ 0, 3, 0, 0, 4, 0 ],
[ 3, 3, 3, 4, 0, 4 ],
[ 0, 3, 0, 0, 4, 0 ],
],
[
[ 1, 1, 1, 2, 2, 2 ],
[ 1, 0, 1, 2, 0, 0 ],
[ 1, 1, 1, 2, 2, 2 ],
[ 0, 3, 0, 0, 4, 0 ],
[ 3, 0, 3, 4, 0, 4 ],
[ 0, 3, 0, 0, 4, 0 ],
],
[
[ 1, 1, 1, 2, 2, 2 ],
[ 1, 1, 1, 2, 2, 2 ],
[ 1, 1, 1, 2, 2, 2 ],
[ 0, 3, 0, 0, 4, 0 ],
[ 3, 3, 3, 4, 4, 4 ],
[ 0, 3, 0, 0, 4, 0 ],
],
]
Should result in
[
[
[ 1, 1, 1, 2, 2, 2 ],
[ 1, 1, 1, 2, 2, 2 ],
[ 1, 1, 1, 2, 2, 2 ],
[ 0, 3, 0, 0, 4, 0 ],
[ 3, 3, 3, 4, 0, 4 ],
[ 0, 3, 0, 0, 4, 0 ],
],
[
[ 1, 1, 1, 2, 2, 2 ],
[ 1, 1, 1, 2, 0, 0 ],
[ 1, 1, 1, 2, 2, 2 ],
[ 0, 3, 0, 0, 4, 0 ],
[ 3, 3, 3, 4, 0, 4 ],
[ 0, 3, 0, 0, 4, 0 ],
],
[
[ 1, 1, 1, 2, 2, 2 ],
[ 1, 1, 1, 2, 2, 2 ],
[ 1, 1, 1, 2, 2, 2 ],
[ 0, 3, 0, 0, 4, 0 ],
[ 3, 3, 3, 4, 4, 4 ],
[ 0, 3, 0, 0, 4, 0 ],
],
]
The only filled holes are the 1 and 3 in the middle slice. The 2 shape is open to the side and the 4 is open to the back. The 0 between the classes should stay untouched.
I implemented 7 versions using the existing scipy.ndimage.morphology.binary_fill_holes function (or its implementation) and numpy
. Here the two best versions so far:
import numpy as np
from scipy.ndimage.morphology import binary_fill_holes, label, generate_binary_structure, binary_dilation
def fill_holes6(img: np.ndarray, applied_labels: np.ndarray) -> np.ndarray:
output = np.zeros_like(img)
for i in applied_labels:
output[binary_fill_holes(img == i)] = i
return output
def fill_holes7(img: np.ndarray, applied_labels: np.ndarray) -> np.ndarray:
output = np.zeros(img.shape, dtype=int)
for i in applied_labels:
tmp = np.zeros(img.shape, dtype=bool)
binary_dilation(tmp, structure=None, iterations=-1, mask=img != i, origin=0, border_value=1, output=tmp)
output[np.logical_not(tmp)] = i
return output
# EDIT: Added the following method:
def fill_holes8(img: np.ndarray, applied_labels: np.ndarray) -> np.ndarray:
connectivity = 1
footprint = generate_binary_structure(img.ndim, connectivity)
background_mask = img == 0
components, num_components = label(background_mask, structure=footprint)
filled_holes = np.zeros_like(img)
for component_label in range(1, num_components + 1):
component_mask = components == component_label
component_neighborhood = np.pad(img, 1, constant_values=-1)[binary_dilation(np.pad(component_mask, 1), structure=footprint)]
neighbor_labels = np.unique(component_neighborhood)
if len(neighbor_labels) == 2 and -1 not in neighbor_labels:
neighbor_label = neighbor_labels[1]
filled_holes[component_mask] = neighbor_label
return img + filled_holes
I measured the performance the following way (matching my real world data distribution):
import time
import pandas as pd
def measure(funs, t):
res = []
for _ in range(t):
ra = np.random.randint(10, 40)
sh = np.random.randint(200, 400, 3)
img = np.random.randint(0, ra, sh)
applied_labels = np.unique(img)[1:]
fun_res = []
for fun in funs:
start = time.time()
fun(img, applied_labels)
end = time.time()
fun_res.append(end - start)
res.append(fun_res)
return np.min(res, axis=0), np.max(res, axis=0), np.mean(res, axis=0), np.std(res, axis=0)
print(measure([fill_holes6, fill_holes7], t=10))
For my first implementations I got the following execution times (t=100):
fill_holes1 | fill_holes2 | fill_holes3 | |
---|---|---|---|
min | 6.4s | 6.9s | 6.2s |
max | 83.7s | 96.0s | 80.4s |
mean | 32.9s | 37.3s | 31.6s |
std | 17.3s | 20.1s | 16.5 |
This is very slow. The last implementation fill_holes7 is only 1.27 times faster than fill_holes3.
Is there a more performant way of doing this?
Opened a feature request on the scipy project first but was asked to go to stackoverflow first: https://github.com/scipy/scipy/issues/14504
EDIT:
I also opened a feature request on the MONAI project. See #2678
For this I opened a pull request with the iterative erosion solution (fill_holes7
).
You can find the documentation here: monai.transforms.FillHoles
During this I also implemented a connected component labeling (CCL) based version.
See the implementation in MONAI here.
I added fill_holes8
above, which is basically that implementation.
The MONAI package is happy for any pull request improving the performance of this method. Feel free to go there, open an issue and a pull request.