1

I'm trying to align a rectangular prism contained in a 3D numpy array to the principal axes of the array. Specifically, I need to accomplish this alignment by rotating the full array, as opposed to extracting the coordinates of the object and then rotating these points (this is because a perfect segmentation of the object from the background is unrealistic for my application).

I have a method that works under very specific circumstances, but it seems surprisingly sensitive given how general the approach is. In short, my method only works for very specific object orientations.

I'm looking for guidance on either (a) what's wrong with my approach or (b) a different approach that accomplishes the same objectives.

My approach (inspired by this post and this post):

from scipy.ndimage import affine_transform
from skimage import measure

def find_orientation(prism):
    # Calculate second-order moments
    M = measure.moments_central(prism, order=2)

    # Constructing the covariance matrix from the central moments
    cov_matrix = np.array(
        [
            [M[2, 0, 0], M[1, 1, 0], M[1, 0, 1]],
            [M[1, 1, 0], M[0, 2, 0], M[0, 1, 1]],
            [M[1, 0, 1], M[0, 1, 1], M[0, 0, 2]],
        ]
    )

    # Compute eigenvectors from the covariance matrix
    eigenvalues, eigenvectors = np.linalg.eig(cov_matrix)

    return eigenvectors


def rotate_to_align(prism):
    # Calculate orientation matrix using moments method
    orientation_matrix = find_orientation(prism)

    # Rotate the prism using the orientation matrix
    aligned_prism = affine_transform(prism, orientation_matrix.T, order=3)

    return aligned_prism


aligned_array = rotate_to_align(misaligned_array)

Please see this notebook for the full details (too long for SO): https://github.com/petermattia/3d-alignment/blob/main/prism_alignment.ipynb

Related but distinct questions/resources:

Thank you!

Pete
  • 504
  • 4
  • 15
  • ah, so this appears to be voxels. I doubt you'll get any traction with moments. they cannot tell you about "planes" (sides of the box) in your data. you might have to calculate gradients to get the plane normals of your box. – Christoph Rackwitz Sep 02 '23 at 10:18
  • I think the main issue here is that you use the second order moment tensor as an affine transform matrix. These are two different matrices, you can’t confuse the one for the other. You must find the eigenvectors of the moments matrix, and derive the proper rotation from them. Caveats as indicated by Christoph still apply: the second order moments don’t align with the axes of a cube (if that’s what’s actually in your image). – Cris Luengo Sep 03 '23 at 02:45

1 Answers1

1

In your notebook, you simply rotate around the origin. For severe rotations, that can move your content our of the domain you look at.

In your transformation, introduce translations from and to the center.


Ok so you have a volume, containing a solid box, oriented in some way. You were looking for the axes of that object.

Calculating axes from moments appears to be unstable. The longest axis through a box may be a diagonal. That does not appear to be what you want.

I propose working with the surface normals of each face of the object.

You can get those from clustering the gradients of the volume. That'll give you six stable clusters of surface normals. I've chosen Sobel because the included lowpass is more stable than straight np.gradient().

Pair up opposite vectors. Sort them sensibly to reduce surprising rotations. Then apply some more linear algebra to ensure you actually have vectors that are normal to each other. That need not be the case if the box happens to be a little sheared or just from errors due to finite arithmetic.

This is not perfect. Interpolation/sampling and finite arithmetic are issues.

# imports
import numpy as np
from numpy.linalg import norm, inv
import cv2 as cv
import matplotlib.pyplot as plt
import scipy.ndimage as ndi
# first bunch of utility functions

def normalize(vec):
    vec = np.asarray(vec)
    return vec / norm(vec)

def translate3(tx=0, ty=0, tz=0):
    T = np.eye(4)
    T[:3, 3] = tx, ty, tz
    return T

def rotate3_axis(axis, radians=None, degrees=None):
    axis = np.asarray(axis, dtype=np.float64)

    assert (radians is not None) ^ (degrees is not None)

    if degrees is not None:
        radians = np.deg2rad(degrees)

    if radians is not None:
        axis = normalize(axis) * radians
    
    R, jac = cv.Rodrigues(axis)

    T = np.eye(4)
    T[:3, :3] = R
    return T

def rotate3_R(R):
    T = np.eye(4)
    T[:3, :3] = R
    return T

def rotate3(axis, **kwargs):
    if isinstance(axis, np.ndarray) and axis.ndim == 2:
        return rotate3_R(R=axis)
    else:
        return rotate3_axis(axis, **kwargs)
        

def scale3(s=1, sx=1, sy=1, sz=1):
    return np.diag([s*sx, s*sy, s*sz, 1.0])

axis_names  = ['x',  'y',  'z']
plane_names = ['yz', 'xz', 'xy']

def show(volume, gradient=False):
    vmin = volume.min()
    vmax = volume.max()

    fig, axes = plt.subplots(figsize=(10, 10), nrows=1, ncols=3)

    for k, axis in enumerate(axes):
        axis_slice = np.take(volume, indices=volume.shape[k] // 2, axis=k)
        axis.set_title(f"{axis_names[k]}-axis, {plane_names[k]}-plane")
        axis.set_xlabel(plane_names[k][1])
        axis.set_ylabel(plane_names[k][0])
        axis.imshow(axis_slice, cmap="gray", vmin=vmin, vmax=vmax)
    plt.tight_layout()
    plt.show()
# phantom

vsize = np.array([50, 50, 50])

# box with different sizes in each dimension
box_size = np.array([40, 20, 10], dtype=int)
box_p0 = (vsize - box_size) // 2
box_p1 = box_p0 + box_size

volume = np.zeros(vsize, dtype=np.float32)

(bx0, by0, bz0), (bx1, by1, bz1) = box_p0, box_p1
volume[bx0:bx1, by0:by1, bz0:bz1] = 1

phantom

Tt = translate3(*(vsize/2))
Tr = rotate3(axis=(0,0,1), degrees=30)
T = Tt @ Tr @ inv(Tt)

warped = ndi.affine_transform(
    input=volume,
    matrix=T,
    # offset ignored
    output_shape=vsize,
    order=3,
    mode='nearest',
)

warped

# per-voxel gradient of the warped volume

def gradients(vol):
    gradx = np.gradient(vol, edge_order=2, axis=0)
    grady = np.gradient(vol, edge_order=2, axis=1)
    gradz = np.gradient(vol, edge_order=2, axis=2)
    return np.stack([gradx, grady, gradz], axis=-1)

def gradients_sobel(vol):
    gradx = ndi.sobel(vol, axis=0)
    grady = ndi.sobel(vol, axis=1)
    gradz = ndi.sobel(vol, axis=2)
    return np.stack([gradx, grady, gradz], axis=-1)

grad = gradients_sobel(warped)
mag = norm(grad, axis=-1)

mag

# cluster the gradient vectors
# seven clusters
# one per face, and background/zero

gradvectors = grad.reshape((-1, 3))

from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters=7, random_state=0, n_init=10).fit(gradvectors)
labels = kmeans.labels_
clusters = kmeans.cluster_centers_

print("clusters:")
print(clusters)

(hist, edges) = np.histogram(labels, bins=np.arange(kmeans.n_clusters+1))
print("counts:", hist)

#bg_label = np.argmax(hist) # most common
bg_label = np.argmin(norm(clusters, axis=1)) # closest to zero
print("background label:", bg_label)

face_vecs = np.delete(clusters, bg_label, axis=0)
#face_vecs /= norm(face_vecs, axis=1, keepdims=True)
print("foreground clusters:")
print(face_vecs)
clusters:
[[ -0.0007   -0.0002    0.     ]
 [ -0.0043    0.00083 -15.45364]
 [ -0.0043    0.00083  15.45364]
 [  6.14112  11.05985   0.     ]
 [ -6.13702 -11.10985   0.     ]
 [ 10.91968  -6.22587   0.     ]
 [-11.013     6.04535   0.     ]]
counts: [119556   1547   1547    792    778    392    388]
background label: 0
foreground clusters:
[[ -0.0043    0.00083 -15.45364]
 [ -0.0043    0.00083  15.45364]
 [  6.14112  11.05985   0.     ]
 [ -6.13702 -11.10985   0.     ]
 [ 10.91968  -6.22587   0.     ]
 [-11.013     6.04535   0.     ]]
# it should be three pairs of face vectors, where each pair is opposite faces
# check which vectors are a pair by checking the dot product

def find_opposite_faces(face_vecs):
    pairs = []

    for i, vec1 in enumerate(face_vecs):
        for j, vec2 in enumerate(face_vecs):
            if i >= j: continue
            dist = np.dot(normalize(vec1), normalize(vec2))
            if dist < -0.999:
                pairs.append((i, j))
                print(f"({i}, {j}) dist {dist}")

    return pairs

pairs = find_opposite_faces(face_vecs)
print(pairs)
(0, 1) dist -0.9999999403953552
(2, 3) dist -0.9999975562095642
(4, 5) dist -0.9998694658279419
[(0, 1), (2, 3), (4, 5)]
# pick one vector per pair which should be pointing positively (dot with [1,1,1] is positive)

axis_vecs = np.array([
    face_vecs[i1] if np.dot(face_vecs[i1], [1,1,1]) >= 0 else face_vecs[i2]
    for (i1, i2) in pairs
])

print(axis_vecs)
[[-0.0043   0.00083 15.45364]
 [ 6.14112 11.05985  0.     ]
 [10.91968 -6.22587  0.     ]]
# vectors are normal
# build basis from the three vectors
# for least amount of "accidental" rotation, find closest to X, Y, Z axes
# 1. find the one closest to X axis, remove
# 2. find the one closest to Y axis, remove
# 3. remaining one taken for Z axis

def find_closest(vecs, target):
    dists = np.dot(target, vecs.T)
    return np.argmax(dists)

remaining = [0, 1, 2]

index = find_closest(axis_vecs[remaining], [1,0,0])
x_axis = axis_vecs[remaining[index]]
remaining.pop(index)

index = find_closest(axis_vecs[remaining], [0,1,0])
y_axis = axis_vecs[remaining[index]]
remaining.pop(index)

z_axis = axis_vecs[remaining[0]]

basis = np.array([x_axis, y_axis, z_axis]).T
print(basis)
[[10.91968  6.14112 -0.0043 ]
 [-6.22587 11.05985  0.00083]
 [ 0.       0.      15.45364]]
# how close is this to the 30 degrees applied originally?
(rvec, jac) = cv.Rodrigues(basis)
print(np.rad2deg(norm(rvec)))
29.3647
# this basis may not be orthogonal, so we need to adjust it

# Modified Gram-Schmidt process, for numerical stability
# https://en.wikipedia.org/wiki/Gram%E2%80%93Schmidt_process

# does not optimize all vectors together. maintains first vector.

def modified_gram_schmidt(basis):
    """
    basis: NxN matrix of column vectors
    """
    basis = np.asarray(basis, dtype=np.float64)
    basis = basis.copy()
    M,N = basis.shape

    for i in range(N):
        basis[:,i] /= norm(basis[:,i])

        for j in range(i+1, N):
            basis[:,j] -= np.dot(basis[:,j], basis[:,i]) * basis[:,i]

    return basis

mgs_basis = modified_gram_schmidt(basis)
print(mgs_basis)
[[ 0.86872  0.4953  -0.     ]
 [-0.4953   0.86872  0.     ]
 [ 0.       0.       1.     ]]
# how close is this to the 30 degrees applied originally?
(rvec, jac) = cv.Rodrigues(mgs_basis)
print(np.rad2deg(norm(rvec)))
29.689681597833314
# now rotate the volume to align with the basis

Tt = translate3(*(vsize/2))
Tr = rotate3(mgs_basis)

T_unwarp = Tt @ Tr @ inv(Tt)

unwarped = ndi.affine_transform(
    input=warped,
    matrix=T_unwarp,
    # offset ignored
    output_shape=vsize,
    order=3,
    mode='nearest',
)

unwarped

The clustering of gradients can be handled in various ways. One could attempt to discard outliers and recalculate the centroids. One could attempt to normalize the vectors, once background gradient (zero) was identified and discarded.

One could even perform (part of) Marching Cubes and fit planes! I think that's enough for tonight.

Christoph Rackwitz
  • 11,317
  • 4
  • 27
  • 36