How to select numpy tensordot axes

Question

I have two numpy arrays of shape (436, 1024, 2). The last dimension (2) represents 2D vectors. I want to compare the 2D vector of the two numpy arrays element-wise in order to find the average angular error.

To do so I want to use the dot product which works perfectly fine when looping over the two first dimensions of the arrays (for loop in python can be slow). Therefore I would like to use a numpy function.

I found that np.tensordot allows to perform dot product element-wise. However, I do not succeed to use its axes argument:

import numpy as np

def average_angular_error_vec(estimated_oc : np.array, target_oc : np.array):
    estimated_oc = np.float64(estimated_oc)
    target_oc = np.float64(target_oc)

    norm1 = np.linalg.norm(estimated_oc, axis=2)
    norm2 = np.linalg.norm(target_oc, axis=2)
    norm1 = norm1[..., np.newaxis]
    norm2 = norm2[..., np.newaxis]

    unit_vector1 = np.divide(estimated_oc, norm1)
    unit_vector2 = np.divide(target_oc, norm2)

    dot_product = np.tensordot(unit_vector1, unit_vector2, axes=2)
    angle = np.arccos(dot_product)

    return np.mean(angle)

I have the following error :

ValueError: shape-mismatch for sum

Below is my function which computes the average angular error correctly:

def average_angular_error(estimated_oc : np.array, target_oc : np.array):
    h, w, c = target_oc.shape
    r = np.zeros((h, w), dtype="float64")

    estimated_oc = np.float64(estimated_oc)
    target_oc = np.float64(target_oc)

    for i in range(h):
        for j in range(w):

            unit_vector_1 = estimated_oc[i][j] / np.linalg.norm(estimated_oc[i][j])
            unit_vector_2 = target_oc[i][j] / np.linalg.norm(target_oc[i][j])
            dot_product = np.dot(unit_vector_1, unit_vector_2)

            angle = np.arccos(dot_product)

            r[i][j] = angle
       
    return np.mean(r)

[Possible duplicate](https://stackoverflow.com/questions/41870228/understanding-tensordot/41871402) — JavaiMaster, Oct 09 '20 at 16:09
By the way, the class is `np.ndarray`. `np.array` is a static constructor. If you want to use duck-typing, stick `x = np.asanyarray(x)` or so into the start of your functions. — Mad Physicist, Oct 09 '20 at 17:46
@JavaiMaster. The linked question is not a duplicate. It acutally explains why `tensordot` is definitely *not* the answer here. — Mad Physicist, Oct 09 '20 at 18:02
@JavaiMaster. That being said, I feel like I've answered something like this before. I'll try to find the dupe now that I've answered it :) — Mad Physicist, Oct 09 '20 at 18:03
@AlixL Where are you getting the error exactly? It looks more like a shape mismatch in the input than anything. — Mad Physicist, Oct 09 '20 at 18:05

Mad Physicist · Accepted Answer · 2020-10-09T23:00:56.840

The problem is likely much simpler than you are making it. If you apply np.tensordot to a pair of arrays of shape (w, h, 2) along the last axis, you will get a result of shape (w, h, w, h). This is not what you want. There are three simple approaches here. In addition to showing the options, I've shown a few tips and tricks for making the code simpler without changing any of the basic functionality:

Do the sum-reduction manually (using + and *):

def average_angular_error(estimated_oc : np.ndarray, target_oc : np.ndarray):
    # If you want to do in-place normalization, do x /= ... instead of x = x / ...
    estimated_oc = estimated_oc / np.linalg.norm(estimated_oc, axis=-1, keepdims=True)
    target_oc = target_oc / np.linalg.norm(target_oc, axis=-1, keepdims=True)
    # Use plain element-wise multiplication
    dots = np.sum(estimated_oc * target_oc, axis=-1)
    return np.arccos(dots).mean()

Use np.matmul (a.k.a. @) with properly broadcasted dimensions:
```
def average_angular_error(estimated_oc : np.ndarray, target_oc : np.ndarray):
    estimated_oc = estimated_oc / np.linalg.norm(estimated_oc, axis=-1, keepdims=True)
    target_oc = target_oc / np.linalg.norm(target_oc, axis=-1, keepdims=True)
    # Matrix multiplication needs two dimensions to operate on
    dots = estimated_oc[..., None, :] @ target_oc[..., :, None]
    return np.arccos(dots).mean()
```
np.matmul and np.dot both require the last dimension of the first array to match the second to last of the second, like with normal matrix multiplication. None is an alias for np.newaxis, which introduces a new axis of size 1 at the location of your choice. In this case, I made the first array (w, h, 1, 2) and the second (w, h, 2, 1). That ensures that the last two dimensions are multiplied as a transposed vector and a regular vector at every corresponding element.

Use np.einsum:

def average_angular_error(estimated_oc : np.ndarray, target_oc : np.ndarray):
    estimated_oc = estimated_oc / np.linalg.norm(estimated_oc, axis=-1, keepdims=True)
    target_oc = target_oc / np.linalg.norm(target_oc, axis=-1, keepdims=True)
    # Matrix multiplication needs two dimensions to operate on
    dots = np.einsum('ijk,ijk->ik', estimated_oc, target_oc)
    return np.arccos(dots).mean()

You can't use np.dot or np.tensordot for this. dot and tensordot keep the untouched dimensions of both arrays, as explained earlier. matmul broadcasts them together, which is what you want.

Indeed other options work very well and I found my way using np.einsum. However I still wonder how one could use np.tensordot to obtain the same result. Would you mind explaining the np.matmul option. I am not sure to understand what [..., None, :] means. Are you adding a dimension ? If yes, why using a column after ? — AlixL, Oct 09 '20 at 22:36
@AlixL. I've fleshed out #2 and added a little postscript to address your concerns. — Mad Physicist, Oct 09 '20 at 23:01

How to select numpy tensordot axes

1 Answers1