Using np.transpose to make arrays broadcast

Question

I have a broadcasting error like the following:

ValueError: operands could not be broadcast together with shapes (84,36) (84,36,210,45)

Is there a way to get around this? I tried using np.transpose() to shift around the indices so that the broadcasting rules are obeyed but somehow the answer I get is no longer correct. What does np.transpose() really do here?

Do you know the basic `broadcasting` rules? How it adds leading dimensions to match the total number, followed by adjusting all size 1 dimensions to match? What are you trying to produce here? (Not that it probably matters, but what's the operator). (No, `transpose` won't help you). — hpaulj, Dec 05 '20 at 19:51
you can use 2 transpose operations, first to bring the broadcasting dimension to the last 2, as the case with the first array, and then transpose it back. That would be exactly the same as adding new axis to the first array to make it 4D — Akshay Sehgal, Dec 05 '20 at 21:23

Akshay Sehgal · Answer 1 · 2020-12-05T21:32:17.337

There are multiple ways you can do broadcasting.

Using transpose

The longer way is the one you are attempting with transpose. Since array a has only 2 dimensions, broadcasting lines them up with the last 2 dimensions of the other operand, so you move the first 2 dimensions of array b into the last 2 positions:

import numpy as np

a = np.random.random((84,36))
b = np.random.random((84,36,210,45))

c = b.transpose(2,3,0,1) + a  #(210, 45, 84, 36) + (84, 36)
c = c.transpose(2,3,0,1)      #transpose back to (84,36,210,45)
c.shape

(84, 36, 210, 45)

To clarify, b.transpose(2,3,0,1) reorders the axes of the 4D array so that the new axes are the old axes 2, 3, 0 and 1, in that order. The shape therefore goes from (84, 36, 210, 45) to (210, 45, 84, 36).
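
As a small illustration of what the axis permutation does, here is a sketch on a toy array (the shape (2, 3, 4, 5) is an assumption chosen for readability, not taken from the question). Element [i, j, k, l] of the transposed array is element [k, l, i, j] of the original, the result is a view rather than a copy, and applying (2, 3, 0, 1) twice restores the original axis order. For a general permutation, the inverse can be obtained with np.argsort.

```python
import numpy as np

# Toy array; the shape is illustrative only.
b = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)

# New axis 0 is old axis 2, new axis 1 is old axis 3, and so on.
bt = b.transpose(2, 3, 0, 1)
print(bt.shape)  # (4, 5, 2, 3)

# Same data, different index order: bt[i, j, k, l] == b[k, l, i, j].
assert bt[3, 4, 1, 2] == b[1, 2, 3, 4]

# transpose returns a view on the same memory, not a copy.
assert np.shares_memory(b, bt)

# (2, 3, 0, 1) is its own inverse, so applying it twice undoes it.
assert np.array_equal(bt.transpose(2, 3, 0, 1), b)

# A general permutation is undone by its argsort.
axes = (1, 2, 3, 0)
inverse = tuple(np.argsort(axes))
assert np.array_equal(b.transpose(axes).transpose(inverse), b)
```

If the result is not transposed back, the values are all still there but sit at different indices, which is one way an answer can end up looking wrong.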

Standard broadcasting by adding axes

The standard way, and the more useful one, is to add 2 trailing dimensions of size 1 to array a. Both arrays then share the first 2 dimensions, and the size-1 axes are stretched to match b.

c = a[..., None, None] + b #(84,36,1,1) + (84, 36, 210, 45)
c.shape

(84, 36, 210, 45)

To clarify, a[..., None, None] adds 2 new axes and turns the 2D array of shape (84, 36) into a 4D array of shape (84, 36, 1, 1).
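
For reference, here is a sketch of a few equivalent spellings for adding the two trailing axes. The variable names a1 to a4 are made up for this example, and passing a tuple to np.expand_dims assumes NumPy 1.18 or newer.

```python
import numpy as np

a = np.random.random((84, 36))

a1 = a[..., None, None]                   # Ellipsis keeps all existing axes
a2 = a[:, :, np.newaxis, np.newaxis]      # np.newaxis is an alias for None
a3 = np.expand_dims(a, axis=(2, 3))       # tuple of axes: NumPy >= 1.18 (assumption)
a4 = a.reshape(84, 36, 1, 1)              # explicit reshape

for arr in (a1, a2, a3, a4):
    print(arr.shape)  # (84, 36, 1, 1)
```

All four give the same shape and values, so the choice is mostly a matter of readability.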

Finally, just to prove that both methods are equivalent you can check like this -

np.all((b.transpose(2,3,0,1) + a).transpose(2,3,0,1) == a[...,None, None] + b)

True

Benchmarks on larger arrays -

Transpose method - 1.88 s ± 977 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Standard broadcasting - 1.25 s ± 156 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

However, I noticed something interesting: when the broadcasting dimensions are large, the standard method gives a better speedup. But when the non-broadcasting dimensions of b are the larger ones, the transpose method seems to be a bit faster than simple broadcasting! I'll analyze this a bit and update my answer, but I definitely found something new to learn here :)
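
The benchmark code itself is not shown above, so the following is only a sketch of how such a comparison could be set up with timeit. The function names, the reduced shape of b and the repeat count are assumptions made for this example. Timings depend on the machine and on the array shapes, so none are quoted here.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((84, 36))
b = rng.random((84, 36, 21, 5))  # assumption: smaller than the question's array to keep this quick

def transpose_method(a, b):
    # Move b's first two axes to the end, add, then move them back.
    return (b.transpose(2, 3, 0, 1) + a).transpose(2, 3, 0, 1)

def new_axes_method(a, b):
    # Give a two trailing size-1 axes so it lines up with b.
    return a[..., None, None] + b

# Both approaches should produce identical results.
assert np.array_equal(transpose_method(a, b), new_axes_method(a, b))

repeats = 20
for fn in (transpose_method, new_axes_method):
    seconds = timeit.timeit(lambda: fn(a, b), number=repeats)
    print(f"{fn.__name__}: {seconds / repeats * 1e3:.3f} ms per call")
```

Changing which dimensions of b are large should make it possible to check the observation above on your own hardware.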

Which one is better?

I have faced some situations where, due to the nature of the problem, it was necessary to use both methods (e.g. in this bounty I needed to use both). I would advise, however, focusing on the standard method as it's far more versatile. I will try to comment on the performance of both in a later edit.

score 0 · Answer 2 · answered Dec 05 '20 at 20:45

In [341]: x = np.ones((3,4))
In [342]: y = np.ones((3,4,2,2))

Attempting a broadcasting operation:

In [343]: x+y
Traceback (most recent call last):
  File "<ipython-input-343-259706549f3d>", line 1, in <module>
    x+y
ValueError: operands could not be broadcast together with shapes (3,4) (3,4,2,2)

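Broadcasting can only add leading dimensions, so (3,4) is treated as (1,1,3,4) to make it 4d like (3,4,2,2). But those two shapes don't work together: the last two dimensions are 3 vs 2 and 4 vs 2, which are neither equal nor 1.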
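
To make the rule concrete, here is a minimal pure-Python sketch of the shape check. broadcast_shape is a hypothetical helper written for this example, not a NumPy function. It pads the shorter shape with leading 1s and then requires each pair of dimensions to be equal or to contain a 1.

```python
from itertools import zip_longest

def broadcast_shape(shape1, shape2):
    """Hypothetical helper: return the broadcast shape or raise ValueError."""
    result = []
    # Walk from the last dimension backwards; a missing dimension counts as 1,
    # which is the same as padding the shorter shape with leading 1s.
    for d1, d2 in zip_longest(reversed(shape1), reversed(shape2), fillvalue=1):
        if d1 == d2 or d2 == 1:
            result.append(d1)
        elif d1 == 1:
            result.append(d2)
        else:
            raise ValueError(f"cannot broadcast {shape1} with {shape2}: {d1} vs {d2}")
    return tuple(reversed(result))

print(broadcast_shape((3, 4), (2, 2, 3, 4)))        # (2, 2, 3, 4)
print(broadcast_shape((3, 4, 1, 1), (3, 4, 2, 2)))  # (3, 4, 2, 2)

try:
    broadcast_shape((3, 4), (3, 4, 2, 2))
except ValueError as err:
    print("not broadcastable:", err)
```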

Instead we need to add dimensions to x:

In [344]: x[:,:,None,None].shape
Out[344]: (3, 4, 1, 1)

Now it works with (3,4,2,2):

In [346]: c=x[:,:,None,None]+y
In [347]: c.shape
Out[347]: (3, 4, 2, 2)

Read the broadcasting docs in my comment.
