
I am learning NumPy and I am not sure what the * operator actually does. It seems to be some form of multiplication, but I am not sure how the result is determined. From IPython:

In [1]: import numpy as np

In [2]: a=np.array([[1,2,3]])

In [3]: b=np.array([[4],[5],[6]])

In [4]: a*b
Out[4]: 
array([[ 4,  8, 12],
       [ 5, 10, 15],
       [ 6, 12, 18]])

In [5]: b*a
Out[5]: 
array([[ 4,  8, 12],
       [ 5, 10, 15],
       [ 6, 12, 18]])

In [6]: b.dot(a)
Out[6]: 
array([[ 4,  8, 12],
       [ 5, 10, 15],
       [ 6, 12, 18]])

In [7]: a.dot(b)
Out[7]: array([[32]])

It seems like it is doing matrix multiplication, but only b multiplied by a, not the other way around. What is going on?

Karel Bílek

2 Answers


It's a little bit complicated and has to do with the concept of broadcasting and the fact that NumPy's basic arithmetic operations are element-wise.

  1. a is a 2D array with 1 row and 3 columns, and b is a 2D array with 3 rows and 1 column.
  2. If you multiply them with a * b, NumPy works element by element (the basic arithmetic operators are element-wise; dot is the exception), so it must first broadcast the arrays so that they match in all their dimensions.
  3. Since the first array is 1x3 and the second is 3x1, they can be broadcast to a 3x3 matrix according to the broadcasting rules. They will look like:
a = [[1, 2, 3],
     [1, 2, 3],
     [1, 2, 3]]

b = [[4, 4, 4],
     [5, 5, 5],
     [6, 6, 6]]

And now NumPy can multiply them element by element, giving you the result:

[[ 4,  8, 12],
 [ 5, 10, 15],
 [ 6, 12, 18]]

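When you use .dot, NumPy performs standard matrix multiplication instead, so the order of the operands matters: b.dot(a) is a (3x1) by (1x3) product, which gives a 3x3 matrix that happens to equal the broadcast result, while a.dot(b) is a (1x3) by (3x1) product, which gives a 1x1 matrix. More in the docs.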
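
To see the difference in shapes, here is a small sketch using the same a and b. The @ operator is not in the original question; it is the matrix-multiplication operator available on Python 3.5+ with a reasonably recent NumPy, shown only for comparison:

```python
import numpy as np

a = np.array([[1, 2, 3]])        # shape (1, 3)
b = np.array([[4], [5], [6]])    # shape (3, 1)

inner = a.dot(b)   # (1, 3) x (3, 1) -> (1, 1)
outer = b.dot(a)   # (3, 1) x (1, 3) -> (3, 3)
print(inner.shape, outer.shape)

# The 3x3 matrix product coincides with the broadcast element-wise product.
print(np.array_equal(outer, a * b))  # True

# a @ b is matrix multiplication, unlike a * b.
print(np.array_equal(a @ b, inner))  # True
```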

Glorfindel
Viktor Kerkez

* does elementwise multiplication.

Since the arrays are of different shapes, broadcasting rules will be applied.

In [5]: a.shape
Out[5]: (1, 3)

In [6]: b.shape
Out[6]: (3, 1)

In [7]: (a * b).shape
Out[7]: (3, 3)

The broadcasting rules, as stated in the NumPy documentation:

  1. All input arrays with ndim smaller than the input array of largest ndim, have 1’s prepended to their shapes (does not apply here).
  2. The size in each dimension of the output shape is the maximum of all the input sizes in that dimension.
  3. An input can be used in the calculation if its size in a particular dimension either matches the output size in that dimension, or has value exactly 1.
  4. If an input has a dimension size of 1 in its shape, the first data entry in that dimension will be used for all calculations along that dimension. In other words, the stepping machinery of the ufunc will simply not step along that dimension (the stride will be 0 for that dimension).
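
The first three rules can be checked with a few small arrays. This is only a sketch: v and the (2, 3) and (3, 2) arrays are made-up examples, not part of the question.

```python
import numpy as np

a = np.array([[1, 2, 3]])        # shape (1, 3)
b = np.array([[4], [5], [6]])    # shape (3, 1)

# Rule 2: the output shape takes the maximum size in each dimension.
print(np.broadcast(a, b).shape)  # (3, 3)

# Rule 1: a 1-D array of shape (3,) is treated as if it had shape (1, 3).
v = np.array([1, 2, 3])
print((v * b).shape)  # (3, 3)

# Rule 3: sizes 2 and 3 in the same dimension differ and neither is 1,
# so these two arrays cannot be broadcast together.
try:
    np.ones((2, 3)) * np.ones((3, 2))
except ValueError as exc:
    print("not broadcastable:", exc)
```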

So the resulting shape must be (3, 3) (the maximum of a's and b's sizes in each dimension), and while performing the multiplication NumPy will not step through a's first dimension or b's second dimension (their sizes are 1).
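
The "not stepping" part is visible in the strides of the broadcast views. A minimal sketch, assuming np.broadcast_to is available (it is in current NumPy releases); a_view and b_view are illustrative names:

```python
import numpy as np

a = np.array([[1, 2, 3]])        # shape (1, 3)
b = np.array([[4], [5], [6]])    # shape (3, 1)

# Read-only views of a and b stretched to the output shape.
a_view = np.broadcast_to(a, (3, 3))
b_view = np.broadcast_to(b, (3, 3))

# A stride of 0 means moving along that dimension does not move in memory.
print(a_view.strides)  # first entry is 0: a is not stepped along dimension 0
print(b_view.strides)  # second entry is 0: b is not stepped along dimension 1
```

The nonzero stride values depend on the integer size of the platform, so only the zeros are significant here.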

The result's [i][j] element is equal to the product of the [i][j] elements of the broadcast a and b.

(a * b)[0][0] == a[0][0] * b[0][0]
(a * b)[0][1] == a[0][1] * b[0][0]  # (not stepping through b's second dimension)
(a * b)[0][2] == a[0][2] * b[0][0]
(a * b)[1][0] == a[0][0] * b[1][0]  # (not stepping through a's first dimension)

etc.
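
The same pattern can be checked for every index pair at once. A short sketch with the a and b from the question (product and all_match are illustrative names):

```python
import numpy as np

a = np.array([[1, 2, 3]])        # shape (1, 3)
b = np.array([[4], [5], [6]])    # shape (3, 1)
product = a * b

# a always uses row 0 (its only row); b always uses column 0 (its only column).
all_match = all(
    product[i][j] == a[0][j] * b[i][0]
    for i in range(3)
    for j in range(3)
)
print(all_match)  # True
```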
Pavel Anossov
  • Oh. That broadcasting thing is very confusing. – Karel Bílek Aug 17 '13 at 22:43
  • It is at first, but really it boils down to this: If one array has fewer dimensions than the other, throw another pair of brackets around it until the number of dimensions is equal. If there is only one element in any dimension, use it for all indices in that dimension. If there is more than one element and there is a different number of elements in that dimension in the other array, nothing can be done and an error is thrown. – Pavel Anossov Aug 17 '13 at 22:49
  • Hehe, any good idea how to write 2. correctly? It can be 0... – seberg Aug 17 '13 at 23:18