I don't think your question is unclear, but rather overly pedantic. For example, why are you puzzled by "sum product" in this nD by 1d case, when the docs use "inner product" for the 1d by 1d case, and "matrix product" for the 2d by 2d case? Give yourself some freedom to read it as "sum of the products", as done in the inner product.
To make your example clearer, make w rectangular, to better distinguish row actions from column actions:
In [168]: w=np.array([[1,2,3],[4,5,6]])
     ...: x=np.array([7,2,3])
In [169]: w.shape
Out[169]: (2, 3)
In [170]: x.shape
Out[170]: (3,)
The dot and its equivalent Einstein-summation notation:
In [171]: np.dot(w,x)
Out[171]: array([20, 56])
In [172]: np.einsum('ij,j->i',w,x)
Out[172]: array([20, 56])
The sum of the products is done over the repeated j dimension, with no summation over i.
We can do the same thing with broadcasted elementwise multiplication:
In [173]: (w*x[None,:]).sum(axis=1)
Out[173]: array([20, 56])
While this equivalent operation does use broadcasting, it's better not to think of dot in those terms.
matmul gives another description of the same action: add a dimension to x to form a 2d by 2d matrix product, then squeeze to remove the extra dimension. I don't think dot does that under the covers, but the result is the same. This may also be called matrix-vector multiplication, provided you don't insist on calling the 1d x a row vector or column vector.
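That matmul description can be sketched directly (a minimal check of the equivalence, not a claim about what dot does internally):

```python
import numpy as np

w = np.array([[1, 2, 3], [4, 5, 6]])
x = np.array([7, 2, 3])

# matmul accepts the 1d x directly and returns a 1d result:
direct = np.matmul(w, x)

# The "add a dimension, matrix-multiply, squeeze" steps it describes:
expanded = np.matmul(w, x[:, None])     # (2,3) @ (3,1) -> (2,1)
squeezed = expanded.squeeze(axis=1)     # (2,1) -> (2,)

print(direct)    # same values as np.dot(w, x)
print(squeezed)  # identical result by the expand/squeeze route
```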
Now for a 2d x, with shape (3,1):
In [175]: x2 = x[:,None]
In [176]: x2
Out[176]:
array([[7],
[2],
[3]])
In [177]: x2.shape
Out[177]: (3, 1)
In [178]: np.dot(w,x2)
Out[178]:
array([[20],
[56]])
In [179]: np.einsum('ij,jk->ik',w,x2)
Out[179]:
array([[20],
[56]])
The sum is over j, the last axis of w and the second-to-last axis of x2. To do the same elementwise, we have to use broadcasting to generate a 3d outer product, and then sum to reduce back to 2 dimensions.
In [180]: (w[:,:,None]*x2[None,:,:]).sum(axis=1)
Out[180]:
array([[20],
[56]])
In this example a (2,3) dot (3,1) => (2,1). That's perfectly normal matrix product behavior. In the first, (2,3) dot (3,) => (2,). To me this is a logical generalization. (3,) dot (3,) => scalar (as opposed to a () shaped array) is a bit more of a special case.
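To illustrate that 1d by 1d case (using the same x as above):

```python
import numpy as np

x = np.array([7, 2, 3])

# (3,) dot (3,) returns a 0-dimensional scalar, not a () shaped array:
result = np.dot(x, x)   # 7*7 + 2*2 + 3*3
print(result)           # 62
print(np.ndim(result))  # 0
```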
I suspect the first case is mainly a problem for people who see a (3,) shape and think (1,3), a row vector. (2,3) dot (1,3) doesn't work, because of the mismatch between the 3 and the 1.
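A quick sketch of that mismatch (the explicit row shape is my addition, to show why dot rejects it):

```python
import numpy as np

w = np.array([[1, 2, 3], [4, 5, 6]])
row = np.array([[7, 2, 3]])   # shape (1, 3): an explicit "row vector"

try:
    np.dot(w, row)            # (2,3) dot (1,3): inner dims 3 and 1 differ
except ValueError as e:
    print(e)

# Transposing aligns the shapes: (2,3) dot (3,1) => (2,1)
print(np.dot(w, row.T))
```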