I'm multiplying two arrays of shape (3,2,2,2) and (2,2,2,2), which as far as I understand should multiply correctly.

np.random.randn(3,2,2,2)@np.random.randn(2,2,2,2)

Raises the error


ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (3,2,2,2)->(3,2,newaxis,newaxis) (2,2,2,2)->(2,2,newaxis,newaxis)  and requested shape (2,2)

Viewed as a 3x2 matrix times a 2x2 matrix, where each element is itself a 2x2 block, the multiplication should work, but it doesn't. What am I getting wrong?

Edit: using np.dot(np.random.randn(3,2,2,2),np.random.randn(2,2,2,2)) does result in a valid multiplication; however, the resulting shape is (3,2,2,2,2,2), which is not what I expected. Following conventional rules, the output shape should be (3,2,2,2).

  • Does this answer your question? [python numpy ValueError: operands could not be broadcast together with shapes](https://stackoverflow.com/questions/24560298/python-numpy-valueerror-operands-could-not-be-broadcast-together-with-shapes) – LoneWanderer Nov 23 '22 at 17:44
  • What shape do you want the result to be? `matmul/@` pairs the last 2 dimensions in the conventional matrix multiplication way - last dim of A with the 2nd to the last of B. Since they are both 2, that part's ok. But it uses `numpy broadcasting` rules for the first 2 dimensions. (3,2) does not broadcast with (2,2). – hpaulj Nov 23 '22 at 19:02
  • @hpaulj I'd expect the shape of the result to be `(3,2,2,2)` given conventional matrix rules. Note that `np.dot` does give me a valid result of shape `(3, 2, 2, 2, 2, 2)`, which is not expected. I see that you're explaining just that, but I don't think I understand it yet. – Manu Dwivedi Nov 24 '22 at 15:59
  • Have you read the `np.matmul` docs? `np.dot` handles the larger dimensions differently. It still does the sum-of-products with the 'last with 2nd-to-last' rule. You may want to demonstrate the calculation with loops, or with `np.einsum`. – hpaulj Nov 24 '22 at 16:10
  • Matrix multiplication performs a `sum-of-products` on one pair of dimensions. For 2d arrays, you (manually) run your finger across the columns of one array while going down the rows of the other; a (3,2) can multiply a (2,4) to produce a (3,4). `dot` and `matmul` extend that 2d rule to higher dimensions, but with different rules. `dot` in effect does an `outer` product, while `matmul` uses `broadcasting` rules. – hpaulj Nov 24 '22 at 17:09
  • Both `matmul` and `dot` are doing `np.dot(A[i,j,:,:], B[k,l,:,:])`, but with different ways of mixing the `i,j,k,l` dimensions. The `dot/matrix product` is done on the last 2 dimensions. – hpaulj Nov 24 '22 at 17:21
  • @hpaulj np.matmul performs np.dot under the hood as far as I remember which seems to be precise when I do matmul of the two matrices instead. It seems using np.einsum to do the matrix operation is the only way. Edit: You're correct, matmul uses broadcasting rules instead. So I could iterate through the dimensions to perform the intended multiplication instead? – Manu Dwivedi Nov 24 '22 at 20:07
  • There are no "conventional rules" for matrix multiplication involving 4d arrays. MATLAB doesn't allow that. `numpy` has extended the conventions, but you have to either follow its rules, or be explicit about what you want done. – hpaulj Nov 25 '22 at 17:03
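
The comments above can be made concrete with a minimal sketch. It assumes the intended operation is the block-matrix product `C[i,k] = sum_j A[i,j] @ B[j,k]`, which is one reading of the "3x2 times 2x2 of 2x2 blocks" description in the question; the names `A`, `B`, `C`, `C_loop`, `C_bcast` and `D` are illustrative only. It shows the explicit loops, an `np.einsum` equivalent, a `matmul` version that lines up the leading dimensions by hand, and how the `(3,2,2,2,2,2)` result of `np.dot` relates to them.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 2, 2, 2))  # read as a 3x2 grid of 2x2 blocks
B = rng.standard_normal((2, 2, 2, 2))  # read as a 2x2 grid of 2x2 blocks

# Explicit loops: C[i, k] = sum over j of the 2x2 product A[i, j] @ B[j, k]
C_loop = np.zeros((3, 2, 2, 2))
for i in range(3):
    for k in range(2):
        for j in range(2):
            C_loop[i, k] += A[i, j] @ B[j, k]

# The same contraction with einsum: sum over the block index j
# and over the inner matrix index b.
C = np.einsum("ijab,jkbc->ikac", A, B)

# matmul only broadcasts the leading dimensions, so align them manually:
# (3,2,1,2,2) @ (1,2,2,2,2) -> (3,2,2,2,2) indexed [i,j,k,a,c], then sum over j.
C_bcast = (A[:, :, None] @ B[None]).sum(axis=1)

# np.dot combines every leading index of A with every leading index of B:
# D[i,j,a,k,l,c] = sum over b of A[i,j,a,b] * B[k,l,b,c]
D = np.dot(A, B)

print(C.shape)        # (3, 2, 2, 2)
print(D.shape)        # (3, 2, 2, 2, 2, 2)
print(np.allclose(C, C_loop), np.allclose(C, C_bcast))
```

If a different pairing of the leading dimensions is wanted, only the `einsum` subscripts need to change, which is why being explicit tends to be the safer choice for 4d inputs.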