144

The numpy docs recommend using array instead of matrix for working with matrices. However, unlike octave (which I was using till recently), * doesn't perform matrix multiplication, you need to use the function matrixmultipy(). I feel this makes the code very unreadable.

Does anybody share my views, and has found a solution?

doug
  • 69,080
  • 24
  • 165
  • 199
elexhobby
  • 2,588
  • 5
  • 24
  • 33
  • 8
    You're asking for opinions and not a question. Is there something more specific we could help you with or perhaps guide you in making it more readable? – wheaties Oct 08 '10 at 12:59
  • 2
    Actually the docs recommend using matrix if you do linear algebra and don't wan't to use multiply() so whats the problem? – Matti Pastell Oct 08 '10 at 13:13
  • 1
    I haven't gone through the docs in detail. Just curious, what advantages do arrays offer over the matrix class? I found that arrays do not differentiate between rows and columns. Is it because arrays are supposed to be thought of as tensors rather than matrices? As Joe pointed out, the fact that matrix class is 2-dim is quite limiting. What's the thinking behind this kind of design, as in, why not have a single matrix class like matlab/octave? – elexhobby Oct 10 '10 at 16:23
  • I guess the main issue is that python doesn't have `.*` vs '*' syntax for element wise vs matrix multiplication. If it had that then it would all be simpler though I'm surprised they choose `*` to mean element-wise and not matrix multiplication. – Charlie Parker Aug 14 '17 at 23:41

8 Answers8

134

The main reason to avoid using the matrix class is that a) it's inherently 2-dimensional, and b) there's additional overhead compared to a "normal" numpy array. If all you're doing is linear algebra, then by all means, feel free to use the matrix class... Personally I find it more trouble than it's worth, though.

For arrays (prior to Python 3.5), use dot instead of matrixmultiply.

E.g.

import numpy as np
x = np.arange(9).reshape((3,3))
y = np.arange(3)

print np.dot(x,y)

Or in newer versions of numpy, simply use x.dot(y)

Personally, I find it much more readable than the * operator implying matrix multiplication...

For arrays in Python 3.5, use x @ y.

Neil G
  • 32,138
  • 39
  • 156
  • 257
Joe Kington
  • 275,208
  • 71
  • 604
  • 463
  • 10
    Its unreadable when you have a stack of multiplications, for instance x'*A'*A*x. – elexhobby Oct 19 '10 at 08:15
  • 15
    @elexhobby - `x.T.dot(A.T).dot(A).dot(x)` isn't that unreadable, i.m.o. To each his own, though. If you're primarily doing matrix multiplication, then by all means, use `numpy.matrix`! – Joe Kington Oct 19 '10 at 13:32
  • 7
    By the way, why is matrix multiplication called "dot"? In what sense is it a dot product? – amcnabb Mar 14 '13 at 18:13
  • 8
    @amcnabb - Matrix multiplication is sometimes referred to as a "dot product" in textbooks (in those books, the dot product you're thinking of is called a "scalar product" or "scalar dot product"). The scalar dot product is just matrix multiplication of two vectors, after all, so using "dot" to mean matrix multiplication in general isn't much of a stretch. That particular notation seems (?) more common in engineering and science texts than in mathematics, at least in my experience. Its prevalence in numpy is mostly because `numpy.matrixmultiply` is tough to type. – Joe Kington Mar 15 '13 at 00:00
  • 1
    @amcnabb - Also, `matrixmultiply` was depreciated many years ago, and has been removed in any recent-ish (`>v1.3`, maybe?) version of numpy. – Joe Kington Mar 15 '13 at 01:53
  • @JoeKington - Is the scalar dot product really just matrix multiplication of two vectors? If the * operator is matrix multiplication, then `u.T * v` would be the scalar dot product, but `u * v` would be undefined (since the dimensions don't match). If matrix multiplication is defined as a sum over the last axis of the left array with the second-to-last axis of the right array, then I would assume that matrix multiplication of 1-D numpy arrays would fail because the right array does not have a second-to-last axis. Anyway, that's why I'm confused. – amcnabb Mar 15 '13 at 17:21
  • @amcnabb - `numpy.dot` is special-cased to behave as `inner` for 1d vectors. The inner product of two 1D vectors is strictly identical to the scalar dot product. (Incidentally, `u.T` and `u` are identical if `u` is one-dimensional. "Row vectors" and "column vectors" are 2D, and aren't vectors, strictly speaking. In numpy, vectors are vectors, and don't have a 2nd dimension, so you have to reshape them into 2D to have a row vector or column vector.) – Joe Kington Mar 15 '13 at 23:10
  • 1
    @JoeKington - I appreciate that `numpy.dot` behaves as an inner product for one-dimensional vectors, but this special casing makes the function seem self-inconsistent. Maybe I would be less confused if it were more common for books to refer to "dot" for matrix multiplication as you describe. Anyway, I understand it enough to use it, but I hate the obligation I feel to litter code with comments explaining that `numpy.dot` performs matrix multiplication. – amcnabb Mar 18 '13 at 19:17
  • 7
    @amcnabb the point is that dot [generalizes to arbitrary dimensionality](http://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html#numpy.dot) without ambiguity. It is this that makes `numpy.dot` equivalent to matrix multiplication. If you really dislike the notation, use the `matrix` class. – Henry Gomersall Apr 21 '13 at 07:35
  • 3
    Hope you don't mind I updated your answer for Python 3.5 – Neil G Jan 25 '15 at 13:57
  • 1
    For me dot also seems unnatural name, and it was confusing at first. – Scientist1642 Oct 07 '15 at 12:43
  • An important point to note here is * is element-wise multiplication, dot is the true matrix multiplication. Please see http://stackoverflow.com/a/18255635/1780570 – Minh Triet Dec 24 '15 at 00:18
  • @amcnabb It's true that the matrix multiplication and the scalar product between two vectors are performed very similarly; however, the scalar product between two vectors commute, while the the matrix multiplication do not. – HelloGoodbye Jun 21 '17 at 17:38
  • I guess the main issue is that python doesn't have `.*` vs '*' syntax for element wise vs matrix multiplication. If it had that then it would all be simpler though I'm surprised they choose `*` to mean element-wise and not matrix multiplication. – Charlie Parker Aug 14 '17 at 23:41
  • using `dot` for matrix multiplication is TERRIBLE notation since the vector dot product and matrix multiple are very different operations. For one instance, the dot product of two vectors is commutative, i.e. `a.dot(b) == b.dot(a)` (or at least should be per the definition of the dot product). In matrix multiplication `np.matmul(a,b) != np.matmul(b,a)` – schrödinbug Jul 10 '19 at 11:16
  • `[[1,1], [1,2]] @ [[75,0], [0,64]]`, unsupported operand type(s) for @: 'list' and 'list', python 3.9. – huang Mar 18 '22 at 10:20
83

the key things to know for operations on NumPy arrays versus operations on NumPy matrices are:

  • NumPy matrix is a subclass of NumPy array

  • NumPy array operations are element-wise (once broadcasting is accounted for)

  • NumPy matrix operations follow the ordinary rules of linear algebra

some code snippets to illustrate:

>>> from numpy import linalg as LA
>>> import numpy as NP

>>> a1 = NP.matrix("4 3 5; 6 7 8; 1 3 13; 7 21 9")
>>> a1
matrix([[ 4,  3,  5],
        [ 6,  7,  8],
        [ 1,  3, 13],
        [ 7, 21,  9]])

>>> a2 = NP.matrix("7 8 15; 5 3 11; 7 4 9; 6 15 4")
>>> a2
matrix([[ 7,  8, 15],
        [ 5,  3, 11],
        [ 7,  4,  9],
        [ 6, 15,  4]])

>>> a1.shape
(4, 3)

>>> a2.shape
(4, 3)

>>> a2t = a2.T
>>> a2t.shape
(3, 4)

>>> a1 * a2t         # same as NP.dot(a1, a2t) 
matrix([[127,  84,  85,  89],
        [218, 139, 142, 173],
        [226, 157, 136, 103],
        [352, 197, 214, 393]])

but this operations fails if these two NumPy matrices are converted to arrays:

>>> a1 = NP.array(a1)
>>> a2t = NP.array(a2t)

>>> a1 * a2t
Traceback (most recent call last):
   File "<pyshell#277>", line 1, in <module>
   a1 * a2t
   ValueError: operands could not be broadcast together with shapes (4,3) (3,4) 

though using the NP.dot syntax works with arrays; this operations works like matrix multiplication:

>> NP.dot(a1, a2t)
array([[127,  84,  85,  89],
       [218, 139, 142, 173],
       [226, 157, 136, 103],
       [352, 197, 214, 393]])

so do you ever need a NumPy matrix? ie, will a NumPy array suffice for linear algebra computation (provided you know the correct syntax, ie, NP.dot)?

the rule seems to be that if the arguments (arrays) have shapes (m x n) compatible with the a given linear algebra operation, then you are ok, otherwise, NumPy throws.

the only exception i have come across (there are likely others) is calculating matrix inverse.

below are snippets in which i have called a pure linear algebra operation (in fact, from Numpy's Linear Algebra module) and passed in a NumPy array

determinant of an array:

>>> m = NP.random.randint(0, 10, 16).reshape(4, 4)
>>> m
array([[6, 2, 5, 2],
       [8, 5, 1, 6],
       [5, 9, 7, 5],
       [0, 5, 6, 7]])

>>> type(m)
<type 'numpy.ndarray'>

>>> md = LA.det(m)
>>> md
1772.9999999999995

eigenvectors/eigenvalue pairs:

>>> LA.eig(m)
(array([ 19.703+0.j   ,   0.097+4.198j,   0.097-4.198j,   5.103+0.j   ]), 
array([[-0.374+0.j   , -0.091+0.278j, -0.091-0.278j, -0.574+0.j   ],
       [-0.446+0.j   ,  0.671+0.j   ,  0.671+0.j   , -0.084+0.j   ],
       [-0.654+0.j   , -0.239-0.476j, -0.239+0.476j, -0.181+0.j   ],
       [-0.484+0.j   , -0.387+0.178j, -0.387-0.178j,  0.794+0.j   ]]))

matrix norm:

>>>> LA.norm(m)
22.0227

qr factorization:

>>> LA.qr(a1)
(array([[ 0.5,  0.5,  0.5],
        [ 0.5,  0.5, -0.5],
        [ 0.5, -0.5,  0.5],
        [ 0.5, -0.5, -0.5]]), 
 array([[ 6.,  6.,  6.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]]))

matrix rank:

>>> m = NP.random.rand(40).reshape(8, 5)
>>> m
array([[ 0.545,  0.459,  0.601,  0.34 ,  0.778],
       [ 0.799,  0.047,  0.699,  0.907,  0.381],
       [ 0.004,  0.136,  0.819,  0.647,  0.892],
       [ 0.062,  0.389,  0.183,  0.289,  0.809],
       [ 0.539,  0.213,  0.805,  0.61 ,  0.677],
       [ 0.269,  0.071,  0.377,  0.25 ,  0.692],
       [ 0.274,  0.206,  0.655,  0.062,  0.229],
       [ 0.397,  0.115,  0.083,  0.19 ,  0.701]])
>>> LA.matrix_rank(m)
5

matrix condition:

>>> a1 = NP.random.randint(1, 10, 12).reshape(4, 3)
>>> LA.cond(a1)
5.7093446189400954

inversion requires a NumPy matrix though:

>>> a1 = NP.matrix(a1)
>>> type(a1)
<class 'numpy.matrixlib.defmatrix.matrix'>

>>> a1.I
matrix([[ 0.028,  0.028,  0.028,  0.028],
        [ 0.028,  0.028,  0.028,  0.028],
        [ 0.028,  0.028,  0.028,  0.028]])
>>> a1 = NP.array(a1)
>>> a1.I

Traceback (most recent call last):
   File "<pyshell#230>", line 1, in <module>
   a1.I
   AttributeError: 'numpy.ndarray' object has no attribute 'I'

but the Moore-Penrose pseudoinverse seems to works just fine

>>> LA.pinv(m)
matrix([[ 0.314,  0.407, -1.008, -0.553,  0.131,  0.373,  0.217,  0.785],
        [ 1.393,  0.084, -0.605,  1.777, -0.054, -1.658,  0.069, -1.203],
        [-0.042, -0.355,  0.494, -0.729,  0.292,  0.252,  1.079, -0.432],
        [-0.18 ,  1.068,  0.396,  0.895, -0.003, -0.896, -1.115, -0.666],
        [-0.224, -0.479,  0.303, -0.079, -0.066,  0.872, -0.175,  0.901]])

>>> m = NP.array(m)

>>> LA.pinv(m)
array([[ 0.314,  0.407, -1.008, -0.553,  0.131,  0.373,  0.217,  0.785],
       [ 1.393,  0.084, -0.605,  1.777, -0.054, -1.658,  0.069, -1.203],
       [-0.042, -0.355,  0.494, -0.729,  0.292,  0.252,  1.079, -0.432],
       [-0.18 ,  1.068,  0.396,  0.895, -0.003, -0.896, -1.115, -0.666],
       [-0.224, -0.479,  0.303, -0.079, -0.066,  0.872, -0.175,  0.901]])
doug
  • 69,080
  • 24
  • 165
  • 199
  • 3
    mInv = NP.linalg.inv(m) computes the inverse of an array – db1234 Jan 19 '15 at 16:27
  • An important point to note here is * is element-wise multiplication, dot is the true matrix multiplication. Please see http://stackoverflow.com/a/18255635/1780570 – Minh Triet Dec 24 '15 at 00:13
  • 1
    IMP note: numpy matrices are to be avoided in favor of arrays. Note from documentation --> "It is no longer recommended to use this class, even for linear algebra. Instead use regular arrays. The class may be removed in the future." See also https://stackoverflow.com/a/61156350/6043669 – HopeKing Jun 28 '20 at 07:39
22

In 3.5, Python finally got a matrix multiplication operator. The syntax is a @ b.

Petr Viktorin
  • 65,510
  • 9
  • 81
  • 81
  • 2
    Thanks! Yay, glad to see that I'm not the only one who feels that the current notation is unreadable. – elexhobby Aug 25 '14 at 17:32
15

There is a situation where the dot operator will give different answers when dealing with arrays as with dealing with matrices. For example, suppose the following:

>>> a=numpy.array([1, 2, 3])
>>> b=numpy.array([1, 2, 3])

Lets convert them into matrices:

>>> am=numpy.mat(a)
>>> bm=numpy.mat(b)

Now, we can see a different output for the two cases:

>>> print numpy.dot(a.T, b)
14
>>> print am.T*bm
[[1.  2.  3.]
 [2.  4.  6.]
 [3.  6.  9.]]
Jadiel de Armas
  • 8,405
  • 7
  • 46
  • 62
  • To be specific, * is element-wise multiplication, dot is the true matrix multiplication. Please see http://stackoverflow.com/a/18255635/1780570 – Minh Triet Dec 24 '15 at 00:16
  • That is because as a numpy array, a.T == a, the transpose doesnt do anything. – patapouf_ai Jul 14 '16 at 07:06
  • If you write at = np.array([[1],[2],[3]]), then numpy.dot(at,b) should give you same. The difference between matix and array is not in the dot but in the transpose. – patapouf_ai Jul 14 '16 at 07:08
  • Or actually, if you write a = numpy.array([[1,2,3]]) then a.T will really transpose and everything will work just like in matrices. – patapouf_ai Jul 14 '16 at 07:12
8

Reference from http://docs.scipy.org/doc/scipy/reference/tutorial/linalg.html

..., the use of the numpy.matrix class is discouraged, since it adds nothing that cannot be accomplished with 2D numpy.ndarray objects, and may lead to a confusion of which class is being used. For example,

>>> import numpy as np
>>> from scipy import linalg
>>> A = np.array([[1,2],[3,4]])
>>> A
    array([[1, 2],
           [3, 4]])
>>> linalg.inv(A)
array([[-2. ,  1. ],
      [ 1.5, -0.5]])
>>> b = np.array([[5,6]]) #2D array
>>> b
array([[5, 6]])
>>> b.T
array([[5],
      [6]])
>>> A*b #not matrix multiplication!
array([[ 5, 12],
      [15, 24]])
>>> A.dot(b.T) #matrix multiplication
array([[17],
      [39]])
>>> b = np.array([5,6]) #1D array
>>> b
array([5, 6])
>>> b.T  #not matrix transpose!
array([5, 6])
>>> A.dot(b)  #does not matter for multiplication
array([17, 39])

scipy.linalg operations can be applied equally to numpy.matrix or to 2D numpy.ndarray objects.

Yong Yang
  • 1,242
  • 12
  • 8
7

This trick could be what you are looking for. It is a kind of simple operator overload.

You can then use something like the suggested Infix class like this:

a = np.random.rand(3,4)
b = np.random.rand(4,3)
x = Infix(lambda x,y: np.dot(x,y))
c = a |x| b
Bitwise
  • 7,577
  • 6
  • 33
  • 50
5

A pertinent quote from PEP 465 - A dedicated infix operator for matrix multiplication , as mentioned by @petr-viktorin, clarifies the problem the OP was getting at:

[...] numpy provides two different types with different __mul__ methods. For numpy.ndarray objects, * performs elementwise multiplication, and matrix multiplication must use a function call (numpy.dot). For numpy.matrix objects, * performs matrix multiplication, and elementwise multiplication requires function syntax. Writing code using numpy.ndarray works fine. Writing code using numpy.matrix also works fine. But trouble begins as soon as we try to integrate these two pieces of code together. Code that expects an ndarray and gets a matrix, or vice-versa, may crash or return incorrect results

The introduction of the @ infix operator should help to unify and simplify python matrix code.

cod3monk3y
  • 9,508
  • 6
  • 39
  • 54
1

Function matmul (since numpy 1.10.1) works fine for both types and return result as a numpy matrix class:

import numpy as np

A = np.mat('1 2 3; 4 5 6; 7 8 9; 10 11 12')
B = np.array(np.mat('1 1 1 1; 1 1 1 1; 1 1 1 1'))
print (A, type(A))
print (B, type(B))

C = np.matmul(A, B)
print (C, type(C))

Output:

(matrix([[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9],
        [10, 11, 12]]), <class 'numpy.matrixlib.defmatrix.matrix'>)
(array([[1, 1, 1, 1],
       [1, 1, 1, 1],
       [1, 1, 1, 1]]), <type 'numpy.ndarray'>)
(matrix([[ 6,  6,  6,  6],
        [15, 15, 15, 15],
        [24, 24, 24, 24],
        [33, 33, 33, 33]]), <class 'numpy.matrixlib.defmatrix.matrix'>)

Since python 3.5 as mentioned early you also can use a new matrix multiplication operator @ like

C = A @ B

and get the same result as above.

Serenity
  • 35,289
  • 20
  • 120
  • 115