
I needed to compute Q^N for many values of N (between 1 and 10000), and Numpy was a bit too slow.

I asked on math.stackexchange.com whether I could avoid computing Q^N for my specific need, and someone answered that computing Q^N should be quite fast using the P D^N P^-1 method.

So basically, instead of doing:

import numpy as np
from numpy import linalg as LA
...
LA.matrix_power(m, N)

I tried:

diag, P = LA.eig(m)
DN = np.diag(diag**N)
P1 = LA.inv(P)

P*DN*P1   # m is an np.matrix here, so * is matrix multiplication

And I obtained the same matrix as a result (tried on a single example).
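To check the agreement more systematically than on a single example, one can compare the two methods with `np.allclose`. This is a sketch; the 3×3 matrix is a hypothetical stand-in for `m`:

```python
import numpy as np
from numpy import linalg as LA

# Hypothetical stand-in for m: upper triangular with distinct eigenvalues,
# rows summing to 1 (like a transition matrix).
m = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.5, 0.5],
              [0.0, 0.0, 1.0]])
N = 50

reference = LA.matrix_power(m, N)

diag, P = LA.eig(m)
DN = np.diag(diag**N)
approx = P.dot(DN).dot(LA.inv(P))

# The two results should agree up to floating-point error.
print(np.allclose(reference, np.real(approx)))  # True
```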

On a more complex matrix, Q:

% timeit.Timer('Q**10000', setup=setup).repeat(2, 100)
[5.87254786491394, 5.863131046295166]

% timeit.Timer('diag, P = linalg.eig(Q); DN=np.diag(diag**10000);P1=linalg.inv(P); P*DN*P1', setup=setup).repeat(2, 100)
[2.0032401084899902, 2.018735885620117]

And regarding my initial problem, the second method allows me to compute P, diag and P1 only once and reuse them thousands of times. Overall it's 8 times faster this way.
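The reuse pattern looks roughly like this (a sketch; `Q` here is a small hypothetical placeholder for the real matrix):

```python
import numpy as np

# Hypothetical placeholder for Q.
Q = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.3, 0.7],
              [0.0, 0.0, 1.0]])

# Decompose once...
diag, P = np.linalg.eig(Q)
P1 = np.linalg.inv(P)

# ...then each power costs only an elementwise power of the eigenvalues
# plus two matrix products.
def q_power(N):
    return P.dot(np.diag(diag**N)).dot(P1)

powers = {N: q_power(N) for N in range(1, 10001)}
```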

My questions are:

  • In which cases is it not possible to use this last method to compute Q^N?
  • Is it fine to use it in my case (matrix Q as given here)?
  • Is there a function in numpy that already does this?

Edit:

  • It appears that for another matrix, P is not invertible. So I am adding a new question: how can I change the matrix P so that it becomes invertible without altering the resulting matrix too much? It's OK if the values are close to the real result; by close I mean within ~0.0001.
Maxime Chéramy
  • Regarding my first 2 questions, I guess Q must not be defective. But I don't know whether my matrices are defective or not (my math background is too far behind). – Maxime Chéramy Sep 20 '13 at 15:16
  • You could further speed this up by doing `diag**10000` using the exponentiation by squaring method. See [my answer to another question](http://stackoverflow.com/a/18453999/15055) where I implement it in numpy. – Claudiu Sep 20 '13 at 15:18
  • @Claudiu wow. I naively thought that diag**10000 would use the squaring method! However, in my case, I might even use the possibility of passing floating-point exponents anyway. That's also something I could not do with LA.matrix_power. – Maxime Chéramy Sep 20 '13 at 15:22
  • You can do this as long as the matrix is diagonalizable. If the matrix Q is real and symmetric, you will always be able to perform this. Bonne chance, Maxime. – Wok Sep 20 '13 at 15:41
  • I agree, wok; as I said before, I don't know whether my matrices are defective or not (Wikipedia says: "A square matrix which is not diagonalizable is called defective."). The matrix Q is not symmetric (it is triangular). Maybe I should do something with the determinant of the matrix; I don't know, I need to dig into mathematical theorems I guess. Merci ;). – Maxime Chéramy Sep 20 '13 at 16:44
  • I know it is not what you asked, but if you want to do this stuff often, consider using Matlab. It is rather optimized towards matrix multiplication. – Dennis Jaheruddin Sep 23 '13 at 16:17
  • @DennisJaheruddin I intend to compute this into a simulation tool written in Python, it is not possible to do it with an external tool. – Maxime Chéramy Sep 23 '13 at 16:25

3 Answers


You have already figured out that your eigenvalues will be (0, a, b, c, ..., 1). Let me rename your parameters, so that the eigenvalues are (0, e1, e2, e3, ..., 1). To find the eigenvector (v0, v1, v2, ..., v(n-1)) corresponding to eigenvalue ej, you have to solve the system of equations:

v1                    = v0*ej
v1*e1 + v2*(1-e1)     = v1*ej
v2*e2 + v3*(1-e2)     = v2*ej
...
vj*ej + v(j+1)*(1-ej) = vj*ej
...
v(n-1)                = v(n-1)*ej

It is more or less clear that if all your ei are distinct, and none is equal to 0 or 1, then the solution is always well defined, and the eigenvector for ej has only its leading components non-zero (one more than for the previous eigenvalue) and the rest equal to zero. This triangular pattern guarantees that no eigenvector is a linear combination of the others, and hence that the eigenvector matrix is invertible.

The problem arises when one of your ei is 0, is 1, or is repeated. I haven't been able to come up with a proof, but experimenting with the following code, it seems you only need to worry if any two of your ei are equal and different from 1:

>>> def make_mat(values):
...     n = len(values) + 2
...     main_diag = np.concatenate(([0], values, [1]))
...     up_diag = 1 - np.concatenate(([0], values))
...     return np.diag(main_diag) + np.diag(up_diag, k=1)
>>> make_mat([4,5,6])
array([[ 0,  1,  0,  0,  0],
       [ 0,  4, -3,  0,  0],
       [ 0,  0,  5, -4,  0],
       [ 0,  0,  0,  6, -5],
       [ 0,  0,  0,  0,  1]])
>>> a, b = np.linalg.eig(make_mat([4,5,6]))
>>> a
array([ 0.,  4.,  5.,  6.,  1.])
>>> b
array([[ 1.        ,  0.24253563, -0.18641093,  0.13608276,  0.4472136 ],
       [ 0.        ,  0.9701425 , -0.93205465,  0.81649658,  0.4472136 ],
       [ 0.        ,  0.        ,  0.31068488, -0.54433105,  0.4472136 ],
       [ 0.        ,  0.        ,  0.        ,  0.13608276,  0.4472136 ],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  0.4472136 ]])

And now for some test cases:

>>> a, b = np.linalg.eig(make_mat([1,0,3])) # having a 0 or 1 is OK
>>> b
array([[ 1.        ,  0.70710678,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.70710678,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  1.        ,  0.31622777,  0.57735027],
       [ 0.        ,  0.        ,  0.        ,  0.9486833 ,  0.57735027],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  0.57735027]])
>>> a, b = np.linalg.eig(make_mat([1,1,3])) # repeating 1 is OK
>>> b
array([[ 1.        ,  0.70710678,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.70710678,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  1.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  1.        ,  0.70710678],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  0.70710678]])
>>> a, b = np.linalg.eig(make_mat([0,0,3])) # repeating 0 is not OK
>>> np.round(b, 3)
array([[ 1.   , -1.   ,  1.   ,  0.035,  0.447],
       [ 0.   ,  0.   ,  0.   ,  0.105,  0.447],
       [ 0.   ,  0.   ,  0.   ,  0.314,  0.447],
       [ 0.   ,  0.   ,  0.   ,  0.943,  0.447],
       [ 0.   ,  0.   ,  0.   ,  0.   ,  0.447]])
>>> a, b = np.linalg.eig(make_mat([2,3,3])) # repeating other values are not OK
>>> np.round(b, 3)
array([[ 1.   ,  0.447, -0.229, -0.229,  0.447],
       [ 0.   ,  0.894, -0.688, -0.688,  0.447],
       [ 0.   ,  0.   ,  0.688,  0.688,  0.447],
       [ 0.   ,  0.   ,  0.   ,  0.   ,  0.447],
       [ 0.   ,  0.   ,  0.   ,  0.   ,  0.447]])
Jaime
  • I am currently reading your answer, thank you. But just a quick comment: the sum of each row of my matrix is 1 (because it is a transition matrix of a Markov chain). – Maxime Chéramy Sep 20 '13 at 17:41
  • Yes, that is taken into account. It is the `a`, `b`, `c`... that shouldn't be equal to each other, unless they are equal to `1`. – Jaime Sep 20 '13 at 17:53
  • Ah ah, indeed :). But a, b, c, etc. are in fact probabilities. That's what I should have said, sorry. – Maxime Chéramy Sep 20 '13 at 17:55

I am partially answering my question:

According to the source code, Numpy uses exponentiation by squaring:

# From numpy's matrix_power (`N` here is the numpy module, `n` the exponent):
# binary decomposition to reduce the number of Matrix
# multiplications for n > 3.
beta = binary_repr(n)
Z, q, t = M, 0, len(beta)
while beta[t-q-1] == '0':
    Z = N.dot(Z, Z)
    q += 1
result = Z
for k in range(q+1, t):
    Z = N.dot(Z, Z)
    if beta[t-k-1] == '1':
        result = N.dot(result, Z)
return result
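For reference, here is a self-contained square-and-multiply implementation of the same idea (the function name is mine, not numpy's):

```python
import numpy as np

def matrix_power_by_squaring(M, n):
    """Compute M**n (n >= 0) by binary decomposition of the exponent:
    O(log n) matrix products instead of n - 1."""
    result = np.eye(M.shape[0])
    Z = M
    while n > 0:
        if n & 1:              # this bit of the exponent is set
            result = result.dot(Z)
        Z = Z.dot(Z)           # square for the next bit
        n >>= 1
    return result
```

Even so, for a single large n on a diagonalizable matrix, the eigendecomposition route wins, since `diag**N` is elementwise.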

This is slower in my case, when n is large, than computing the eigenvalues and eigenvectors once and computing M^N as P D^N P^-1.

Now, regarding my questions:

In which cases is it not possible to use this last method to compute Q^N?

When some eigenvalues are equal, it will not be possible to invert P. Someone suggested adding this approach to Numpy on the issue tracker. The answer was: "Your approach is only valid for non-defective dense matrices."

Is it fine to use it in my case (matrix Q as given here)?

Not always: I might have several equal eigenvalues.

Is there a function in numpy that already does this?

I think it is in SciPy: https://github.com/scipy/scipy/blob/v0.12.0/scipy/linalg/matfuncs.py#L57

So we can also do this:

LA.expm(n*LA.logm(m))   # here LA is scipy.linalg, which provides expm and logm

to compute m^n.
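Note that `expm` and `logm` live in `scipy.linalg`, not `numpy.linalg`. A sketch (the 2×2 matrix is a hypothetical example; `logm` needs a matrix with a well-defined logarithm, so an eigenvalue of 0 would be a problem):

```python
import numpy as np
from scipy import linalg as sla

# Hypothetical example with positive real eigenvalues (1.0 and 0.7),
# so logm is well defined.
m = np.array([[0.9, 0.1],
              [0.2, 0.8]])
n = 10

via_log = sla.expm(n * sla.logm(m))
direct = np.linalg.matrix_power(m, n)
print(np.allclose(via_log, direct))  # True
```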

How can I change the matrix P so it becomes invertible but the resulting matrix is not too altered? I mean, it's ok if the values are close to the real result, by close I mean ~0.0001.

I cannot simply add an epsilon value, because the decomposition method is sensitive when the values are too close; I'm pretty sure that could lead to unpredictable errors.

Maxime Chéramy
  • The matrix exponent option is interesting (i.e. using `la.expm`), but it seems to be even slower than `matrix_power` on my machine. Diagonalizing whenever possible is probably your best bet. – IanH Sep 23 '13 at 21:05

Regarding the question:

In which case it is not possible to use this last method to compute Q^N?

The key is to determine whether Q is diagonalizable, i.e., whether Q has n (the number of its rows/columns) linearly independent eigenvectors. Note that having distinct eigenvalues is a sufficient but not necessary condition for diagonalizability.
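In practice this can be tested numerically: if `np.linalg.eig` returns an eigenvector matrix that is nearly singular, the matrix is (numerically) defective. A heuristic sketch (the tolerance is an arbitrary choice of mine):

```python
import numpy as np

def seems_diagonalizable(Q, tol=1e12):
    """Heuristic: Q looks diagonalizable if the matrix of eigenvectors
    returned by eig is far from singular (condition number below tol)."""
    _, P = np.linalg.eig(Q)
    return np.linalg.cond(P) < tol

# Distinct eigenvalues -> diagonalizable.
print(seems_diagonalizable(np.array([[1.0, 1.0], [0.0, 2.0]])))  # True
# Jordan block: repeated eigenvalue with a single eigenvector -> defective.
print(seems_diagonalizable(np.array([[1.0, 1.0], [0.0, 1.0]])))  # False
```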