
Problem description

For a square matrix X, one can obtain the SVD decomposition

X = U S V'

simply by using the numpy.linalg.svd routine

u, s, vh = numpy.linalg.svd(X)

or use numpy.linalg.eigh to compute the eigendecomposition of the Hermitian matrices X'X and XX'.

Are they using the same algorithm? Are they calling the same LAPACK routine?

Is there any difference in terms of speed and stability?
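
For concreteness, here is a minimal sketch of the two routes I have in mind (the eigh-based one relies on the singular values being the square roots of the eigenvalues of X'X):

import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 5))

# Route 1: direct SVD
u, s, vh = np.linalg.svd(X)

# Route 2: eigendecomposition of the Hermitian matrices X'X and XX'
w_v, V = np.linalg.eigh(X.T @ X)   # eigenvalues ascending, equal to s**2 in reverse order
w_u, U = np.linalg.eigh(X @ X.T)

# singular values match; the eigenvector columns agree with the SVD factors
# only up to ordering and signs
print(np.allclose(np.sqrt(w_v[::-1]), s))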

  • As much as I like the explanations francis has given below, which I voted up myself, you really should do some research into understanding the matrices you have in front of you. That's why Matlab easily outperforms many bootstrapped user tests for, say, matrix inversions. The first thing it does before anything else is figure out what tricks it can pull off on the matrix, and that is what numpy is missing. Don't get me wrong: I love numpy and the effort people put into the package. Also be mindful of which BLAS you link against; there are orders of magnitude of difference in runtime. – Kaveh Vahedipour May 17 '18 at 05:57
  • And one word on memory alignment: I don't know if you are doing memory allocation yourself at all or relying on numpy, but you can screw up everything by doing that wrong too. I have written an extensive open source library for medical image reconstruction and learned some of the first lessons the hard way. – Kaveh Vahedipour May 17 '18 at 06:00
  • Eigenvalues and SVD singular values can be very different beasts: try `np.array([ [-149., -50, -154], [537, 180, 546], [-27, -9, -25] ])`, gallery3 from Cleve Moler. – denis Jan 01 '21 at 10:58
  • @denis I'm not sure what you mean... but if I understood correctly, my post wasn't related to your intent. – ArtificiallyIntelligence Jan 02 '21 at 03:07

2 Answers


Indeed, numpy.linalg.svd and numpy.linalg.eigh do not call the same LAPACK routine: numpy.linalg.eigh relies on LAPACK's dsyevd(), while numpy.linalg.svd makes use of LAPACK's dgesdd().

The common point between these routines is the use of Cuppen's divide and conquer algorithm, first designed to solve tridiagonal eigenvalue problems. For instance, dsyevd() only handles Hermitian matrices and, when eigenvectors are required, performs the following steps (a small illustration of the reduction step follows the list):

  1. Reduce matrix to tridiagonal form using DSYTRD()

  2. Compute the eigenvalues and eigenvectors of the tridiagonal matrix using the divide and conquer algorithm, through DSTEDC()

  3. Apply the Householder reflections reported by DSYTRD() using DORMTR().
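
As a small illustration of the reduction step: for a symmetric matrix, the orthogonal reduction computed by scipy.linalg.hessenberg is tridiagonal, which is the same form DSYTRD works towards (hessenberg is only a stand-in here to show the shape of the reduced matrix, not the actual routine dsyevd() calls):

import numpy as np
from scipy.linalg import hessenberg

rng = np.random.default_rng(1)
B = rng.standard_normal((6, 6))
A = B + B.T                        # symmetric test matrix

# Orthogonal similarity reduction; for symmetric A the result is tridiagonal.
T, Q = hessenberg(A, calc_q=True)
print(np.allclose(T, np.tril(np.triu(T, -1), 1)))   # only the tridiagonal band is nonzero

# The eigenvalues are preserved by the reduction (the divide and conquer step
# then works on T, and Q maps the eigenvectors back, as in step 3).
print(np.allclose(np.linalg.eigvalsh(T), np.linalg.eigvalsh(A)))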

In contrast, to compute the SVD, dgesdd() performs the following steps in the case JOBZ=='A' (U and VT required), with a quick check of that case just after the list:

  1. Bidiagonalize A using dgebrd()
  2. Compute the SVD of the bidiagonal matrix using the divide and conquer algorithm, through DBDSDC()
  3. Revert the bidiagonalization using the matrices P and Q returned by dgebrd(), applying dormbr() twice, once for U and once for VT
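
Here is a quick check of what that case looks like from numpy (a minimal sketch of the outputs, not of the internal LAPACK calls): full_matrices=True asks for the square U and VT that JOBZ=='A' produces.

import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 6))

U, s, VT = np.linalg.svd(A, full_matrices=True)
print(U.shape, s.shape, VT.shape)        # (4, 4) (4,) (6, 6)

# U and VT are orthogonal, and A is recovered from the factors.
print(np.allclose(U @ U.T, np.eye(4)))
print(np.allclose(VT @ VT.T, np.eye(6)))
S = np.zeros((4, 6))
S[:4, :4] = np.diag(s)
print(np.allclose(U @ S @ VT, A))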

While the actual operations performed by LAPACK are very different, the strategies are globally similar. This may stem from the fact that computing the SVD of a general matrix A is similar to performing the eigendecomposition of the symmetric matrix A^T A: the singular values of A are the square roots of the eigenvalues of A^T A.

Regarding the accuracy and performance of LAPACK's divide and conquer SVD, see this survey of SVD methods (a way to compare it against the QR-based driver on your own matrices is sketched after the list):

  • They often achieve the accuracy of QR-based SVD, though it is not proven
  • The worst case is O(n^3) if no deflation occurs, but often proves better than that
  • The memory requirement is 8 times the size of the matrix, which can become prohibitive
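
scipy.linalg.svd exposes both drivers through its lapack_driver keyword, so the divide and conquer SVD (gesdd, numpy's default) can be compared against the QR-based one (gesvd); a rough sketch, where differences are usually at the level of rounding error:

import numpy as np
from scipy.linalg import svd

rng = np.random.default_rng(3)
A = rng.standard_normal((300, 300))

# Divide and conquer (dgesdd) vs QR-based (dgesvd) singular values.
s_dc = svd(A, compute_uv=False, lapack_driver='gesdd')
s_qr = svd(A, compute_uv=False, lapack_driver='gesvd')
print(np.max(np.abs(s_dc - s_qr)) / s_dc[0])   # relative discrepancy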

Regarding the symmetric eigenvalue problem, the complexity is 4/3 n^3 (but often proves better than that) and the memory footprint is about 2n^2 plus the size of the matrix. Hence, the best choice is likely numpy.linalg.eigh if your matrix is symmetric.

The actual complexity can be estimated for your particular matrices using the following code:

import numpy as np
from matplotlib import pyplot as plt
from scipy.optimize import curve_fit

# see https://stackoverflow.com/questions/41109122/fitting-a-curve-to-a-power-law-distribution-with-curve-fit-does-not-work
def func_powerlaw(x, m, c):
    return np.log(np.abs( x**m * c))

import time

# quick sanity check of the wall-clock timer
start = time.time()
print("hello")
end = time.time()
print(end - start)

timeev=[]
timesvd=[]
size=[]

for n in range(10,600):
    print(n)
    size.append(n)
    A=np.zeros((n,n))
    #populate A, 1D diffusion. 
    for j in range(n):
        A[j,j]=2.
        if j>0:
            A[j-1,j]=-1.
        if j<n-1:
            A[j+1,j]=-1.

    # EIG: time the symmetric eigensolver
    Aev=A.copy()
    start = time.time()
    w,v=np.linalg.eigh(Aev,'L')
    end = time.time()
    timeev.append(end-start)

    # SVD: time the singular value decomposition on the same matrix
    Asvd=A.copy()
    start = time.time()
    u,s,vh=np.linalg.svd(Asvd)
    end = time.time()
    timesvd.append(end-start)


# fit a power law t ~ c * n^m (in log space) to the second half of the measurements
poptev, pcov = curve_fit(func_powerlaw, size[len(size)//2:], np.log(timeev[len(size)//2:]), p0=[2.1,1e-7], maxfev=8000)
print(poptev)

poptsvd, pcov = curve_fit(func_powerlaw, size[len(size)//2:], np.log(timesvd[len(size)//2:]), p0=[2.1,1e-7], maxfev=8000)
print(poptsvd)

fig, ax = plt.subplots()

plt.plot(size,timeev,label="eigh")
plt.plot(size,[np.exp(func_powerlaw(x, poptev[0], poptev[1])) for x in size],label="eigh-adjusted complexity: "+str(poptev[0]))

plt.plot(size,timesvd,label="svd")
plt.plot(size,[np.exp(func_powerlaw(x, poptsvd[0], poptsvd[1])) for x in size],label="svd-adjusted complexity: "+str(poptsvd[0]))


ax.set_xlabel('n')
ax.set_ylabel('time, s')

#plt.legend(loc="upper left")

ax.legend(loc="lower right")
ax.set_yscale("log")

fig.tight_layout()



plt.savefig('eigh.jpg')
plt.show()

For such 1D diffusion matrices, eigh outperforms svd, but the actual complexities are similar, slightly lower than n^3, something like n^2.5.

(plot: measured times of eigh and svd versus n, with the fitted power laws)

The accuracy could be checked as well.
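
For instance, one possible check (a minimal sketch) is to look at residuals and orthogonality of the computed factors on the same 1D diffusion matrix:

import numpy as np

n = 200
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # 1D diffusion matrix

w, v = np.linalg.eigh(A)
u, s, vh = np.linalg.svd(A)

# Relative residuals and orthogonality of the computed factors.
print(np.linalg.norm(A @ v - v * w) / np.linalg.norm(A))      # eigh residual
print(np.linalg.norm(A - (u * s) @ vh) / np.linalg.norm(A))   # svd residual
print(np.linalg.norm(v.T @ v - np.eye(n)))                    # eigenvector orthogonality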

– francis
  • By the way, [scipy.linalg.eigh](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.eigh.html) can get just a few eigenvalues and eigenvectors -- smallest, or biggest, or those in a range like 0 .. 0.1. It then uses LAPACK dsyevr, but for all eigenpairs that may be slower than the default dsyevd; on a 10k x 10k matrix from scikit-learn 20newsgroups, evd took 140 seconds * average 3.8 cores, evr 1740 seconds * 1.2 cores. YMMV. – denis Sep 25 '20 at 14:30

No, they do not use the same algorithm, as they do different things. They are somewhat related but also very different. Let's start with the fact that you can do an SVD on m x n matrices, where m and n don't need to be the same.
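
A quick illustration of that difference (a minimal sketch):

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))     # rectangular

u, s, vh = np.linalg.svd(A)         # fine: the SVD is defined for any m x n matrix
print(s.shape)                      # (3,)

try:
    np.linalg.eigh(A)               # an eigendecomposition needs a square matrix
except np.linalg.LinAlgError as err:
    print(err)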

Which LAPACK routine gets called also depends on your numpy version. Here are the eigenvalue routines in LAPACK for double precision:
http://www.netlib.org/lapack/explore-html/d9/d8e/group__double_g_eeigen.html
And the corresponding SVD routines:
http://www.netlib.org/lapack/explore-html/d1/d7e/group__double_g_esing.html

There are differences between the routines. Big differences. If you care about the details, they are specified very well in the Fortran headers. In many cases it makes sense to find out what kind of matrix you have in front of you, to make a good choice of routine. Is the matrix symmetric/Hermitian? Is it in upper diagonal form? Is it positive semidefinite? ...
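
A small sketch of such checks before picking a routine (the tolerance and the helper name are just illustrative choices):

import numpy as np

def describe(A, tol=1e-12):
    """Rough structural checks that help pick a routine; purely illustrative."""
    info = {'square': A.shape[0] == A.shape[1]}
    if info['square']:
        info['symmetric'] = np.allclose(A, A.T, atol=tol)
        if info['symmetric']:
            # cheap positive (semi)definiteness probe via Cholesky
            try:
                np.linalg.cholesky(A + tol * np.eye(A.shape[0]))
                info['positive_semidefinite'] = True
            except np.linalg.LinAlgError:
                info['positive_semidefinite'] = False
    return info

A = 2.0 * np.eye(3) - np.eye(3, k=1) - np.eye(3, k=-1)
print(describe(A))   # {'square': True, 'symmetric': True, 'positive_semidefinite': True}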

There are ginormous differences in runtime. But as a rule of thumb, EIGs are cheaper than SVDs. That also depends on convergence speed, which in turn depends a lot on the condition number of the matrix, in other words on how ill-conditioned the matrix is, ...

SVDs are usually very robust but slow algorithms, oftentimes used for inversion, speed optimisation through truncation, and principal component analysis, really with the expectation that the matrix you are dealing with is just a pile of shitty rows ;)
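
For example, a truncated-SVD pseudo-inverse, one of the typical uses mentioned above (a minimal sketch; the cutoff is an arbitrary choice):

import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((100, 20)) @ rng.standard_normal((20, 80))   # rank-deficient

u, s, vh = np.linalg.svd(A, full_matrices=False)

# Keep only singular values above a cutoff and invert the rest of the factors.
cutoff = 1e-10 * s[0]
keep = s > cutoff
A_pinv = (vh[keep].T / s[keep]) @ u[:, keep].T

# Agrees with numpy's pinv when both use the same cutoff.
print(np.allclose(A_pinv, np.linalg.pinv(A, rcond=1e-10)))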

– Kaveh Vahedipour