
Code:

import numpy
from matplotlib.mlab import PCA
file_name = "store1_pca_matrix.txt"
ori_data = numpy.loadtxt(file_name,dtype='float', comments='#', delimiter=None,
            converters=None, skiprows=0, usecols=None, unpack=False, ndmin=0)
result = PCA(ori_data)

Though my input matrix is devoid of nan and inf, I do get the error below:

    raise LinAlgError("SVD did not converge")
LinAlgError: SVD did not converge

What's the problem?

mgmussi
user 3317704

11 Answers


This can happen when there are inf or nan values in the data.

Use this to remove nan values:

ori_data.dropna(inplace=True)
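Note that `dropna` is a pandas `DataFrame` method; `numpy.loadtxt`, as used in the question, returns a plain NumPy array, which has no `dropna`. A NumPy-only sketch of the same cleanup, using made-up example data:

```python
import numpy as np

# Hypothetical example matrix containing a NaN and an inf entry
ori_data = np.array([
    [1.0, 2.0],
    [np.nan, 3.0],
    [4.0, np.inf],
    [5.0, 6.0],
])

# Keep only the rows in which every entry is finite (drops NaN and +/-inf)
ori_data = ori_data[np.isfinite(ori_data).all(axis=1)]
print(ori_data)  # rows 0 and 3 remain
```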
c-chavez
jseabold
  • I've checked my data thoroughly; there are no infs or NaNs in the data. What else could cause this error to be raised? – user 3317704 Mar 20 '14 at 04:46
  • @user3317704 Either you have missing values, or invalid ones; you might have different types in the same column, etc. Is there a way we can see your file to validate it? Have you tried this answer, using the `dropna` function, and are you still getting the error? – c-chavez Aug 23 '17 at 21:38
  • @user3317704 I had the same problem, but during debugging I noticed that I had concatenated two data frames incorrectly, so the new data frame contained only NaN values – 32cupo Apr 06 '18 at 13:17
  • I don't get it: where do I run `ori_data.dropna(inplace=True)`? Before the input to SVD, or when? – Charlie Parker Nov 15 '21 at 18:19
  • This gives me an error: `AttributeError: 'numpy.ndarray' object has no attribute 'dropna'` – Charlie Parker Nov 15 '21 at 18:28

I know this post is old, but in case someone else encounters the same problem: @jseabold was right that the cause is NaN or inf, and the OP was probably right that the data contained neither. However, if one of the columns of ori_data always has the same value, the data will end up containing NaNs, because the mlab implementation of PCA normalizes the input by doing

ori_data = (ori_data - mean(ori_data)) / std(ori_data).

The solution is to do:

result = PCA(ori_data, standardize=False)

In this way, only the mean will be subtracted without dividing by the standard deviation.
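A small sketch (with made-up data) of why a constant column breaks standardization: its standard deviation is 0, so the division produces 0/0 = NaN, which then makes the SVD inside PCA fail:

```python
import numpy as np

# Made-up matrix whose second column is constant
data = np.array([
    [1.0, 5.0],
    [2.0, 5.0],
    [3.0, 5.0],
])

with np.errstate(invalid='ignore'):  # silence the 0/0 warning
    standardized = (data - data.mean(axis=0)) / data.std(axis=0)

print(standardized[:, 1])  # the constant column is now all NaN
```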

Vlamir

If there are no inf or NaN values, it may be a memory issue. Try a machine with more RAM.

Paritosh Gupta

I do not have an answer to this question, but I do have a reproduction scenario with no NaNs and infs. Unfortunately the dataset is pretty large (96 MB gzipped).

import numpy as np
from StringIO import StringIO  # Python 2 (io.BytesIO on Python 3)
from scipy import linalg
import urllib2                 # Python 2 (urllib.request on Python 3)
import gzip

url = 'http://physics.muni.cz/~vazny/gauss/X.gz'
# Download and decompress the dataset, then load it as a matrix
X = np.loadtxt(gzip.GzipFile(fileobj=StringIO(urllib2.urlopen(url).read())), delimiter=',')
linalg.svd(X, full_matrices=False)

which raises:

LinAlgError: SVD did not converge

on:

>>> np.__version__
'1.8.1'
>>> import scipy
>>> scipy.__version__
'0.10.1'

but did not raise an exception on:

>>> np.__version__
'1.8.2'
>>> import scipy
>>> scipy.__version__
'0.14.0'
Jiří Polcar

Following up on @c-chavez's answer, what worked for me was first replacing inf and -inf with NaN, and then removing the NaNs. For example:

data = data.replace([np.inf, -np.inf], np.nan).dropna()
hevronig

This may be due to your input matrix (the one you are feeding to PCA) being singular.
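One way to check for this (a sketch, using a made-up matrix) is to compare the matrix rank against the number of columns:

```python
import numpy as np

# Made-up rank-deficient matrix: the second row is 2x the first
m = np.array([
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [0.0, 1.0, 1.0],
])

rank = np.linalg.matrix_rank(m)
print(rank, m.shape[1])  # rank < number of columns, so the matrix is singular
```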


Even if your data is correct, this can happen when the process runs out of memory. In my case, moving from a 32-bit machine to a 64-bit machine with more memory solved the problem.

Slava

I had this error multiple times:

  • If the length of the data is 1: then nothing can be fitted.
  • If a value is infinity: did you divide by 0 somewhere in your processing?
  • If a value is None: this is very common.
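The checks above can be sketched as a small validation helper (illustrative only; the name is made up). Converting to a float array also turns `None` entries into NaN, so they are caught by the NaN check:

```python
import numpy as np

def check_before_svd(data):
    """Raise a descriptive error for the common failure causes listed above."""
    a = np.asarray(data, dtype=float)  # None entries become NaN here
    if a.size <= 1:
        raise ValueError("only one value: nothing to decompose")
    if np.isinf(a).any():
        raise ValueError("inf present (division by zero upstream?)")
    if np.isnan(a).any():
        raise ValueError("NaN present (missing or None values?)")
    return a

check_before_svd([[1.0, 2.0], [3.0, 4.0]])  # passes silently
```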
Ludo Schmidt

This happened to me when I accidentally resized an image dataset to (0, 64, 3). Try checking the shape of your dataset to see if one of the dimensions is 0.

chenjesu

I am using numpy 1.11.0. If the matrix has more than one eigenvalue equal to 0, then 'SVD did not converge' is raised.

nos

For me, this happened occasionally because some of my data was uninitialized: somewhere down the line the initialization looked like this

a = np.empty((w, h))
a[some, where] = val  # only partial value assignment
result = np.linalg.pinv(a)

Notice that np.empty does not assign values to the array; it only allocates it, so it contains garbage.

Solution: initialize all of the data. In my case

a = np.zeros((w, h))
a[some, where] = val  # only partial value assignment
result = np.linalg.pinv(a)
Gulzar