1

I want to fit a function

f:(Weight,Length)-> Z

to a 2D surface using numpy.linalg.lstsq. My dataset Z is defined in a (Weight,Length) plane. However I have erased the values that are not relevant: All absolute values larger than 3.3 is set to NaN using:

Z=np.where(np.abs(Z)>3.3,np.NaN,Z)

I do not want the algorithm to work hard to get a match where it is not important, sacrificing precision where it actually matters.

I have rendered the values I want to fit for illustration. Btw. my model is a 2d polynom. surface plot of the dataset

I want to calculate least square roots using numpy. As I have NaN values, I have decided to mask them using:

Z = np.ma.array(Z, mask=np.isnan(Z)) # Use a mask to mark the NaNs

But the call

c, r, rank, s = np.linalg.lstsq(A, Z, rcond=None)

Still gives the known error

LinAlgError: SVD did not converge in Linear Least Squares

The code itself works if I do not set values to NaN.

How can I run the fit with NaN values, telling numpy to ignore those?

Aneho
  • 118
  • 7

1 Answers1

0

TLDR: np.linalg.lstsq(a.T[paddedMask].reshape(int(np.sum(nanMask)),(xOrder+1)*(yOrder+1)), np.ravel(z)[mask], rcond=None)

So I had the same issue and the solution is rather simple, so I want to share it, as I thought this wasn't going to work initially.

np.linalg.lstsq takes a matrix of shape (z.size,coeff.size) and data of shape z.size (flattened)

Let the order of the 2d polynominal be determined by xOrder and yOrder

What I did was to use a mask, pad it and then apply it to the matrix and and the unpadded to the data. A reshaping is needed as otherwise Numpy is flattening an array after use of a mask to 1d.

Padding can be done by using the outerproduct with np.outer(mask,vector_for_padding)

mask = np.ones(data.size,dtype="bool") #array with only true values
paddedMask = np.outer(mask,np.ones((xOrder+1)*(yOrder+1))).astype(bool) #padding to have the same shape as the matrix we want to remove the e.g. NaN values from
np.linalg.lstsq(matrix.T[paddedMask].reshape(int(np.sum(mask)),(xOrder+1)*(yOrder+1)),np.ravel(data)[mask],rcond=None)

The matrix I got was generated using the answer from Paddy here https://stackoverflow.com/a/57923405/8162482

Be aware there is a small mistake of i and jbeing switched in the for loop (see comment on the post too)