6

I found a link where it is shown with an example that the Matlab mldivide operator (\) gives 'special' solutions when the system of linear equations has infinitely many solutions.

For example:

A = [1 2 0; 0 4 3];
b = [8; 18];
c_mldivide = A \ b
c_pinv = pinv(A) * b

gives the output:

c_mldivide =
                 0
                 4
  0.66666666666667


c_pinv =

  0.918032786885245
  3.54098360655738
  1.27868852459016

The solution is 'special' in the sense that the number of non-zero entries in the solution c_mldivide is equal to rank(A) (in this case 2). I tried the same thing in numpy using numpy.linalg.lstsq, which gives an identical result to c_pinv.

Is there a way to achieve the c_mldivide solution in Python?

There was another very similar question here, but I suppose the explanation of the word 'special' was not clear enough. Another question asked about the internal workings of the mldivide operator, but the accepted answer doesn't seem to address this behavior.

Edit 1 : numpy code

In [149]: test_A = np.array([[1,2,0],[0,4,3]])
          test_b = np.array([[8],[18]])
          np.linalg.lstsq(test_A,test_b)

Out[149]:
(array([[ 0.918 ],
    [ 3.541 ],
    [ 1.2787]]), array([], dtype=float64), 2, array([ 5.2732,  1.4811]))

Edit 2 : Using scipy.optimize.nnls

In[189]:
from scipy.optimize import nnls
nnls(test_A,test_b)
Out[190]:
    ValueError                                Traceback (most recent call last)
<ipython-input-165-19ed603bd86c> in <module>()
      1 from scipy.optimize import nnls
      2 
----> 3 nnls(test_A,test_b)

C:\Users\abhishek\Anaconda\lib\site-packages\scipy\optimize\nnls.py in nnls(A, b)
     43         raise ValueError("expected matrix")
     44     if len(b.shape) != 1:
---> 45         raise ValueError("expected vector")
     46 
     47     m, n = A.shape

    ValueError: expected vector
Community
  • 1
  • 1

2 Answers2

5

Non-negative least squares (scipy.optimize.nnls) is not a general solution to this problem. A trivial case where it will fail is if all of the possible solutions contain negative coefficients:

import numpy as np
from scipy.optimize import nnls

A = np.array([[1, 2, 0],
              [0, 4, 3]])
b = np.array([-1, -2])

print(nnls(A, b))
# (array([ 0.,  0.,  0.]), 2.23606797749979)

In the case where A·x = b is underdetermined,

x1, res, rnk, s = np.linalg.lstsq(A, b)

will pick a solution x' that minimizes ||x||L2 subject to ||A·x - b||L2 = 0. This happens not to be the particular solution we are looking for, but we can linearly transform it to get what we want. In order to do that, we'll first compute the right null space of A, which characterizes the space of all possible solutions to A·x = b. We can get this using a rank-revealing QR decomposition:

from scipy.linalg import qr

def qr_null(A, tol=None):
    Q, R, P = qr(A.T, mode='full', pivoting=True)
    tol = np.finfo(R.dtype).eps if tol is None else tol
    rnk = min(A.shape) - np.abs(np.diag(R))[::-1].searchsorted(tol)
    return Q[:, rnk:].conj()

Z = qr_null(A)

Z is a vector (or, in case where n - rnk(A) > 1, a set of basis vectors spanning a subspace of A) such that A·Z = 0:

print(A.dot(Z))
# [[  0.00000000e+00]
#  [  8.88178420e-16]]

In other words, the column(s) of Z are vectors that are orthogonal to all of the rows in A. This means that for any solution x' to A·x = b, then x' = x + Z·c must also be a solution for any arbitrary scaling factor c. This means that by picking an appropriate value of c, we can set any n - rnk(A) of the coefficients in the solution to zero.

For example, let's say we wanted to set the value of the last coefficient to zero:

c = -x1[-1] / Z[-1, 0]
x2 = x1 + Z * c
print(x2)
# [ -8.32667268e-17  -5.00000000e-01   0.00000000e+00]
print(A.dot(x2))
# [-1. -2.]

The more general case where n - rnk(A) ≤ 1 is a little bit more complicated:

A = np.array([[1, 4, 9, 6, 9, 2, 7],
              [6, 3, 8, 5, 2, 7, 6],
              [7, 4, 5, 7, 6, 3, 2],
              [5, 2, 7, 4, 7, 5, 4],
              [9, 3, 8, 6, 7, 3, 1]])
x_exact = np.array([ 1,  2, -1, -2,  5,  0,  0])
b = A.dot(x_exact)
print(b)
# [33,  4, 26, 29, 30]

We get x' and Z as before:

x1, res, rnk, s = np.linalg.lstsq(A, b)
Z = qr_null(A)

Now in order to maximise the number of zero-valued coefficients in the solution vector, we want to find a vector C such that

x' = x + Z·C = [x'0, x'1, ..., x'rnk(A)-1, 0, ..., 0]T

If the last n - rnk(A) coefficients in x' are to be zeros, this imposes that

Z{rnk(A),...,n}·C = -x{rnk(A),...,n}

We can thus solve for C (exactly, since we know that Z[rnk:] must be full-rank):

C = np.linalg.solve(Z[rnk:], -x1[rnk:])

and compute x' :

x2 = x1 + Z.dot(C)
print(x2)
# [  1.00000000e+00   2.00000000e+00  -1.00000000e+00  -2.00000000e+00
#    5.00000000e+00   5.55111512e-17   0.00000000e+00]
print(A.dot(x2))
# [ 33.   4.  26.  29.  30.]

To put it all together into a single function:

import numpy as np
from scipy.linalg import qr

def solve_minnonzero(A, b):
    x1, res, rnk, s = np.linalg.lstsq(A, b)
    if rnk == A.shape[1]:
        return x1   # nothing more to do if A is full-rank
    Q, R, P = qr(A.T, mode='full', pivoting=True)
    Z = Q[:, rnk:].conj()
    C = np.linalg.solve(Z[rnk:], -x1[rnk:])
    return x1 + Z.dot(C)
ali_m
  • 71,714
  • 23
  • 223
  • 298
  • I'll work through this tmrw, but if it's ok (what I assume) - wouldn't that make a fine scipy extension? – decltype_auto Nov 09 '15 at 23:49
  • Scipy documentation is being a bit careless. QR decomposition with column pivoting is not the same as rank-revealing decomposition. QR with pivoting is one algorithm for rank-revealing decomposition, which may sometimes fail to find the correct rank. It should also be noted that the solution minimizing the number of non-zero components in x is not unique. One may demand that any n-rnk(A) components of x' are zero, not necessarily the last ones. – Mikhail Kagalenko Dec 09 '17 at 15:35
  • @MikhailKagalenko I don't think anyone's claiming that QR decomposition with column pivoting is the only rank-revealing decomposition, or that it's always going to find the correct rank. For example, `np.linalg.matrix_rank` uses SVD, which is slower than QR for large matrices but probably more numerically stable. In my answer I already pointed out that we can arbitrarily set any n-rnk(A) of the coefficients in x' to zero, but OP was specifically interested in Matlab-style solutions with the last coefficients set to zero. – ali_m Jan 08 '18 at 23:21
  • Isn't there something like MATLAB's ``\`` in Numpy / Scipy? It seems `np.linalg.lstsq(mA, vB)` works only for squared matrices. I find a general solver in Numpy / Scipy which can handle "Fat" / "Long" matrices the way MATLAB's ``\`` handles them. – Royi Jun 25 '18 at 15:25
  • @Royi `np.linalg.lstsq` *does* handle non-square matrices, both in the underdetermined and overdetermined cases.The difference from MATLAB's \ is the choice of solution in the underdetermined case (of which there are infinitely many). Whereas `lstsq` minimizes the L2 norm of the solution, \ minimizes the number of non-zero coefficients in it. As I showed, you can get from one to the other by translating the solution within the nullspace of `A`. – ali_m Jul 04 '18 at 09:14
  • @ali_m, I was wondering if Python has a replication of MATLAB's ``\`` operator. By the way, why do you say MATLAB, in the uderdetrmined case, minimizes the number of non zero elements (The $ {L}_{0} $ Pseudo Norm)? – Royi Jul 04 '18 at 10:45
4
np.array([[8],[18]]).shape

is

(2,1) 

but you want

(2,)

#!/usr/bin/env python3
import numpy as np
from scipy.optimize import nnls
test_A = np.array([[1,2,0],[0,4,3]])

try:
    test_b = np.array([[8],[18]]) # wrong
    print(nnls(test_A,test_b))
except Exception as e:
    print(str(e))

test_b = np.array([8,18]) # sic!
print(nnls(test_A,test_b))

output:

expected vector
(array([ 0.        ,  4.        ,  0.66666667]), 0.0)
decltype_auto
  • 1,706
  • 10
  • 19