I am performing non-negative least squares using scipy. A trivial example would be as follows:

import numpy as np
from scipy.optimize import nnls

A = np.array([[60, 70, 120, 60],[60, 90, 120, 70]], dtype='float32')
b = np.array([6, 5])
x, res = nnls(A, b)

Now, I have a situation where some entries in A or b can be missing (np.nan). For example:

A_2 = A.copy()
A_2[0, 2] = np.nan  # np.NaN was removed in NumPy 2.0; np.nan works in all versions

Of course, running NNLS on A_2 and b will not work, since scipy does not accept inf or nan values.

How can we perform NNLS while masking out the missing entries from the computation? Effectively, this should translate to

Minimize |(A_2.x- b)[mask]|

where mask can be defined as:

mask = ~np.isnan(A_2)

In general, entries can be missing from both A and b.

Possibly helpful:

[1] How to include constraint to Scipy NNLS function solution so that it sums to 1

Nipun Batra

1 Answer

I think you can compute the mask first (determine which points you want included) and then perform NNLS. Given the mask

In []: mask
Out[]: 
array([[ True,  True, False,  True],
       [ True,  True,  True,  True]], dtype=bool)

you can decide whether to keep a column by checking that all of its values are True, using np.all along the first axis (axis=0).

In []: np.all(mask, axis=0)
Out[]: array([ True,  True, False,  True], dtype=bool)

This can then be used as a column mask for A.

In []: nnls(A_2[:,np.all(mask, axis=0)], b)
Out[]: (array([ 0.09166667,  0.        ,  0.        ]), 0.7071067811865482)

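Note that the solution has only three entries, one per retained column. The same idea can be used for b to construct a row mask, dropping the rows where b is missing.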
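
Putting the column mask and the row mask together could look roughly like the sketch below. The helper name masked_nnls is hypothetical, and so is the choice to report a coefficient of 0 for dropped columns; that is an assumption on my part, not something scipy does for you. It first drops the rows where b is missing, then drops every column of A that still contains a NaN in the remaining rows.

```python
import numpy as np
from scipy.optimize import nnls


def masked_nnls(A, b):
    """Hypothetical helper: NNLS after dropping rows/columns with missing data."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)

    # Row mask: keep only observations whose target value is known.
    row_mask = ~np.isnan(b)
    # Column mask: keep only columns with no NaN in the retained rows.
    col_mask = ~np.isnan(A[row_mask]).any(axis=0)

    x_kept, residual = nnls(A[row_mask][:, col_mask], b[row_mask])

    # Assumption: dropped columns are reported with coefficient 0.
    x = np.zeros(A.shape[1])
    x[col_mask] = x_kept
    return x, residual, row_mask, col_mask


A = np.array([[60, 70, 120, 60], [60, 90, 120, 70]], dtype=float)
b = np.array([6.0, 5.0])
A_2 = A.copy()
A_2[0, 2] = np.nan

x, residual, row_mask, col_mask = masked_nnls(A_2, b)
print(x, residual)
```

Keep in mind that this throws away a whole column (or row) for a single missing entry, so it is not the same as minimizing over only the observed entries; it is just the simplest way to get nnls to run on incomplete data.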

PidgeyUsedGust