I have implemented the following optimization using NNLS (non-negative least squares) from SciPy:
import numpy as np
from scipy.optimize import nnls
# Define the problem
A = np.array([[60., 90., 120.],
              [30., 120., 90.]])
b = np.array([6700.5, 699.])
# Append a row of ones so the fit is pulled toward a solution that sums to 1
b = np.hstack([b, 1.0])
A = np.vstack([A, np.ones(3)])
x, rnorm = nnls(A, b)
print(x)
# the solution is
# [ 93.97933792 0. 0. ]
# we expect it to sum to 1 if it's not skewed
As you can see, the values in b are much larger than the values in A.
My question is: what is the best (or at least a reasonable) way to scale A and b so that the solution is not skewed?
Note that both A and b are raw gene expression data without any pre-processing.
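To make the skew concrete: the appended row of ones is only a soft constraint, because it contributes a single squared residual that competes with the two much larger data residuals. The sketch below is a minimal illustration of that effect, not a recommended scaling. The helper `nnls_soft_sum` and the weight `w` are hypothetical names introduced here, and the value `1e5` is an arbitrary assumption.

```python
import numpy as np
from scipy.optimize import nnls

A = np.array([[60., 90., 120.],
              [30., 120., 90.]])
b = np.array([6700.5, 699.])


def nnls_soft_sum(A, b, w):
    """Solve NNLS with an appended sum-to-one row scaled by w.

    w is a hypothetical weight: w = 1.0 reproduces the setup above,
    larger values make the sum-to-one residual dominate the objective.
    """
    A_aug = np.vstack([A, w * np.ones(A.shape[1])])
    b_aug = np.hstack([b, w])
    x, _ = nnls(A_aug, b_aug)
    return x


x_unit = nnls_soft_sum(A, b, 1.0)   # same problem as in the question
x_heavy = nnls_soft_sum(A, b, 1e5)  # assumed large weight, for illustration

print(x_unit, x_unit.sum())
print(x_heavy, x_heavy.sum())
```

With the heavier weight the sum is pushed close to 1, but the data rows are then fitted poorly, since no non-negative combination of the columns of A that sums to 1 can reach values as large as those in b. That is why I think the scaling of A and b themselves is the real issue.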