
1 - Using `A = np.array([x1,x2,x3])` fixed the error from my earlier question, How I plot the linear regression.

So I decided to increase the number of elements in x1, x2 and x3, still following the example in How I plot the linear regression, and now I get the error "ValueError: too many values to unpack". Can't NumPy handle this many numbers?

>>> x1 = np.array([3,2,2,3,4,5,6,7,8])
>>> x2 = np.array([2,1,4.2,1,1.5,2.3,3,6,9])
>>> x3 = np.array([6,5,8,9,7,0,1,2,1])
>>> y = np.random.random(3)
>>> A = np.array([x1,x2,x3])
>>> m,c = np.linalg.lstsq(A,y)[0]
Traceback (most recent call last):
File "testNumpy.py", line 18, in <module>
  m,c = np.linalg.lstsq(A,y)[0]
ValueError: too many values to unpack

2 - I also compared my version with the one defined in Multiple linear regression with python. Which one is correct? Why do they use the transpose in that example?

Thanks,

xeon123
  • Post entire questions please. If your other question is removed, there is no trace as to what you're actually trying to achieve – Matti Lyra Jan 28 '14 at 17:08
The error doesn't come from NumPy. Python is telling you that `np.linalg.lstsq(A,y)[0]` is returning more than the two values (`m, c`) that you expect. – Ricardo Cárdenes Jan 28 '14 at 17:12
  • If the posted answer to your previous question fixed the problem posed in that question, you would do well to accept the answer. Over time you will find that people help you less if you don't thank them or otherwise acknowledge their efforts (and it also lets others know not to waste time trying to figure out whether the posted answers are sufficient). – tom10 Jan 28 '14 at 17:18

1 Answer


The unpacking error doesn't come from NumPy; it comes from trying to unpack two values (`m, c`) from an array that contains nine. Note the `[0]` at the end of the line: it selects the solution array from what `lstsq` returns, and that array has one coefficient per column of `A`:

>>> x1 = np.array([3,2,2,3,4,5,6,7,8])
>>> x2 = np.array([2,1,4.2,1,1.5,2.3,3,6,9])
>>> x3 = np.array([6,5,8,9,7,0,1,2,1])
>>> y = np.random.random(3)
>>> A = np.array([x1,x2,x3])
>>> np.linalg.lstsq(A,y)[0]
array([ 0.01789803,  0.01546994,  0.01128087,  0.02851178,  0.02561285,
        0.00984112,  0.01332656,  0.00870569, -0.00064135])

compared to

>>> np.linalg.lstsq(A,y)
(array([ 0.01789803,  0.01546994,  0.01128087,  0.02851178,  0.02561285,
         0.00984112,  0.01332656,  0.00870569, -0.00064135]),
 array([], dtype=float64), 
 3,
 array([ 21.78630954,  12.03873305,   3.8217304 ]))

See the NumPy docs: the first array holds the coefficients of the variables. I think the confusion here is variables versus observations. You currently have three observations and nine variables; `A.T` turns the variables into observations and vice versa.
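To illustrate the point about observations versus variables, here is a minimal sketch (not the original poster's code). Transposing `A` makes each row an observation and each column a variable; note that `y` then needs one target value per observation, so nine values here rather than three:

```python
import numpy as np

# Same data as in the question: nine samples of three variables
x1 = np.array([3, 2, 2, 3, 4, 5, 6, 7, 8])
x2 = np.array([2, 1, 4.2, 1, 1.5, 2.3, 3, 6, 9])
x3 = np.array([6, 5, 8, 9, 7, 0, 1, 2, 1])
y = np.random.random(9)        # one target value per observation

A = np.array([x1, x2, x3]).T   # shape (9, 3): rows = observations
coeffs, residuals, rank, sv = np.linalg.lstsq(A, y, rcond=None)
print(coeffs)                  # three coefficients, one per variable
```

With this layout the solution array has exactly three entries, one coefficient per variable, which is usually what a multiple linear regression is after.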

tmdavison
Matti Lyra
  • I don't understand what the results in the output are. Are they the betas defined in a linear regression (http://en.wikipedia.org/wiki/Linear_regression)? How can I get the 'c' in the equation y=mx+c from the output of np.linalg.lstsq? – xeon123 Jan 28 '14 at 17:23
  • @xeon123 you must add a column of 1s – tstanisl Jul 28 '19 at 18:36
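Expanding on that last comment with a small sketch (my own example data, not from the question): appending a column of ones to the design matrix makes the intercept `c` just another coefficient that `lstsq` fits:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0                          # points on the exact line y = 2x + 1

# Stack x with a column of ones so lstsq fits both the slope m and intercept c
A = np.column_stack([x, np.ones_like(x)])  # shape (5, 2)
(m, c), *_ = np.linalg.lstsq(A, y, rcond=None)
print(m, c)  # recovers the slope 2.0 and intercept 1.0
```

Here the solution array has exactly two entries, so unpacking into `m, c` works, matching the `y = mx + c` form asked about above.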