I am trying to plot a graph of a calculated linear regression, but I get the error "ValueError: x and y must have same first dimension". This is a multivariate linear regression (2 variables) with 3 samples (x1, x2, x3).

1 - First, am I calculating the linear regression correctly?

2 - I know that the error comes from the plot lines, but I don't understand why I get it. What are the right dimensions to pass to plot?

    import numpy as np
    import matplotlib.pyplot as plt

    x1 = np.array([3,2])
    x2 = np.array([1,1.5])
    x3 = np.array([6,5])
    y = np.random.random(3)
    A = [x1,x2,x3]

    m,c = np.linalg.lstsq(A,y)[0]

    plt.plot(A, y, 'o', label='Original data', markersize=10)
    plt.plot(A, m*A + c, 'r', label='Fitted line')
    plt.legend()
    plt.show()


    $ python  testNumpy.py
    Traceback (most recent call last):
      File "testNumpy.py", line 22, in <module>
        plt.plot(A, m*A + c, 'r', label='Fitted line')
      File "/usr/lib/pymodules/python2.7/matplotlib/pyplot.py", line 2987, in plot
        ret = ax.plot(*args, **kwargs)
      File "/usr/lib/pymodules/python2.7/matplotlib/axes.py", line 4137, in plot
        for line in self._get_lines(*args, **kwargs):
      File "/usr/lib/pymodules/python2.7/matplotlib/axes.py", line 317, in _grab_next_args
        for seg in self._plot_args(remaining, kwargs):
      File "/usr/lib/pymodules/python2.7/matplotlib/axes.py", line 295, in _plot_args
        x, y = self._xy_from_xy(x, y)
      File "/usr/lib/pymodules/python2.7/matplotlib/axes.py", line 237, in _xy_from_xy
        raise ValueError("x and y must have same first dimension")
    ValueError: x and y must have same first dimension
xeon123

1 Answer

The problem is that A is a Python list where you need a NumPy array, so m*A does not perform the elementwise multiplication you expect.

This:

    A = np.array([x1, x2, x3])

will get rid of the error.
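
To check this without plotting, here is a minimal sketch that only compares shapes. It assumes fixed values for y (the question uses random ones) and passes rcond=None to lstsq, which is my addition; the names line, A1, coef and pred are hypothetical. The last part adds a column of ones to fit an intercept, which goes beyond the fix above and is only a suggestion.

```python
import numpy as np

x1 = np.array([3, 2])
x2 = np.array([1, 1.5])
x3 = np.array([6, 5])
y = np.array([0.2, 0.5, 0.9])  # fixed values, assumed here for reproducibility

A = np.array([x1, x2, x3])     # a (3, 2) array instead of a list of arrays

# rcond=None is not in the question; it is passed explicitly
# so the call does not rely on the version-dependent default.
m, c = np.linalg.lstsq(A, y, rcond=None)[0]

line = m * A + c               # same expression as in the question, now elementwise
print(A.shape, line.shape)     # both have 3 rows, so plt.plot(A, line) is accepted

# Optional (assumption: you want y = w1*x_1 + w2*x_2 + intercept):
# append a column of ones so lstsq also solves for the intercept.
A1 = np.column_stack([A, np.ones(len(A))])
coef = np.linalg.lstsq(A1, y, rcond=None)[0]  # [w1, w2, intercept]
pred = A1.dot(coef)            # fitted values, one per sample
```

Note that with a (3, 2) matrix, lstsq returns one coefficient per column, so m and c above are the weights of the two variables rather than a slope and an intercept.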

NB: multiplying a list A by an integer m gives you a new list with the original content repeated m times. E.g.:

    >>> [1, 2] * 4
    [1, 2, 1, 2, 1, 2, 1, 2]

Now, since m is a floating-point number, this should have raised a TypeError (you can only multiply lists by integers). However, m turns out to be a numpy.float64, and it seems that when you multiply it by something unexpected (such as a list), NumPy coerces it to an integer.
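
The difference between list and array multiplication can be checked directly. This is a minimal sketch using plain Python numbers; the names values, repeated, raised and scaled are hypothetical, and the numpy.float64 case described above is left out because the sketch does not assume any particular NumPy version.

```python
import numpy as np

values = [1, 2]

# list * int repeats the list contents
repeated = values * 4
print(repeated)  # [1, 2, 1, 2, 1, 2, 1, 2]

# list * plain Python float is rejected with a TypeError
try:
    values * 2.5
    raised = False
except TypeError:
    raised = True
print(raised)  # True

# array * float multiplies each element
scaled = np.array(values) * 2.5
print(scaled)  # [2.5 5. ]
```

So converting to an array first is what makes an expression like m*A + c behave as arithmetic rather than repetition.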

Ricardo Cárdenes