I am trying to plot a linear regression line on a scatter plot. I have researched it (for example, "Linear regression with matplotlib / numpy"), but my implementation doesn't work:

    x = [-6.0, -5.0, -10.0, -5.0, -8.0, -3.0, -6.0, -8.0, -8.0, 7.5, 8.0, 9.0, 10.0, 7.0, 5.0, 5.0, -8.0, 8.0, 7.0] 
    y = [-7.094043198985176, -6.1018562538660044, -15.511155265492038, -2.7131460277126984, -8.6127363078417609, -3.1575686002528163, -10.246242711042497, -6.4333658386991992, -16.167988119268013, 2.4709555610646134, 4.5492058088492948, 5.5896790992867942, 3.3824425476540005, -1.8140272426684692, -1.5975329456235758, 5.1403915611396904, -4.4469105070935955, 0.51211850576547091, 5.7059436876065952]
    m,b = numpy.polyfit(x,y,1)
    plt.plot(x, y, x, m*x+b) 
    plt.show()

Returns:

Traceback (most recent call last):
  File "test.py", line 464, in <module>
    correlate(trainingSet,trainingSet.trainingTexts)
  File "test.py", line 434, in correlate
    plt.plot(x, y, x, m*x+b)
  File "C:\Python27\lib\site-packages\matplotlib\pyplot.py", line 2458, in plot
    ret = ax.plot(*args, **kwargs)
  File "C:\Python27\lib\site-packages\matplotlib\axes.py", line 3848, in plot
    for line in self._get_lines(*args, **kwargs):
  File "C:\Python27\lib\site-packages\matplotlib\axes.py", line 323, in _grab_next_args
    for seg in self._plot_args(remaining, kwargs):
  File "C:\Python27\lib\site-packages\matplotlib\axes.py", line 300, in _plot_args
    x, y = self._xy_from_xy(x, y)
  File "C:\Python27\lib\site-packages\matplotlib\axes.py", line 240, in _xy_from_xy
    raise ValueError("x and y must have same first dimension")
ValueError: x and y must have same first dimension

I have checked that both x and y are the same length, and they are (19). Any ideas what I'm doing wrong?

Zach

1 Answer

The problem is that `x` is a Python list, not a NumPy array, so `*` does not work the way you expected:

>>> m * x
[]

Here is a working example:

    plt.plot(x, y, x, numpy.array(x) * m + b)
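
If it helps, the same fix can be written as a small self-contained sketch that converts the inputs to arrays once, before fitting. The helper names `fit_line` and `plot_fit` are made up for this illustration and are not part of NumPy or matplotlib; drawing the data as markers is also my assumption about what "scatter plot" should look like here.

```python
import numpy as np


def fit_line(x, y):
    """Least-squares straight-line fit; returns slope, intercept and fitted y values.

    Hypothetical helper for illustration only.
    """
    # Convert first, so that m * x_arr is elementwise arithmetic
    # rather than an operation on a Python list.
    x_arr = np.asarray(x, dtype=float)
    y_arr = np.asarray(y, dtype=float)
    m, b = np.polyfit(x_arr, y_arr, 1)  # degree 1: slope and intercept
    return m, b, m * x_arr + b


def plot_fit(x, y):
    """Plot the data as markers and the fitted line on top (hypothetical helper)."""
    # Imported here so that fit_line can be used without matplotlib installed.
    import matplotlib.pyplot as plt

    m, b, y_fit = fit_line(x, y)
    plt.plot(x, y, "o")      # raw data as points
    plt.plot(x, y_fit, "-")  # regression line; same length as x by construction
    plt.show()
```

With the `x` and `y` lists from the question, calling `plot_fit(x, y)` should draw both series without the dimension error, since the fitted values are computed from an array of the same length as `x`.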
unkulunkulu
  • still I'm new to `numpy` and I don't know _why_ `m*x == []` :D – unkulunkulu Jun 26 '12 at 13:21
  • This is quite interesting. m is of type `numpy.float64`. Apparently that type overrides `__mul__` (and `__rmul__`) such that `m*sequence` returns `int(m)*sequence`. Had m been a regular python float, an exception would have been raised since sequences can't multiply floats. – mgilson Jun 26 '12 at 13:28
  • @mgilson, haha, weird indeed, thanks for the info, I'm always forgetting that this language is interpreted and everything is in front of my eyes :) – unkulunkulu Jun 26 '12 at 13:30
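
To make the `int(m)*sequence` explanation from the comment concrete, here is a rough sketch using only built-in types. The slope value `0.8` is a made-up stand-in, not the value `polyfit` returns for the data in the question, and current NumPy releases may handle `numpy.float64 * list` differently from what the 2012 comment describes, which is why the sketch avoids NumPy altogether.

```python
x = [-6.0, -5.0, -10.0]
m = 0.8  # hypothetical slope with 0 < m < 1, chosen only for illustration

# An int times a list means repetition, not elementwise arithmetic.
repeated = 2 * x

# If the slope is truncated to an int first, it becomes 0,
# and repeating a list zero times gives an empty list.
empty = int(m) * x

# A plain Python float cannot repeat a list at all.
try:
    m * x
    error_raised = False
except TypeError:
    error_raised = True

print(repeated)
print(empty)
print(error_raised)
```

An empty result on the right-hand side would explain why matplotlib complained that x (19 values) and y did not have the same first dimension.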