I have some data points to plot, and would like to add a best fit line to the graph, and then output the relevant metrics to indicate the quality of the best fit line.

I can plot the data, and I used `polyfit` to add the best fit line. However, how can I get metrics that indicate the quality of that line?

I don't see that `polyfit` returns any such metrics (e.g. a squared error value).

Data:

0,1717
1,1761
2,1961
3,1711
4,1285
5,976
6,721
7,428
8,313
9,297
10,375
11,521
12,678
13,752
14,728
15,758
16,741
17,812
18,845
19,863
20,933
21,1169
22,1523
23,1779
vvvvv
Kevin
  • You could take a look at [What's the error of numpy.polyfit?](http://stackoverflow.com/q/15721053/1730674) – askewchan Sep 24 '14 at 22:52
  • @Kevin also [**t-test**](http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html) is one way to go and test the goodness of fit. – Dalek Sep 24 '14 at 22:59
  • perhaps this would be helpful? http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.optimize.curve_fit.html – corvid Sep 24 '14 at 23:05
  • @askewchan & @corvid, you mean... instead of using the polyfit, I should compose my own polynomial function then use the `scipy.optimize.curve_fit`? – Kevin Sep 25 '14 at 00:53
  • @Dalek, but how can I get the error from the best fit line by using the `polyfit` function? – Kevin Sep 25 '14 at 00:54
  • @Kevin, no need to use `curve_fit`, you can get the residuals directly from `polyfit` as mentioned by both answers at the linked question. – askewchan Sep 25 '14 at 02:22

1 Answer

Mean squared error is one measure.

If you have numpy arrays holding the data values and the line's values at the same points:

numpy.mean((data - line_vals) ** 2)

Edit: to get `line_vals`, if your line has the equation y = mx + b, evaluate it at the x values of the data (0 through 23 here):

line_vals = b + m * numpy.linspace(0, 23, 24)
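
Putting the two snippets together, here is a minimal sketch using the data from the question. It assumes a degree-1 fit, so that `m` and `b` come straight from `numpy.polyfit` (which returns coefficients with the highest power first); the variable names `x`, `y`, `mse`, and `rmse` are only illustrative.

```python
import numpy as np

# Data from the question: x is the index 0..23, y the observed values.
x = np.arange(24)
y = np.array([1717, 1761, 1961, 1711, 1285, 976, 721, 428, 313, 297, 375, 521,
              678, 752, 728, 758, 741, 812, 845, 863, 933, 1169, 1523, 1779],
             dtype=float)

# Degree-1 least-squares fit (assumed here); coefficients come back as [m, b].
m, b = np.polyfit(x, y, 1)

# Values of the fitted line at the same x positions as the data.
line_vals = b + m * x

# Mean squared error, and its square root in the same units as y.
mse = np.mean((y - line_vals) ** 2)
rmse = np.sqrt(mse)
print("MSE:", mse, "RMSE:", rmse)
```

The RMSE is often easier to interpret than the MSE because it is on the same scale as the data.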
user1269942
  • I used the `polyfit` function without having composed my own polynomial function... do you happen to know how can I get the `line_vals` from there? – Kevin Sep 25 '14 at 00:55
  • @Kevin, you could use `np.polyval` on the results from `np.polyfit`. Beware of the mismatch described [here](http://stackoverflow.com/a/18767992/1730674) – askewchan Sep 25 '14 at 02:25
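
Following the comments, a sketch of how `np.polyval` and the `full=True` option of `np.polyfit` could be combined to get fit-quality numbers without writing the polynomial by hand. The degree (`deg = 3`) is an assumption picked only for illustration, and R² is added here as one more commonly used metric; it is not mentioned in the thread. Note that `np.polyfit` returns coefficients highest power first, which is the order `np.polyval` expects.

```python
import numpy as np

# Data from the question.
x = np.arange(24)
y = np.array([1717, 1761, 1961, 1711, 1285, 976, 721, 428, 313, 297, 375, 521,
              678, 752, 728, 758, 741, 812, 845, 863, 933, 1169, 1523, 1779],
             dtype=float)

deg = 3  # assumed polynomial degree, for illustration only

# With full=True, polyfit also returns diagnostic information, including
# the sum of squared residuals of the least-squares fit.
coeffs, residuals, rank, singular_values, rcond = np.polyfit(x, y, deg, full=True)

# Evaluate the fitted polynomial at the data's x positions.
fitted = np.polyval(coeffs, x)

# Sum of squared errors computed by hand; it should agree with residuals[0].
sse = np.sum((y - fitted) ** 2)
mse = sse / len(y)

# Coefficient of determination: 1 - SSE / total sum of squares.
sst = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - sse / sst

print("SSE:", sse, "MSE:", mse, "R^2:", r_squared)
```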