Hello, I'm learning linear regression, and I want to draw a linear regression graph from data that I made.

Suppose the data looks like this:

one_cycle = [(0, 401.92), (5, 103.62), (7, 62.8), (8, 28.26), (10, 10.55)]

I used statsmodels.api.OLS and got the regression results.

import numpy
import statsmodels.api

def basic_regression(one_cycle):
    # one_cycle would be [(0, 100), (1, 75), ...]
    X, Y = [x[0] for x in one_cycle], [x[1] for x in one_cycle]

    X = numpy.array(X).T
    X = statsmodels.api.add_constant(X)
    results = statsmodels.api.OLS(Y, X).fit()

    return results

Then, when I draw a graph of the results,

import matplotlib.pyplot as plt
import statsmodels.api

def draw(results):
    fig, ax = plt.subplots()
    fig = statsmodels.api.graphics.plot_fit(results, 0, ax=ax)
    ax.ticklabel_format(useOffset=False)
    plt.show()

The resulting graph is not what I expected. This is the image link; I can't upload the image.

Instead, I expected a graph like this one (source: sourceforge.net).

How can I draw a graph like that? Thank you, and have a good day.

dizwe
    I think what you want is to plot with respect to the variable in the second column of X instead of the constant in the first column. i.e. use 1 for the x_var index `statsmodels.api.graphics.plot_fit(results, 1, ax=ax)` – Josef Jan 22 '17 at 14:51

1 Answer

The problem occurs in the lines

X = numpy.array(X).T
X = statsmodels.api.add_constant(X)

After converting X to a numpy array, you pass it to add_constant, which prepends a column of ones. From this point on, X is a 2D array with ones in the first column.

The solution is to remove those two lines entirely.

ImportanceOfBeingErnest
  • **Don't** do that. Without adding a constant the regression will be through the origin. – Josef Jan 22 '17 at 14:49
  • @user333700 I don't see **any** difference between adding the constant or not, even if `0` is not in the data. Could you explain the difference and in which case that would matter? – ImportanceOfBeingErnest Jan 22 '17 at 15:05
  • add_constant is a FAQ for statsmodels because it doesn't follow the "automatic" constant tradition of most packages. See for example http://stackoverflow.com/questions/38836465/how-to-get-the-regression-intercept-using-statsmodels-api/38838570#38838570 and http://stackoverflow.com/questions/11495051/difference-in-python-statsmodels-ols-and-rs-lm – Josef Jan 22 '17 at 15:31