Questions tagged [linear-regression]

For issues related to the linear regression modelling approach.

Linear regression formalizes relationships between variables as mathematical equations. It describes how one or more random variables are related to one or more other variables. The relationship between the variables is stochastic rather than deterministic.

Example

Height and age are probabilistically distributed over humans, and they are stochastically related. Knowing that a person is 30 years old changes the probability that this person is 4 feet tall. Likewise, knowing that a person is 13 years old changes the probability that this person is 6 feet tall.

Model 1

$\text{height}_i = b_0 + b_1\,\text{age}_i + \varepsilon_i$, where $b_0$ is the intercept, $b_1$ is a parameter that age is multiplied by to get a prediction of height, $\varepsilon$ is the error term, and $i$ is the subject
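As a rough illustration of how b0 and b1 in Model 1 could be estimated, here is a minimal sketch of ordinary least squares for a single predictor, using only the Python standard library. The function name `fit_simple_ols` and the data are hypothetical; the data are generated without an error term so the fit recovers the coefficients used to build them.

```python
def fit_simple_ols(x, y):
    """Return (b0, b1) minimising the sum of squared residuals of y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical, noise-free data (assumption): height = 80 + 6 * age.
ages = [2, 4, 6, 8, 10]
heights = [80 + 6 * a for a in ages]

b0, b1 = fit_simple_ols(ages, heights)
print(b0, b1)
```

With real measurements the error term is not zero, so the estimates would only approximate the underlying coefficients.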

Model 2

$\text{height}_i = b_0 + b_1\,\text{age}_i + b_2\,\text{sex}_i + \varepsilon_i$, where the variable sex is dichotomous
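Model 2 can be sketched in the same spirit. The example below assumes that sex is dummy-coded as 0/1 and estimates b0, b1 and b2 by solving the normal equations X'X b = X'y with a small Gauss-Jordan routine. The helper names (`solve_linear_system`, `fit_ols`) and the noise-free data are hypothetical, and this is one possible way to compute the fit, not how any particular library does it.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Least-squares coefficients via the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical, noise-free data (assumption): height = 80 + 6 * age + 3 * sex.
ages = [2, 4, 6, 8, 10, 12]
sexes = [0, 1, 0, 1, 0, 1]  # dichotomous variable, dummy-coded as 0/1
heights = [80 + 6 * a + 3 * s for a, s in zip(ages, sexes)]

# Each design-matrix row is [1, age, sex]; the leading 1 carries the intercept.
design = [[1.0, a, s] for a, s in zip(ages, sexes)]
b0, b1, b2 = fit_ols(design, heights)
print(b0, b1, b2)
```

With this coding, b2 is the predicted height difference between the two groups at the same age.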

In linear regression, a response Y is modelled as a linear function of the user data X, and the unknown model parameters W are estimated or learned from the data. For example, a linear regression model for k-dimensional user data can be represented as:

$Y = w_1 x_1 + w_2 x_2 + \dots + w_k x_k$
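To make "learned from the data" concrete, the following is a minimal sketch of batch gradient descent on the mean squared error for the model above (no intercept term, matching the formula). The function names, learning rate, iteration count and data are all assumptions chosen for illustration; the data are noise-free so the learned weights should approach the ones used to generate them.

```python
def predict(w, x):
    """Linear prediction Y = w1*x1 + ... + wk*xk."""
    return sum(wj * xj for wj, xj in zip(w, x))


def fit_gradient_descent(rows, y, learning_rate=0.1, n_iter=2000):
    """Estimate the weights by batch gradient descent on the mean squared error."""
    n = len(rows)
    k = len(rows[0])
    w = [0.0] * k
    for _ in range(n_iter):
        residuals = [predict(w, r) - yi for r, yi in zip(rows, y)]
        # Gradient of (1/n) * sum(residual^2) with respect to each weight.
        grad = [
            (2.0 / n) * sum(res * r[j] for res, r in zip(residuals, rows))
            for j in range(k)
        ]
        w = [wj - learning_rate * gj for wj, gj in zip(w, grad)]
    return w


# Hypothetical, noise-free data (assumption) generated with weights [2, -1].
rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
true_w = [2.0, -1.0]
y = [predict(true_w, r) for r in rows]

w = fit_gradient_descent(rows, y)
print(w)
```

The learning rate has to be small enough for the iteration to converge; the value used here suits this tiny data set and is not a general recommendation.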

Reading

Statistical Modeling: The Two Cultures http://projecteuclid.org/download/pdf_1/euclid.ss/1009213726

In R, the scientific software for statistical computing and graphics, the function lm (see lm) implements linear regression.

6517 questions

313

votes

10 answers

Add regression line equation and R^2 on graph

I wonder how to add regression line equation and R^2 on the ggplot. My code is: library(ggplot2) df <- data.frame(x = c(1:100)) df$y <- 2 + 3 * df$x + rnorm(100, sd = 40) p <- ggplot(data = df, aes(x = x, y = y)) + geom_smooth(method =…

r ggplot2 linear-regression r-faq ggpmisc

asked Sep 26 '11 at 00:52

MYaseen208

22,666
37
165
309

279

votes

15 answers

What is the difference between linear regression and logistic regression?

When we have to predict the value of a categorical (or discrete) outcome we use logistic regression. I believe we use linear regression to also predict the value of an outcome given the input values. Then, what is the difference between the two…

machine-learning data-mining linear-regression

asked Aug 27 '12 at 17:49

London guy

27,522
44
121
179

230

votes

7 answers

How to do exponential and logarithmic curve fitting in Python? I found only polynomial fitting

I have a set of data and I want to compare which line describes it best (polynomials of different orders, exponential or logarithmic). I use Python and Numpy and for polynomial fitting there is a function polyfit(). But I found no such functions for…

python numpy scipy curve-fitting linear-regression

asked Aug 08 '10 at 07:36

Tomas Novotny

7,547
9
26
23

185

votes

6 answers

Adding a regression line on a ggplot

I'm trying hard to add a regression line on a ggplot. I first tried with abline but I didn't manage to make it work. Then I tried this... data =…

r ggplot2 regression linear-regression

asked Mar 26 '13 at 09:40

Remi.b

17,389
28
87
168

160

votes

6 answers

How to force R to use a specified factor level as reference in a regression?

How can I tell R to use a certain level as reference if I use binary explanatory variables in a regression? It's just using some level by default. lm(x ~ y + as.factor(b)) with b {0, 1, 2, 3, 4}. Let's say I want to use 3 instead of the zero that…

r regression linear-regression categorical-data dummy-variable

asked Oct 06 '10 at 11:46

Matt Bannert

27,631
38
141
207

153

votes

15 answers

Multiple linear regression in Python

I can't seem to find any python libraries that do multiple regression. The only things I find only do simple regression. I need to regress my dependent variable (y) against several independent variables (x1, x2, x3, etc.). For example, with this…

python numpy statistics scipy linear-regression

asked Jul 13 '12 at 22:14

Zach

4,624
13
43
60

133

votes

10 answers

Linear Regression and group by in R

I want to do a linear regression in R using the lm() function. My data is an annual time series with one field for year (22 years) and another for state (50 states). I want to fit a regression for each state so that at the end I have a vector of lm…

r regression linear-regression lm

asked Jul 23 '09 at 04:00

JD Long

59,675
58
202
294

113

votes

8 answers

Linear regression with matplotlib / numpy

I'm trying to generate a linear regression on a scatter plot I have generated, however my data is in list format, and all of the examples I can find of using polyfit require using arange. arange doesn't accept lists though. I have searched high and…

python numpy matplotlib linear-regression curve-fitting

asked May 27 '11 at 05:32

user771224

111

votes

8 answers

Accuracy Score ValueError: Can't Handle mix of binary and continuous target

I'm using linear_model.LinearRegression from scikit-learn as a predictive model. It works and it's perfect. I have a problem to evaluate the predicted results using the accuracy_score metric. This is my true Data : array([1, 1, 0, 0, 0, 0, 1, 1, 0,…

python machine-learning scikit-learn linear-regression prediction

asked Jun 24 '16 at 13:57

Arij SEDIRI

2,088
7
25
43

votes

8 answers

How to overplot a line on a scatter plot in python?

I have two vectors of data and I've put them into pyplot.scatter(). Now I'd like to over plot a linear fit to these data. How would I do this? I've tried using scikitlearn and np.polyfit().

python numpy matplotlib linear-regression scatter-plot

asked Sep 28 '13 at 16:05

goldisfine

4,742
11
59
83

votes

4 answers

Linear regression analysis with string/categorical features (variables)?

Regression algorithms seem to be working on features represented as numbers. For example: This data set doesn't contain categorical features/variables. It's quite clear how to do regression on this data and predict price. But now I want to do a…

python machine-learning regression linear-regression feature-selection

asked Nov 30 '15 at 20:21

Erba Aitbayev

4,167
12
46
81

votes

4 answers

why gradient descent when we can solve linear regression analytically

What is the benefit of using gradient descent in the linear regression space? It looks like we can solve the problem (finding the theta0-n that minimize the cost function) with an analytical method, so why do we still want to use gradient descent to do the same…

machine-learning linear-regression gradient-descent

asked Aug 12 '13 at 16:18

John

2,107
3
22
39

votes

6 answers

How to get a regression summary in scikit-learn like R does?

As an R user, I wanted to also get up to speed on scikit. Creating a linear regression model(s) is fine, but can't seem to find a reasonable way to get a standard summary of regression output. Code example: # Linear Regression import numpy as…

python r scikit-learn linear-regression summary

asked Oct 11 '14 at 21:04

mpg

3,679
8
36
45

votes

5 answers

gradient descent using python and numpy

def gradient(X_norm,y,theta,alpha,m,n,num_it): temp=np.array(np.zeros_like(theta,float)) for i in range(0,num_it): h=np.dot(X_norm,theta) #temp[j]=theta[j]-(alpha/m)*( np.sum( (h-y)*X_norm[:,j][np.newaxis,:] ) ) …

python numpy machine-learning linear-regression gradient-descent

asked Jul 22 '13 at 09:55

Madan Ram

votes

1 answer

How to calculate the 95% confidence interval for the slope in a linear regression model in R

Here is an exercise from Introductory Statistics with R: With the rmr data set, plot metabolic rate versus body weight. Fit a linear regression model to the relation. According to the fitted model, what is the predicted metabolic rate for a body…

r statistics linear-regression confidence-interval

asked Mar 02 '13 at 22:09

Yu Fu

1,151
1
8
15
