In R, I am trying to overlay a line of best fit (using abline) on a scatter plot of the results of a linear regression. I want to create a scatter plot showing TrainRegRpt$train.data.Price (original price) on the x-axis and TrainRegress$fitted.values (the predicted price that came from the lm model) on the y-axis, and then draw the line of best fit through the plotted points.

Here is some of my code:

TrainRegress <- lm(PriceBH.df$Price ~ ., data=PriceBH.df, subset = train.rows)
TrainRegRpt  <- data.frame(train.data$Price, TrainRegress$fitted.values, TrainRegress$residuals)
x <- as.vector(TrainRegRpt$TrainRegress.fitted.values)    # on the x-axis
y <- as.vector(TrainRegRpt$train.data.Price)     #on the y-axis
plot(TrainRegRpt$train.data.Price ~ TrainRegRpt$TrainRegress.fitted.values)
abline(x,y)

I also tried the following, but the scatter plot came out the same:

p <- as.vector(TrainRegRpt$train.data.Price) # my y-axis in the scatter plot
fv <- as.vector(round(TrainRegRpt$TrainRegress.fitted.values, 2)) # my x-axis in the scatter plot
newdf <- dfrm <- data.frame(p, fv)
x <- as.vector(newdf$fv)
y <- as.vector(newdf$p)
plot(newdf$p ~ newdf$fv)
abline(x,y)
summary(TrainRegress)

The following coefficients were obtained from the summary of TrainRegress:

Coefficient    Estimate
(Intercept)    30.318
CRIM           0.245
CHAS           5.8368
RM             8.4846

I extracted the y-intercept as follows:

y.interceptval <-summary(TrainRegress)$coefficients[1]

I plan to pass y.interceptval to abline(y.interceptval, slope), but I need to know how to calculate the slope. How do I calculate the slope to pass to abline(y.interceptval, slope)?

I have 5 textbooks here that are no help and my professor refuses to help me and I really want this to be perfect! Thank you!!!


    Hi Andrea, welcome to Stack Overflow. It will be much easier to help if you provide a reproducible example with code that can be run by others. One step towards this is to provide at least a sample of your data with `dput(TrainRegRpt[1:20,])`. You can edit your question and paste the output. You can surround it with three backticks (```) for better formatting. See [How to make a reproducible example](https://stackoverflow.com/questions/5963269/) for more info. – Ian Campbell May 12 '20 at 05:03
    Hi Ian, I provided code, is that not enough? How much of my code should I include? From reading in the data file? Including the splitting of records into training and validation? I just included the lm() function I ran and the results I based the rest of my code on. I am new at this, so any guidance you can offer would be great! – Andrea Whittaker May 12 '20 at 05:11
    The problem isn't that we don't have access to your code, it's that we can't *run* your code because we don't have the required data objects. For example, I can't run your `lm` call because I don't have `PriceBH.df`. I can't run your scatter plot because I don't have access to `newdf`. Sometimes providing the data isn't feasible, so an alternative approach is to make simulated data that matches your data closely enough that we can run your code. This is covered in the first answer to the question in the link I provided above. In sum, to be able to help, we should be able to run your code. – Ian Campbell May 12 '20 at 05:15

1 Answer

It looks like you have already calculated your slopes. In a linear regression fitted with lm(), the coefficient of each predictor is its slope, and the (Intercept) coefficient, 30.318 in this case, is your Y-intercept.

This gives you a regression equation of:

Y = 30.318 + 0.245*(CRIM) + 5.8368*(CHAS) + 8.4846*(RM)

The numbers 0.245, 5.8368, and 8.4846 are the coefficients for each variable and they are also the individual slopes.
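As a rough illustration of reading the intercept and slopes off a fitted model, here is a minimal sketch. The data frame `sim.df` and all of its values are simulated stand-ins (an assumption, since the original `PriceBH.df` is not available); only the column names `CRIM`, `CHAS`, `RM` and `Price` come from the question.

```r
# Simulated stand-in for the asker's data (hypothetical values; only the
# column names CRIM, CHAS, RM and Price are taken from the question).
set.seed(1)
n <- 100
sim.df <- data.frame(
  CRIM = runif(n, 0, 10),
  CHAS = rbinom(n, 1, 0.2),
  RM   = rnorm(n, 6, 0.7)
)
sim.df$Price <- 30 + 0.25 * sim.df$CRIM + 5.8 * sim.df$CHAS +
  8.5 * sim.df$RM + rnorm(n, sd = 2)

fit <- lm(Price ~ ., data = sim.df)

b <- coef(fit)                     # named vector: (Intercept), CRIM, CHAS, RM
intercept <- b[["(Intercept)"]]    # the Y-intercept
slopes <- b[-1]                    # one slope per predictor

# Rebuild the fitted values by hand from the regression equation:
# intercept + sum over predictors of (slope * predictor value)
manual <- intercept + as.matrix(sim.df[, names(slopes)]) %*% slopes
all.equal(as.vector(manual), unname(fitted(fit)))
```

Because there is one slope per predictor, a model with several predictors has no single slope that could be handed to abline() together with the intercept.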

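Also, one thing about your actual vs. fitted values plot: the arguments to abline() look wrong. abline(a, b) expects a single intercept a and a single slope b (or a fitted model object), so passing the plotted vectors, as in abline(x,y), will not draw the line of best fit.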

Edit: You used abline(x,y), but your plotted data are

plot(TrainRegRpt$train.data.Price ~ TrainRegRpt$TrainRegress.fitted.values)

(train.data.Price vs. Fitted Values not x vs y).
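One possible way to draw the line through an actual vs. fitted plot is to fit a simple regression on the two plotted variables and pass that model to abline(). The sketch below uses simulated data (the `sim.df` data frame and the names `actual`, `fitted.vals` and `line.fit` are hypothetical, not taken from the question), so treat it as an illustration under those assumptions rather than a drop-in fix.

```r
# Simulated stand-in data (hypothetical values and object names)
set.seed(2)
n <- 100
sim.df <- data.frame(CRIM = runif(n, 0, 10), RM = rnorm(n, 6, 0.7))
sim.df$Price <- 30 + 0.25 * sim.df$CRIM + 8.5 * sim.df$RM + rnorm(n, sd = 2)

fit <- lm(Price ~ ., data = sim.df)
actual <- sim.df$Price
fitted.vals <- fitted(fit)

# Actual price on the y-axis, fitted values on the x-axis
plot(actual ~ fitted.vals, xlab = "Fitted values", ylab = "Actual price")

# Regress the plotted y on the plotted x and hand the model to abline(),
# which reads the intercept and slope from it
line.fit <- lm(actual ~ fitted.vals)
abline(line.fit, col = "red")

coef(line.fit)  # intercept and slope of the drawn line
```

For a least-squares model that includes an intercept, regressing the actual values on the fitted values gives an intercept of 0 and a slope of 1 (up to floating-point error), so abline(0, 1) should draw the same line on this kind of plot.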

samrizz4