I have the following set of data: https://archive.ics.uci.edu/ml/datasets/abalone

I am trying to fit and plot a regression of the whole weight against the diameter.

A scatter plot of the data clearly shows that the relationship is not linear. (I am unable to attach it for some reason.)

Consider a quadratic regression model. I set it up like so:

abalone <- read.csv("abalone.data")
diameter <- abalone$Diameter
diameter2 <- diameter^2
whole <- abalone$Whole.weight

quadraticModel <- lm( whole ~ diameter + diameter2)

This is fine and gives me the following when calling quadraticModel:

Call:
lm(formula = whole ~ diameter + diameter2)

Coefficients:
(Intercept)     diameter    diameter2  
     0.3477      -3.3555      10.4968  

However, when I plot:

abline(quadraticModel)

I get the following warning:

Warning message:
In abline(quadraticModel) :
  only using the first two of 3 regression coefficients

which means that I am getting a straight line plot which isn't what I am aiming for. Can someone please explain to me why this is happening and possible ways around it? I am also having the same issue with cubic plots etc. (They always just plot the first two coefficients.)

Joe
  • `abline` just draws straight lines. Have you tried `plot(quadraticModel)`? – Andrew Gustar Nov 06 '17 at 13:46
  • https://stackoverflow.com/questions/39736847/plot-regression-line-in-r ; https://stackoverflow.com/questions/35663828/plot-multiple-polynomial-regression-curve ; https://stackoverflow.com/questions/26959527/how-to-plot-quadratic-regression-in-r/26959959#26959959 ; https://stackoverflow.com/questions/23334360/plot-polynomial-regression-curve-in-r – user20650 Nov 06 '17 at 13:50
  • @AndrewGustar that plots the residuals against the fitted values by the way – Joe Nov 06 '17 at 14:05

2 Answers

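You cannot use `abline` to plot a fitted polynomial regression, because it only draws straight lines. Instead, sort the diameters, put the fitted values in the same order, and add them to an existing scatter plot with `lines`: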

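# Sort x and reorder the fitted values to match, so the curve is drawn left to right
x <- sort(diameter)
y <- quadraticModel$fitted.values[order(diameter)]
plot(diameter, whole)  # lines() needs an existing plot to draw on
lines(x, y)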
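
Another option is to draw the parabola straight from the fitted coefficients with `curve`. The following is only a minimal sketch: the data are simulated stand-ins for the abalone columns (the generating values are made up, not taken from the real file), and the squared term is written with `I()` inside the formula rather than as a separate `diameter2` vector.

```r
# Simulated stand-in for the abalone columns (hypothetical data, not the real file)
set.seed(1)
diameter <- runif(200, 0.1, 0.65)
whole <- 0.35 - 3.4 * diameter + 10.5 * diameter^2 + rnorm(200, sd = 0.1)

# I() keeps the squared term inside the formula
quadraticModel <- lm(whole ~ diameter + I(diameter^2))
b <- coef(quadraticModel)  # intercept, linear and quadratic coefficients

plot(diameter, whole)
# With add = TRUE, curve() evaluates the expression in x over the existing plot's x range
curve(b[1] + b[2] * x + b[3] * x^2, add = TRUE, col = "red", lwd = 2)
```

Unlike `abline`, this uses all three coefficients, so the line drawn is the fitted quadratic rather than a straight line.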
Henry Navarro

I don't think you're producing a quadratic fit, rather a linear fit using diameter and the squared diameter. Try this instead:

library(stats)

df <- read.csv("abalone.data")
var_names <-
  c(
    "Sex",
    "Length",
    "Diameter",
    "Height",
    "Whole_weight",
    "Shucked_weight",
    "Viscera_weight",
    "Shell_weight",
    "Rings"
  )
colnames(df) <- var_names


fit <- lm(df$Whole_weight ~ poly(df$Diameter, 2))
summary(fit)

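diameter <- df$Diameter
# The formula refers to df$Diameter directly, so predict() would ignore a
# newdata column named x; take the fitted values for the observed diameters.
predicted_weight <- fitted(fit)

plot(diameter, predicted_weight)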

> summary(fit)

Call:
lm(formula = df$Whole_weight ~ poly(df$Diameter, 2))

Residuals:
     Min       1Q   Median       3Q      Max 
-0.66800 -0.06579 -0.00611  0.04590  0.97396 

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)    
(Intercept)            0.828818   0.002054  403.44   <2e-16 ***
poly(df$Diameter, 2)1 29.326043   0.132759  220.90   <2e-16 ***
poly(df$Diameter, 2)2  8.401508   0.132759   63.28   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1328 on 4173 degrees of freedom
Multiple R-squared:  0.9268,    Adjusted R-squared:  0.9267 
F-statistic: 2.64e+04 on 2 and 4173 DF,  p-value: < 2.2e-16
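
To overlay the fitted curve on the raw data instead of plotting only the fitted values, one possible approach is to fit with bare column names and a `data` argument, so that `predict` can match new diameters by name. This is a sketch under assumptions: the data frame below is simulated (hypothetical values, not the real abalone file), and `grid` is just an illustrative name for an evenly spaced set of diameters.

```r
# Simulated stand-in for the abalone data (hypothetical values, not the real file)
set.seed(1)
df <- data.frame(Diameter = runif(200, 0.1, 0.65))
df$Whole_weight <- 0.35 - 3.4 * df$Diameter + 10.5 * df$Diameter^2 +
  rnorm(200, sd = 0.1)

# Bare column names plus data = df let predict() match newdata by column name
fit <- lm(Whole_weight ~ poly(Diameter, 2), data = df)

# Evenly spaced diameters across the observed range
grid <- data.frame(
  Diameter = seq(min(df$Diameter), max(df$Diameter), length.out = 100)
)
grid$Whole_weight <- predict(fit, newdata = grid)

# Scatter plot of the data with the fitted quadratic drawn on top
plot(df$Diameter, df$Whole_weight)
lines(grid$Diameter, grid$Whole_weight, col = "red", lwd = 2)
```

Because the grid is already sorted, no reordering is needed before calling `lines`.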
f.lechleitner
  • This isn't quite right: *I don't think you're producing a quadratic fit*. A *linear fit using diameter and the squared diameter* is a quadratic model. `poly(Diameter, 2, raw=TRUE)` should give the same coefficients as `Diameter + I(Diameter^2)` – user2957945 Nov 06 '17 at 15:25
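
The equivalence mentioned in the comment can be checked with a short sketch. The data here are simulated (hypothetical values, not the real abalone file), and the `fit_*` names are only illustrative.

```r
# Simulated stand-in for the abalone columns (hypothetical data, not the real file)
set.seed(1)
Diameter <- runif(200, 0.1, 0.65)
Whole_weight <- 0.35 - 3.4 * Diameter + 10.5 * Diameter^2 + rnorm(200, sd = 0.1)

fit_I    <- lm(Whole_weight ~ Diameter + I(Diameter^2))        # explicit squared term
fit_raw  <- lm(Whole_weight ~ poly(Diameter, 2, raw = TRUE))   # raw polynomial basis
fit_orth <- lm(Whole_weight ~ poly(Diameter, 2))               # orthogonal polynomial basis

# Same coefficients for the two raw parameterisations (names differ, values do not)
same_coefs <- all.equal(unname(coef(fit_I)), unname(coef(fit_raw)))

# The orthogonal basis changes the coefficients but not the fitted curve
same_fitted <- all.equal(fitted(fit_I), fitted(fit_orth))

print(same_coefs)
print(same_fitted)
```

Both comparisons are expected to print `TRUE`: all three calls fit the same quadratic model, and only the parameterisation of the coefficients differs.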