
I have run a series of multiple linear regression models and am producing diagnostic plots using the method and code from this post (http://www.r-bloggers.com/checking-glm-model-assumptions-in-r/).

I have no more than 53 data points in any model, yet some of the outliers in the regression plots carry labels above 53, ranging from 58 to 107. Do the labels of outliers or influential points in the regression plots not correspond to individual data points? If not, what do the labels mean, and how do I know which of my data points are the outliers? I have counted the data points in my plots, and none of the plots has more than 53.

I have attached a screenshot of my regression plot output. There are 53 points in this plot, yet two of the notable points are labeled 90 and 106.

[Image: regression plot example]

steveb
Z. Doe
  • When asking questions, it helps to include [reproducible examples](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) so we can verify what's wrong rather than guess. Did you subset your data at some point before fitting the model? The labels may be the row names of the data.frame and not the row indexes. – MrFlick Feb 23 '16 at 20:40

1 Answer

plot.lm labels the points with the corresponding row names:

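set.seed(42)
# Small example data set with letters as row names
DF <- data.frame(x = 1:5, y = 2 + 3 * 1:5 + rnorm(5))
rownames(DF) <- letters[1:5]
# Turn the third observation (row "c") into an obvious outlier
DF$y[3] <- 1e3

mod <- lm(y ~ x, data = DF)
# Draw the first four diagnostic plots in a 2 x 2 grid
par(mfrow = c(2,2))
plot(mod, 1:4)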

resulting plot
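As a rough check of the claim above, the sketch below rebuilds the same toy data and confirms that the names attached to the residuals are the row names of the data frame, which is where the point labels come from by default. The variable name `worst` is only illustrative.

```r
set.seed(42)
DF <- data.frame(x = 1:5, y = 2 + 3 * 1:5 + rnorm(5))
rownames(DF) <- letters[1:5]
DF$y[3] <- 1e3

mod <- lm(y ~ x, data = DF)

# The residuals are named after the row names of the data frame
names(residuals(mod))

# Row name of the observation with the largest absolute residual
worst <- names(which.max(abs(residuals(mod))))
worst

# Look that observation up by its row name (a character index)
DF[worst, ]
```

Indexing with a character string such as `DF[worst, ]` selects by row name, whereas an integer index selects by position, so the two can point at different rows once the row names no longer run from 1 to n.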

Roland
  • Roland, I appreciate your feedback but still do not understand. I am referring to the labels near the outlying circles on my graphs. For example on the Residuals vs. Fitted graph where it says 90 near a point in the upper left. What does this 90 mean? I only have 53 data points in this model. – Z. Doe Feb 23 '16 at 18:20
  • I'm not sure what you don't understand. The 90 is the row name of that observation. Apparently your data.frame is the result of a subsetting operation. – Roland Feb 24 '16 at 07:24
  • E.g., have a look at `DF["90",]` (where `DF` is your data.frame). – Roland Feb 24 '16 at 07:28
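
To illustrate the subsetting explanation from the comments, here is a minimal sketch with made-up data. The names `full`, `sub`, `labels_before`, and `extreme` are hypothetical, and taking every second row is only an assumed stand-in for whatever subsetting was actually done. It shows how a 53-row data frame can carry row names up to 106, how to look up the labeled observations, and how to reset the row names if labels from 1 to 53 are preferred.

```r
set.seed(1)
# Hypothetical full data set; default row names are "1" to "110"
full <- data.frame(x = 1:110, y = rnorm(110))

# Keep every second row up to row 106: 53 rows remain,
# but they keep their original row names "2", "4", ..., "106"
sub <- full[seq(2, 106, by = 2), ]
nrow(sub)
labels_before <- rownames(sub)
range(as.integer(labels_before))

mod <- lm(y ~ x, data = sub)

# Row names of the three observations with the largest absolute residuals
extreme <- names(sort(abs(residuals(mod)), decreasing = TRUE))[1:3]
sub[extreme, ]

# Optional: reset the row names to 1..53 and refit,
# so that labels match row positions
rownames(sub) <- NULL
mod2 <- lm(y ~ x, data = sub)
head(names(residuals(mod2)))
```

Resetting the row names loses the link back to the original, unsubsetted data, so it may be preferable to keep them and index by name as in the comment above.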