
My model was chosen using a dataset with 180 observations; 4 outliers were then removed, leaving 176. I have to use it to predict on a test set with 82 observations, but R keeps displaying:

"warning: newdata has 82 rows but variables found has 176 rows".

How do I fix this?

Here is some of the code. I didn't post all of it because most of it isn't relevant to the question. Thanks in advance!

OUTLIERS(XDATA=cbind(X3,X4,X5,X6,X9,X10),YDATA=Y)
#greatest outliers are 138, 161, 37, 116
#37 and 138 are very influential

#create dummy variables associated with factors
X2.=double(length(X2)) 
X2.[X2==2]=1
detach(diabetes)

data.=cbind(X2.,X3,X4,X5,X6,X9)
head(data.)
dim(data.)

lm(Y~data.)
fit9=lm(Y[c(-138,-161,-37,-116)]~data.[c(-138,-161,-37,-116),])

summary(fit9)

predictionA=predict(fit9,dataset$D.test) 
predictionA
Zheyuan Li
Jono
  • Please clean up your code to make it reproducible and easily readable. For example, you don't provide the variables `X3`, `X4`, `X5`, `X6`, `X9`, `X10`, `Y`, `X2`. You detach the `diabetes` dataset even though it doesn't appear to have been used. – Richie Cotton Sep 07 '14 at 12:20
  • In the meantime, while you fix your code, look at the help page for `predict.lm`. The `newdata` argument should be a data frame, not a vector. It isn't clear where the `dataset` argument comes from, but you might want to pass that, rather than just its `D.test` column. – Richie Cotton Sep 07 '14 at 12:23
  • When using `predict`, you need to pass in a new data.frame with exactly the same column names as were used during the fit. If you're going to use the formula syntax, then you should use proper column names; otherwise use `lm.fit` or something similar. Also, it is extremely confusing to read variable names that end in `.`. I'd strongly encourage you to use some other naming convention. If you posted a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example), we could offer more specific suggestions. We should be able to paste the code into R. – MrFlick Sep 07 '14 at 14:04
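
The advice in the comments can be sketched roughly as follows. This is only an illustration under stated assumptions: the data are synthetic, and the data frames `train` and `test`, the vector `drop_rows`, and the choice of just two predictors are hypothetical stand-ins, since the original `diabetes` data and `dataset$D.test` were not posted. The likely cause of the warning is that a formula such as `Y[...] ~ data.[...]` contains no plain column names, so `predict` cannot find the predictors in `newdata` and falls back to the 176-row objects used for fitting.

```r
# Synthetic stand-in for the 180-row training data (hypothetical values).
set.seed(1)
train <- data.frame(X3 = rnorm(180), X4 = rnorm(180))
train$Y <- 1 + 2 * train$X3 - train$X4 + rnorm(180, sd = 0.1)

# Synthetic stand-in for the 82-row test set, with the same predictor names.
test <- data.frame(X3 = rnorm(82), X4 = rnorm(82))

# Remove the four flagged observations from the data frame itself,
# rather than subsetting inside the formula.
drop_rows <- c(37, 116, 138, 161)
fit <- lm(Y ~ X3 + X4, data = train[-drop_rows, ])

# predict() now looks up X3 and X4 by name in newdata,
# so it returns one prediction per test row.
pred <- predict(fit, newdata = test)
length(pred)
```

Because the formula refers to columns by name and the rows are dropped through the `data` argument, the fitted model carries no reference to objects of a fixed length, and `newdata` can have any number of rows.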
