
I tried to run predict() on a selection of columns, but it fails with an error.

Here's the prior code:

data <- subset(training.data.raw,select=c(5,6,7,8,10,12,27))
train <- data[1:800,]
test <- data[801:957,]

model <- glm(Automatable1Y0N ~.,family=binomial(link='logit'),data=train)
anova(model, test="Chisq")

There are no issues with these two lines. But when I ran this predict line:

fitted.results <- predict(model,newdata=subset(test,select=c(5,6,7,8,10,12)),type='response')

I received this error:

Error in `[.data.frame`(x, r, vars, drop = drop) : undefined columns selected

Can someone please help? Thanks.

I tried running the subset command on its own, adding one column at a time:

newdata <-subset(test,select=c(5))

Here are the results. The error came up when I added column 8:

> newdata <-subset(test,select=c(5))
> newdata <-subset(test,select=c(5,6))
> newdata <-subset(test,select=c(5,6,7))
> newdata <-subset(test,select=c(5,6,7,8))
Error in `[.data.frame`(x, r, vars, drop = drop) : 
  undefined columns selected
DP8
  • Probably a variable name mismatch between `model` and `newdata`. But without a reproducible example this is hard to confirm. – Paul Hiemstra Feb 24 '17 at 06:45
  • Thanks for your edit, but this does not help us reproduce your problem on our system. For example, we do not have the `train` and `test` objects. See [here](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for more information. – Paul Hiemstra Feb 24 '17 at 06:52

1 Answer


I see what's causing the error. This command:

data <- subset(training.data.raw,select=c(5,6,7,8,10,12,27))

leaves `data` with only 7 columns, renumbered 1 to 7. So in this command:

fitted.results <- predict(model,newdata=subset(test,select=c(5,6,7,8,10,12)),type='response')

the indices 8, 10 and 12 refer to columns that no longer exist in `test`. Instead of specifying the original column positions, it should now be

select=c(1,2,3,4,5,6,7)

because the dataset now has 7 columns. Thanks for providing feedback though, @Paul.
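
The renumbering can be reproduced with a small made-up data frame. This is only a sketch under assumptions: `raw`, its columns `V1` to `V30`, the random values and the row counts are hypothetical stand-ins for `training.data.raw`, and `V27` plays the role of `Automatable1Y0N`.

```r
# Hypothetical stand-in for training.data.raw: 100 rows, 30 columns V1..V30
set.seed(1)
raw <- as.data.frame(matrix(rnorm(100 * 30), ncol = 30))
names(raw) <- paste0("V", 1:30)
raw$V27 <- rbinom(100, 1, 0.5)  # binary response, stands in for Automatable1Y0N

# Keep 7 of the 30 columns; in `data` they are renumbered 1..7
data <- subset(raw, select = c(5, 6, 7, 8, 10, 12, 27))
ncol(data)   # 7
names(data)  # "V5" "V6" "V7" "V8" "V10" "V12" "V27"

train <- data[1:80, ]
test  <- data[81:100, ]

model <- glm(V27 ~ ., family = binomial(link = "logit"), data = train)

# Selecting by the original positions fails: `test` has no column 8
res <- try(subset(test, select = c(5, 6, 7, 8)), silent = TRUE)
inherits(res, "try-error")

# predict() looks up the predictors in newdata by name,
# so `test` can be passed without any positional subsetting
fitted.results <- predict(model, newdata = test, type = "response")
length(fitted.results)
```

Passing `test` directly, or selecting columns by name rather than by position, sidesteps the index bookkeeping altogether.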

DP8