I am trying to use the predict function to get predictions from a logistic regression on a test set, but the result has the wrong number of rows. A similar question has already been asked ("R Warning: 'newdata' had 15 rows but variables found have 22 rows"),

and I have tried the approach suggested there, but I still get the warning. Here is the code:

# Split as training and test sets
train_idx <- trainTestSplit(adult,trainPercent=75,seed=1111)
train <- adult[train_idx, ]
test <- adult[-train_idx, ]


xtrain <- train[,1:7]
ytrain <- train[,8]
xtrain1 <- dummy.data.frame(xtrain, sep = ".")
xtrain2 <- as.matrix(xtrain1)

xtest <- test[,1:7]
ytest <- test[,8]
xtest1 <- dummy.data.frame(xtest, sep = ".")
xtest2 <- as.matrix(xtest1)

fit=glm(ytrain~xtrain2,family=binomial)
a=predict(fit,newdata=xtrain1,type="response")
b=ifelse(a>0.5,1,0)
confusionMatrix(b,ytrain)
Confusion Matrix and Statistics

          Reference
Prediction     0     1
         0 16065  3157
         1   968  2430

               Accuracy : 0.8176          
                 95% CI : (0.8125, 0.8227)
# Predict with test dataframe
a=predict(fit,xtest1,type="response")

: 'newdata' had 7541 rows but variables found have 22620 rows 
2: In predict.lm(object, newdata, se.fit, scale = 1, type = ifelse(type ==  :
  prediction from a rank-deficient fit may be misleading
> 

I also tried

    names(xtest1)=names(xtrain1)
    a=predict(fit,xtest1,type="response")

The column names were already identical, and I still get the same warning. This behavior is very counterintuitive. Please help.

Tinniam V. Ganesh

1 Answer

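I changed the fit to pass a data frame through the 'data' argument of glm, with a formula that names the response column, instead of passing a separate matrix and y vector, and now it works. With the original formula ytrain~xtrain2, the only predictor the model knows about is an object called xtrain2. predict() looks for xtrain2 in newdata, does not find it, and falls back to the xtrain2 matrix in the workspace, which has the 22620 training rows. That is why the test data was ignored.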

adult1 <- dummy.data.frame(adult, sep = ".")

train_idx <- trainTestSplit(adult1,trainPercent=75,seed=1111)
train <- adult1[train_idx, ]
test <- adult1[-train_idx, ]

fit=glm(salary~.,family=binomial,data=train)
a=predict(fit,newdata=train,type="response")
b=ifelse(a>0.5,1,0)
confusionMatrix(b,train$salary)


m=predict(fit,newdata=test,type="response")
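
As a rough illustration of the difference between the two patterns, here is a minimal sketch on simulated data rather than the adult dataset. The names dat, fit_bad, fit_ok, p_bad and p_ok are made up for this example, and it uses only base R.

```r
# Simulated binary-response data (an assumption for illustration only).
set.seed(42)
n   <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(0.8 * dat$x1 - 0.5 * dat$x2))

train <- dat[1:150, ]
test  <- dat[151:200, ]

# Pattern from the question: free-standing response vector and predictor matrix.
ytrain  <- train$y
xtrain2 <- as.matrix(train[, c("x1", "x2")])
fit_bad <- glm(ytrain ~ xtrain2, family = binomial)

# newdata has no column named "xtrain2", so predict() reuses the xtrain2
# matrix from the workspace and warns about the row mismatch.
p_bad <- suppressWarnings(predict(fit_bad, newdata = test, type = "response"))
length(p_bad) == nrow(train)   # one prediction per training row

# Pattern from the answer: formula with column names plus the data argument.
fit_ok <- glm(y ~ x1 + x2, family = binomial, data = train)
p_ok   <- predict(fit_ok, newdata = test, type = "response")
length(p_ok) == nrow(test)     # one prediction per test row

# Test-set confusion table using base R only.
table(predicted = as.integer(p_ok > 0.5), actual = test$y)
```

Because fit_ok refers to its predictors by column name, any data frame containing x1 and x2 can be passed as newdata.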
Tinniam V. Ganesh