Getting message - Data cannot have more levels than the reference - for titanic dataset

Question

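# Randomly split `data` into training and test sets.
# p is the fraction of rows used for training, s is the random seed.
split.data = function(data, p = 0.7, s = 666) {
    set.seed(s)
    n = nrow(data)
    index = sample(1:n)            # shuffled row indices
    n.train = floor(n * p)         # number of rows in the training set
    train = data[index[1:n.train], ]
    # Start at n.train + 1 so no row is dropped between the two sets
    test = data[index[(n.train + 1):n], ]
    return(list(train = train, test = test))
}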

allset= split.data(train.data, p = 0.7)  
trainset = allset$train  
testset = allset$test

train.ctree = ctree(Survived ~ Pclass + Sex + Age + SibSp + Fare
                + Parch + Embarked, data=trainset)  
ctree.predict = predict(train.ctree, testset)
confusionMatrix(ctree.predict, testset$Survived)

This code predicts passenger survival on the Titanic dataset. The number of levels in the predictions does not match the number of levels in the test set's `Survived` column: the predicted probabilities are not rounded off, so each distinct value ends up as its own level.
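One plausible reading of this symptom, assuming `Survived` is stored as numeric 0/1, is that the tree is fitted as a regression and returns numeric values rather than class labels. The base-R sketch below uses made-up toy vectors (`reference` and `numeric.pred` are hypothetical, not taken from the Titanic data) to show how such values turn into extra levels, and how thresholding with a shared level set brings the two sides back in line.

```r
# Toy reference labels stored as numeric 0/1 (assumed, for illustration only).
reference <- c(0, 1, 1, 0, 1, 0)

# Hypothetical numeric predictions, standing in for what a regression tree
# returns when the response is numeric.
numeric.pred <- c(0.12, 0.87, 0.87, 0.12, 0.55, 0.12)

# Coerced to a factor, every distinct predicted value becomes its own level,
# so the predictions have more levels than the reference.
nlevels(factor(numeric.pred))   # 3
nlevels(factor(reference))      # 2

# Threshold the predictions at 0.5 and give both vectors the same level set.
lv <- c("0", "1")
pred.class <- factor(ifelse(numeric.pred > 0.5, 1, 0), levels = lv)
ref.class  <- factor(reference, levels = lv)

# With matching levels, a 2 x 2 cross-tabulation is well defined.
cm <- table(Prediction = pred.class, Reference = ref.class)
print(cm)
```

If that assumption holds for the real data, converting the response to a factor before splitting and fitting (for example `train.data$Survived = factor(train.data$Survived)`) would likely make the tree a classifier whose predictions already share the reference levels, with no thresholding needed.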

You should provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input data so we can run the code to see what's happening. Make sure to explicitly include any non-default packages you are using in your code. I assume you're using `party` for `ctree`? — MrFlick, Oct 25 '15 at 18:07
