
I am trying to generate a confusion matrix from predicted and actual data. I get an error that the levels are not equal, and the error occurs when both variables are read as factors. When I checked the levels, I concluded that the cause is that the test data has many repeated values and therefore fewer levels than the predicted values, which are all unique. Is there a way to force the levels of the test data to match those of the predictions?

confusionMatrix(as.factor(sale.pred),as.factor(housing.test.df$SalePrice))

sale.pred holds the forecasted values and housing.test.df$SalePrice holds the actual values. As stated, sale.pred has no duplicate values, so its number of levels equals the number of rows n, whereas housing.test.df$SalePrice has duplicate values, so its number of levels is less than n.
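One way to give both factors the same levels is to build a single level set from the union of the two vectors and pass it to factor() for each. The sketch below uses small made-up vectors (pred and actual are hypothetical stand-ins for sale.pred and housing.test.df$SalePrice) and base R's table() in place of confusionMatrix so that it runs without extra packages. Note that with continuous predictions almost no predicted value will equal an actual value exactly, so the diagonal of such a table is likely to be close to empty.

```r
# Hypothetical stand-ins for sale.pred and housing.test.df$SalePrice
pred   <- c(101250.4, 149875.9, 151020.2, 204310.7)
actual <- c(100000, 150000, 150000, 200000)

# Build one shared level set from both vectors, then use it for each factor
shared.levels <- sort(union(pred, actual))
pred.f   <- factor(pred,   levels = shared.levels)
actual.f <- factor(actual, levels = shared.levels)

# Both factors now carry exactly the same levels
same.levels <- identical(levels(pred.f), levels(actual.f))

# Cross-tabulate predicted against actual values
tab <- table(Predicted = pred.f, Actual = actual.f)
print(same.levels)
print(dim(tab))
print(sum(diag(tab)))  # count of predictions that equal the actual value exactly
```

With matching levels, the two factors should also be acceptable as the first two arguments of confusionMatrix, although the resulting table has one row and one column per distinct price.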

fabla
B Bow
  • Check the following previously asked questions: https://stackoverflow.com/questions/30002013/error-in-confusion-matrix-the-data-and-reference-factors-must-have-the-same-nu and https://stackoverflow.com/questions/24801452/error-in-confusionmatrix-the-data-and-reference-factors-must-have-the-same-numbe – Nareman Darwish Jan 04 '20 at 20:09
  • Read https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example and look into the https://github.com/tidyverse/reprex package – Bruno Jan 04 '20 at 20:34
  • I tried manually setting the levels to every integer between 60,000 and 755,000, which seems to be the range: sale.pred <- predict(reg, housing.pca); sale.pred <- as.factor(sale.pred); actual <- as.factor(housing.test.df$SalePrice); levels(actual) <- c(60000:755000); levels(sale.pred) <- c(60000:755000); confusionMatrix(sale.pred, actual) but received the error: Error in table(data, reference, dnn = dnn, ...) : attempt to make a table with >= 2^31 elements – B Bow Jan 04 '20 at 20:43
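The size error in the last comment follows from simple arithmetic: 60000:755000 contains 695,001 values, and a square table over that many levels needs far more than 2^31 cells. Separately, assigning to levels() relabels the existing levels by position rather than matching them by value, so it changes what the data says. The sketch below checks the arithmetic and then shows one possible alternative, binning both vectors into the same price bands with cut(). The vectors and the band width of 50,000 are assumptions made for illustration, not values from the question.

```r
# Number of levels implied by 60000:755000 and the cells a square table would need
n.levels <- length(60000:755000)
n.cells  <- as.numeric(n.levels)^2
too.big  <- n.cells > 2^31

# Hypothetical stand-ins for sale.pred and housing.test.df$SalePrice
pred   <- c(101250.4, 149875.9, 251020.2, 204310.7)
actual <- c(100000, 150000, 150000, 200000)

# Assumed band width of 50,000; the same breaks are used for both vectors,
# so both factors get identical levels
breaks <- seq(50000, 800000, by = 50000)
pred.band   <- cut(pred,   breaks = breaks, dig.lab = 6)
actual.band <- cut(actual, breaks = breaks, dig.lab = 6)

# The cross-tabulation now has one row and one column per price band
tab <- table(Predicted = pred.band, Actual = actual.band)
print(n.levels)
print(too.big)
print(dim(tab))
```

Because both factors come from the same breaks, their levels match by construction and the table stays small. Whether price bands are a meaningful way to evaluate the model depends on the goal.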
