We are using Bayesian models to predict NBA all-star selection from various performance stats. There are 24 all-stars selected each year. Unfortunately, we can't find a way to make our prediction model respect this: it predicts either too few or too many all-stars. All-star status is included in the data as a binary column (1 = the player makes the all-star team, 0 = the player does not).

Example of the code:

predict(fitBN, response = targetVar, newdata = testSet, predictors = names(test)[-col.target.var])

Is there any way, or any argument, to force the predict() function to predict exactly 24 all-star players?

milanDD
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Does your data have missing values? – MrFlick May 17 '19 at 14:51
  • It seems that your model is not structured properly. You can't predict in a binary sense if a player will be selected to the All Star team, you can only project out the probability that he will be selected conditioned on the values of the input data set. That output would give you a rank order of players from highest to lowest based on the probability of being selected. You then select the top 24 to project out your team. That is the simplest case of course since secondary selection factors related to player count by position come into play. – SteveM May 17 '19 at 14:57
  • @SteveM Any way to display these probabilities from the predict() function? – milanDD May 17 '19 at 15:16
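
The ranking approach suggested in the comments can be sketched in base R. This is a minimal illustration, not a confirmed solution: `prob_allstar` is a hypothetical vector of simulated values standing in for each test-set player's predicted probability of selection, and `n_select` and `pred_allstar` are names made up for the example. How to get real probabilities depends on the package, which the question does not name. If `fitBN` happens to be a gRain network, the gRain documentation describes a `type = "distribution"` argument to `predict()` that returns class probabilities instead of hard labels, but that is an assumption about the setup.

```r
# Hypothetical stand-in for the predicted P(all-star = 1) of each player
# in the test set; in practice these would come from the fitted model.
set.seed(1)
n_players <- 450
prob_allstar <- runif(n_players)

# Number of all-stars to select each year (from the question).
n_select <- 24

# Rank players from most to least likely and keep the first 24 positions.
top_idx <- order(prob_allstar, decreasing = TRUE)[seq_len(n_select)]

# Build the binary prediction: 1 for the 24 top-ranked players, 0 otherwise.
pred_allstar <- integer(n_players)
pred_allstar[top_idx] <- 1L

# Number of predicted all-stars.
sum(pred_allstar)
```

Selecting by rank rather than by a fixed probability cutoff is what guarantees exactly 24 positive predictions. Ties at the boundary are broken by row order here, and positional quotas, as noted in the comments, are not handled.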
