I'm trying to build a neural network to predict the probability of each tennis player winning a service point when they play against each other. As inputs I would use the last N
matches that each player played, taking for each match the ranking difference against his opponent and the actual probability of winning a point that he had in that match.
For example, looking at only 2 matches for each player, one input would be
i=[-61, 25, 0.62, 0.64, 2, -35, 0.7, 0.65]
The first 4 numbers are for the first player (his 2 ranking differences, then the 2 probabilities he had), the other 4 are for the second player. The output would be
o=[0.65, 0.63]
So training inputs would be X=[i1, i2, i3,...]
and outputs y=[o1, o2, o3,...]
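To make the layout concrete, here is a minimal sketch of how one such input vector could be assembled. It assumes each past match is stored as a (ranking difference, point-winning probability) pair; the names `player_features` and `build_input` are hypothetical and only for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical record for one past match:
# (ranking difference vs. opponent, probability of winning a service point)
Match = Tuple[float, float]


def player_features(matches: Sequence[Match]) -> List[float]:
    """Flatten one player's last N matches: all ranking differences first,
    then all point-winning probabilities."""
    rank_diffs = [m[0] for m in matches]
    probs = [m[1] for m in matches]
    return rank_diffs + probs


def build_input(p1: Sequence[Match], p2: Sequence[Match]) -> List[float]:
    """Concatenate the features of player 1 and player 2 into one input row."""
    return player_features(p1) + player_features(p2)


# The 2-match example from above
i = build_input([(-61, 0.62), (25, 0.64)], [(2, 0.7), (-35, 0.65)])
print(i)
```

With N matches per player, every row has 4*N values, so all rows in X have the same length as long as N is fixed.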
I have a couple of newbie questions:
- is it necessary to normalize inputs (ranks and probabilities respectively) across the entire dataset?
- when I try to run this in Python it says
ValueError: Multioutput target data is not supported with label binarization
Can I make MLPClassifier work with 2 outputs?
EDIT: added some code
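from sklearn.neural_network import MLPClassifier

clf = MLPClassifier(solver='lbfgs', alpha=1e-5,
                    hidden_layer_sizes=(5, 2), random_state=1)

# Each row: 4 values for player 1, then 4 values for player 2
X = [[-61, 25, 0.62, 0.64, 2, -35, 0.7, 0.65],
     [2, -5, 0.58, 0.7, -3, -15, 0.65, 0.52]]
# Each row: point-winning probability for player 1 and player 2
y = [[0.63, 0.64], [0.58, 0.61]]

clf.fit(X, y)  # raises the ValueError quoted above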
This code raises the error mentioned above. The data isn't normalized here, but let's ignore that for now.
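For comparison, here is a hedged sketch of the same data fed to a regressor instead of a classifier. It assumes the error comes from MLPClassifier treating y as class labels, and that predicting two continuous probabilities is really a regression task. It swaps in scikit-learn's MLPRegressor, which accepts a 2-column y, and adds StandardScaler as one possible way to normalize the inputs. The `max_iter` value is an arbitrary choice, not something taken from the code above.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X = np.array([[-61, 25, 0.62, 0.64, 2, -35, 0.7, 0.65],
              [2, -5, 0.58, 0.7, -3, -15, 0.65, 0.52]])
y = np.array([[0.63, 0.64],
              [0.58, 0.61]])

# StandardScaler rescales every input column to zero mean and unit variance,
# so ranking differences and probabilities end up on comparable scales.
# MLPRegressor fits continuous targets and supports several output columns.
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(solver='lbfgs', alpha=1e-5,
                 hidden_layer_sizes=(5, 2), random_state=1,
                 max_iter=1000),  # assumed value, tune as needed
)
model.fit(X, y)

pred = model.predict(X)
print(pred.shape)  # one row per sample, one column per player
```

Note that the regressor's output layer is linear, so predictions are not guaranteed to stay within [0, 1]; clipping them afterwards or modelling a transformed target would be separate design decisions.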