neural network with multiple outputs in sklearn

Question

I'm trying to build a neural network to predict the probability of each tennis player winning a service point when they play against each other. For inputs I would use last N matches that each player played, taking the ranking difference against his opponent and the actual probability of winning a point he had in the match.

For example, looking at only 2 matches for each player, one input would be

i=[-61, 25, 0.62, 0.64, 2, -35, 0.7, 0.65]

First 4 numbers are for 1st player (ranking differences and probabilities he had), other 4 for second. Output would be

o=[0.65, 0.63]

So training inputs would be X=[i1, i2, i3,...] and outputs y=[o1, o2, o3,...]

I have a couple of newbie questions:

1. Is it necessary to normalize the inputs (ranks and probabilities respectively) across the entire dataset?
2. When I try to run this in Python, it says

ValueError: Multioutput target data is not supported with label binarization

Can I make MLPClassifier work with 2 outputs?

EDIT: added some code

from sklearn.neural_network import MLPClassifier
clf = MLPClassifier(solver='lbfgs', alpha=1e-5,
                   hidden_layer_sizes=(5, 2), random_state=1)
X=[[-61, 25, 0.62, 0.64, 2, -35, 0.7, 0.65], [2,-5,0.58,0.7,-3,-15,0.65,0.52] ]
y=[ [0.63, 0.64], [0.58,0.61] ]
clf.fit(X,y)

This code returns the error mentioned above. The data isn't normalized here, but let's ignore that for now.

Maximilian Peters · Accepted Answer · 2017-06-14T23:01:25.783

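Your first question is answered here in detail: Why do we have to normalize the input for an artificial neural network? In short: yes, normalize the values. Your rank differences are in the tens while your probabilities lie between 0 and 1, and training usually converges more reliably when all features are on a similar scale.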
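As a rough sketch of what that normalization could look like in scikit-learn, the snippet below puts a StandardScaler in front of the network using a pipeline. The choice of StandardScaler (rather than, say, MinMaxScaler) and the use of a regressor are my assumptions, not something the question specifies; the data is the two-sample toy data from the question.

```python
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy data copied from the question: 8 features per sample, 2 targets per sample.
X = [[-61, 25, 0.62, 0.64, 2, -35, 0.7, 0.65],
     [2, -5, 0.58, 0.7, -3, -15, 0.65, 0.52]]
y = [[0.63, 0.64], [0.58, 0.61]]

# The scaler is fitted on the training data only and then reused
# automatically whenever the pipeline predicts on new data.
model = make_pipeline(
    StandardScaler(),  # assumption: zero mean / unit variance per feature
    MLPRegressor(solver='lbfgs', alpha=1e-5,
                 hidden_layer_sizes=(5, 2), random_state=1),
)
model.fit(X, y)

# Inspect the scaled features the network actually sees.
scaled = model.named_steps['standardscaler'].transform(X)
predictions = model.predict(X)
```

Keeping the scaler inside the pipeline means the same scaling is applied at prediction time, so you cannot forget to transform new inputs.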

The 2nd question is covered here:

MLPClassifier supports multi-class classification by applying Softmax as the output function.

If you can add some of your code to the question, the answer could be more detailed.

Edit

After reading the question again, more carefully, I realized that you are trying to use a classifier, i.e. you are trying to assign labels to your input data. A classifier expects discrete class labels as targets, not continuous values such as probabilities, which is why fitting fails on your y.
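If you want to see how scikit-learn itself categorizes your targets, the helper type_of_target from sklearn.utils.multiclass can be used for a quick check. This is only a diagnostic sketch; the exact error text raised by MLPClassifier may differ between scikit-learn versions.

```python
from sklearn.utils.multiclass import type_of_target

# Targets from the question: two continuous values per sample.
y = [[0.63, 0.64], [0.58, 0.61]]
target_kind = type_of_target(y)

# For comparison: the kind of target a classifier is designed for.
label_kind = type_of_target([0, 1, 1])

print(target_kind)  # continuous-multioutput
print(label_kind)   # binary
```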

You are probably looking for a Multi-layer Perceptron regressor which will give continuous output values.

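from sklearn.neural_network import MLPRegressor

# A regressor accepts continuous targets, including several outputs per sample.
clf = MLPRegressor(solver='lbfgs', alpha=1e-5,
                   hidden_layer_sizes=(5, 2), random_state=1)
X = [[-61, 25, 0.62, 0.64, 2, -35, 0.7, 0.65], [2, -5, 0.58, 0.7, -3, -15, 0.65, 0.52]]
y = [[0.63, 0.64], [0.58, 0.61]]  # one row per sample, one column per player
clf.fit(X, y)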

MLPRegressor(activation='relu', alpha=1e-05, batch_size='auto', beta_1=0.9,
       beta_2=0.999, early_stopping=False, epsilon=1e-08,
       hidden_layer_sizes=(5, 2), learning_rate='constant',
       learning_rate_init=0.001, max_iter=200, momentum=0.9,
       nesterovs_momentum=True, power_t=0.5, random_state=1, shuffle=True,
       solver='lbfgs', tol=0.0001, validation_fraction=0.1, verbose=False,
       warm_start=False)
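Once fitted, the regressor returns one row per input sample and one column per output. The sketch below shows this on a made-up match (the values in new_match are hypothetical, purely for illustration). Note that MLPRegressor uses an identity output activation, so predictions are not guaranteed to stay within [0, 1]; clipping them with np.clip is one simple option and is my suggestion, not something the model does for you.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Same toy data as in the question.
X = [[-61, 25, 0.62, 0.64, 2, -35, 0.7, 0.65],
     [2, -5, 0.58, 0.7, -3, -15, 0.65, 0.52]]
y = [[0.63, 0.64], [0.58, 0.61]]

reg = MLPRegressor(solver='lbfgs', alpha=1e-5,
                   hidden_layer_sizes=(5, 2), random_state=1)
reg.fit(X, y)

# Hypothetical new match, in the same 8-feature layout as the training rows.
new_match = [[10, -20, 0.60, 0.66, -4, 30, 0.63, 0.59]]

raw = reg.predict(new_match)      # shape (1, 2): one value per player
clipped = np.clip(raw, 0.0, 1.0)  # force the outputs into the valid probability range
print(raw.shape)
```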

@DoctorEvil: Thanks, now it is clearer and I updated the answer. — Maximilian Peters, Jun 14 '17 at 16:53
