
This seems most related to: How to get the probability per instance in classifications models in spark.mllib

I'm doing a classification task with spark ml, building a MultilayerPerceptronClassifier. Once I build a model, I can get a predicted class given an input vector, but I can't get the probability for each output class. The above listing indicates that NaiveBayesModel supports this functionality as of Spark 1.5.0 (using a predictProbabilities method). I would like to get at this functionality for the MLPC. Is there a way I can hack at it to get my probabilities? Will it be included in 1.6.2?

mattwise
  • In Spark ML - [Spark MultilayerPerceptronClassifier Class Probabilities](https://stackoverflow.com/q/54545639/10465355) – 10465355 Feb 06 '19 at 10:38

2 Answers


If you take a look at this line in the MLPC source code, you can see that the MLPC is backed by an underlying TopologyModel, which provides the .predict method I'm looking for. The MLPC simply decodes the resulting output Vector into a single label.

I'm able to use the trained MLPC model to create a new TopologyModel using its weights:

```java
MultilayerPerceptronClassifier trainer = new MultilayerPerceptronClassifier()...;
MultilayerPerceptronClassificationModel model = trainer.fit(trainingData);
TopologyModel topoModel = FeedForwardTopology.multiLayerPerceptron(model.layers(), true).getInstance(model.weights());
```
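Calling topoModel.predict(features) then yields the per-class output vector before it is collapsed to one label. The decoding step the MLPC applies on top is just an argmax, which can be sketched in plain Java with no Spark dependency (class and method names here are illustrative, not Spark API):

```java
// Illustrative sketch: the MLPC's label decoding is an argmax over the
// output vector that TopologyModel.predict returns. Plain Java, no Spark.
public class MlpcDecode {
    // Returns the index of the largest raw output, i.e. the predicted label.
    static int argmax(double[] rawOutput) {
        int best = 0;
        for (int i = 1; i < rawOutput.length; i++) {
            if (rawOutput[i] > rawOutput[best]) {
                best = i;
            }
        }
        return best;
    }
}
```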
mattwise
  • The question seems pretty straightforward. After training your classifier you are expecting to get a model that is supposed to give you all the support to do further classifications – ncaralicea May 31 '17 at 18:35

I think the short answer is No.

The MultilayerPerceptronClassifier is not probabilistic: once the weights (and any biases) are fixed after training, the classification for a given input is deterministic and will always be the same.

What you're really asking, I think, is "if I were to tweak the weights by certain random disturbances of a given magnitude, how likely would the classification be the same as without the tweaks?"

You could do an ad hoc probability calculation by re-training the perceptron several times (with different, randomly chosen starting conditions) to get some idea of the probability of various classifications.

But I don't think this is really part of the expected behavior of an MLPC.

Phasmid
  • Not quite what I'm trying to get at, but maybe my choice of the term "probability" is incorrect. Maybe what I want is something more like "confidence". The classifier has `n` output nodes for each of the output classes. The weights from the previous layer nodes will add up to some value in the output node. The output node with the highest value is the predicted output class. With access to those values directly, I can tell the difference between a selected class that has very high confidence, compared to one that was really close between two possible classes. – mattwise Mar 15 '16 at 21:26
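One way to quantify the "confidence" described in the comment above: given the per-class output scores, the gap between the two largest values separates a confident prediction from a near-tie between two classes. A minimal sketch (illustrative names, not Spark API):

```java
// Illustrative sketch: the margin between the top two per-class scores
// distinguishes a high-confidence prediction (large gap) from a close
// call between two classes (small gap). Plain Java, no Spark.
public class MlpcMargin {
    // Returns (largest score) - (second largest score).
    static double margin(double[] scores) {
        double top = Double.NEGATIVE_INFINITY;
        double second = Double.NEGATIVE_INFINITY;
        for (double s : scores) {
            if (s > top) {
                second = top;
                top = s;
            } else if (s > second) {
                second = s;
            }
        }
        return top - second;
    }
}
```

A large margin (e.g. on scores {0.9, 0.05, 0.05}) signals a confident prediction, while a small one (e.g. on {0.51, 0.49}) flags a decision that was essentially a coin flip between two classes.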