
I use the caret package with a multi-layer perceptron (method = "mlp").

My dataset consists of a labelled output value, which can be A, B, or C. The input vector consists of 4 variables.

I use the following lines of code to calculate the class probabilities for each input value:

fit <- train(device ~ ., data = dataframetrain[1:100, ], method = "mlp",
             trControl = trainControl(classProbs = TRUE))
p <- predict(fit, newdata = dataframetest, type = "prob")

I thought that the class probabilities for each record should sum to one, but instead I get the following:

rowSums(p)
#        1        2        3        4        5        6        7        8 
# 1.015291 1.015265 1.015291 1.015291 1.015291 1.014933 1.015011 1.015291 
#        9       10       11       12       13       14       15       16 
# 1.014933 1.015206 1.015291 1.015291 1.015291 1.015224 1.015011 1.015291 

Can anybody help me? I don't know what I did wrong.

karmabob
2 Answers


There's probably nothing wrong; it seems that caret returns the raw values of the neurons in the output layer without converting them to probabilities (correct me if I'm wrong). When using the RSNNS::mlp function outside of caret, the rows of the predictions also don't sum to one.

Since all output neurons have the same activation function, the outputs can be converted to probabilities by dividing each prediction by its row sum; see this question.

This behavior occurs with method = "mlp" and method = "mlpWeightDecay", but with method = "nnet" the predictions do sum to one.

Example:

library(RSNNS)

data(iris)
# shuffle the rows of the data set
iris <- iris[sample(nrow(iris)), ]
irisValues <- iris[,1:4]
irisTargets <- iris[,5]
irisTargetsDecoded <- decodeClassLabels(irisTargets)
iris2 <- splitForTrainingAndTest(irisValues, irisTargetsDecoded, ratio=0.15)
iris2 <- normTrainingAndTestSet(iris2)

set.seed(432)
model <- mlp(iris2$inputsTrain, iris2$targetsTrain, 
             size=5, learnFuncParams=c(0.1), maxit=50, 
             inputsTest=iris2$inputsTest, targetsTest=iris2$targetsTest)

predictions <- predict(model,iris2$inputsTest)
head(rowSums(predictions))
# 139        26        17       104        54        82 
# 1.0227419 1.0770722 1.0642565 1.0764587 0.9952268 0.9988647 

probs <- predictions / rowSums(predictions)
head(rowSums(probs))
# 139  26  17 104  54  82 
# 1   1   1   1   1   1 

# nnet example --------------------------------------
library(caret)
training <- sample(seq_along(irisTargets), size = 100, replace = FALSE)
modelCaret <- train(y = irisTargets[training], 
                    x = irisValues[training, ],
                    method = "nnet")
predictionsCaret <- predict(modelCaret, 
                            newdata = irisValues[-training, ],
                            type = "prob")
head(rowSums(predictionsCaret))
# 122 100  89 134  30  86 
# 1   1   1   1   1   1 
thie1e
  • I had already tried to solve the problem using the RSNNS package, but I received the following message: 'this application has requested the runtime to terminate it in an unusual way', after which R closes. Therefore, I tried the caret package. Do you maybe know whether there is something else I can try? Instead of MLP, I also want to use RBF. – karmabob May 21 '15 at 08:56
  • Without a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) I don't think we can tell what's wrong there. If you have to use `mlp` outside of `caret` make sure to format the data correctly, as in the example above (from `?mlp`). After all, it looks like `caret` worked for your application. `caret` [supports](http://topepo.github.io/caret/modelList.html) `method = "rbf"`, too. – thie1e May 21 '15 at 16:10

I don't know how much flexibility the caret package offers in these choices, but the standard way to make a neural net produce outputs that sum to one is to use the softmax function as the activation function in the output layer.
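
For illustration, here is a minimal sketch of what softmax does to raw output-layer activations (this is not caret's or RSNNS's internal code, just the standard formula applied to made-up numbers):

```r
# Softmax: exponentiate the raw activations and normalize each row to sum to 1.
# Subtracting the row max first is the usual trick for numerical stability.
softmax <- function(x) {
  ex <- exp(x - max(x))
  ex / sum(ex)
}

# Hypothetical raw output activations for 2 records and 3 classes (A, B, C)
raw <- matrix(c(2.0, 1.0, 0.1,
                0.5, 2.5, 0.2), nrow = 2, byrow = TRUE)

probs <- t(apply(raw, 1, softmax))
rowSums(probs)  # both rows now sum exactly to 1
```

Unlike simply dividing by the row sum, softmax also guarantees non-negative outputs and is what nnet uses by default for multiclass classification.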

cfh
  • That's true, I should have mentioned softmax. `nnet` uses softmax by default for multiclass classification. I'm no expert on `RSNNS` but it looks like you can only choose from linear or logistic output units. The user can normally pass on such arguments within `train()`. – thie1e May 21 '15 at 15:54