This question is about using a custom performance measurement function in the caret package. You want to find the best prediction model, so you build several and compare them by calculating a single metric that is drawn from comparing the observed and the predicted values. There are default functions to calculate this metric, but you can also define your own metric function. This custom function must take obs and predicted values as input.
In classification problems (let's say only two classes) the predicted value is 0 or 1. However, I also need to evaluate the probability calculated by the model. Is there any way to achieve this?
The reason is that there are applications where you need to know whether a prediction of 1 was made with 99% probability or with 51% probability, not just whether the prediction is 1 or 0.
Can anyone help?
Edit
OK, so let me try to explain a little bit better. In the documentation of the caret package, section 5.5.5 (Alternate Performance Metrics) describes how to use your own custom performance function, like so:
fitControl <- trainControl(method = "repeatedcv",
                           number = 10,
                           repeats = 10,
                           ## Estimate class probabilities
                           classProbs = TRUE,
                           ## Evaluate performance using
                           ## the following function
                           summaryFunction = twoClassSummary)
twoClassSummary is the custom performance function in this example. The function provided here needs to take as input a data frame or matrix with obs and pred columns. And here's the point: I want to use a function that takes not observed and predicted values, but observed values and predicted probabilities.
One more thing:
Solutions from other packages are also welcome. The only thing I am not looking for is "This is how you write your own cross-validation function."