Implementing the algorithm to maximize number of true positives
I would not recommend doing this (see the discussion at the end), but from what I understand you want to maximize the number of true positives. To do that, you can create a custom scorer and let TPOT optimize the true positive rate. I simplified your function, since it depends on a given number k; that dependency can be avoided by simply calculating the true positive rate. I used an example dataset from sklearn, which can of course be replaced with any other.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import make_scorer
from sklearn.model_selection import train_test_split
from tpot import TPOTClassifier

def maximize_true_pos(y, y_pred):
    # mark true positives with ones, all other samples with zeros
    true_pos = np.where((y == 1) & (y_pred == 1), 1, 0)
    # count the true positives
    num_true_pos = np.sum(true_pos)
    # true positive rate: how many of the actual positives were found?
    true_pos_div_total_tp = num_true_pos / np.sum(y)
    return true_pos_div_total_tp

data = load_breast_cancer()
# create the custom scorer
max_true_pos_scorer = make_scorer(maximize_true_pos)
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target,
                                                    train_size=0.75, test_size=0.25)
X_train.shape, X_test.shape, y_train.shape, y_test.shape

tpot = TPOTClassifier(verbosity=2, max_time_mins=2, scoring=max_true_pos_scorer)
tpot.fit(X_train, y_train)
y_pred = tpot.predict(X_test)
Discussion of results and methodology
Now let's understand what was optimized here by looking at y_pred:
y_pred
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
Since we only asked it to maximize the number of true positives, the algorithm learned that false positives are not penalized and therefore predicted class 1 for everything (even though y_true is not always 1, which is why accuracy < 1). Depending on your use case, recall (how many of the actual positives are found) or precision (how many of the predicted positives are actually positive) is a better metric than simply teaching the algorithm to label everything as positive.
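To see this numerically, you can compare the custom scorer against standard sklearn metrics on the test set (a small sketch building on the variables above; the exact values depend on your split):

from sklearn.metrics import accuracy_score, precision_score, recall_score

# recall is perfect by construction, because every sample is predicted positive
print(recall_score(y_test, y_pred))
# precision and accuracy fall back to the share of actual positives in the test set
print(precision_score(y_test, y_pred))
print(accuracy_score(y_test, y_pred))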
To use precision or recall (you probably know that, but I still put it in here for the sake of completeness), one can simply pass "precision" or "recall" as the scoring argument in the following fashion:

TPOTClassifier(scoring='recall')
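If you still want to emphasize finding positives without collapsing into the all-positive solution, one option (a suggestion on my part, not something TPOT requires) is an F-beta scorer with beta > 1, which weights recall more heavily than precision while still penalizing false positives:

from sklearn.metrics import fbeta_score, make_scorer

# beta=2 counts recall roughly twice as much as precision; tune beta as needed
f2_scorer = make_scorer(fbeta_score, beta=2)
tpot = TPOTClassifier(verbosity=2, max_time_mins=2, scoring=f2_scorer)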