I am new to array programming and find it difficult to interpret sklearn.metrics' label_ranking_average_precision_score function. I need help understanding how it is calculated, and I would appreciate any tips for learning NumPy array programming.
Generally, I know precision is
True Positives / (True Positives + False Positives)
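For example, here is my own quick sanity check of plain binary precision with sklearn (just an illustration, not the ranking metric I am asking about):

from sklearn.metrics import precision_score

y_true = [1, 0, 1, 1, 0]
y_pred = [1, 1, 1, 0, 0]
# Two of the three predicted positives are correct: 2 / (2 + 1) = 0.666...
print(precision_score(y_true, y_pred))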
The reason I am asking is that I stumbled upon a Kaggle competition for audio tagging and came across this post, which says they use an LWRAP function to calculate the score when there is more than one correct label in a response. I started reading up on how this score is calculated and found it difficult to interpret. My two difficulties are:
1) Interpreting the math formula in the documentation: I am not sure how the ranks are used in the score calculation.
2) Interpreting the NumPy array operations in the code.
The function I am reading is from a Google Colab notebook; I also tried reading the sklearn documentation but could not understand it properly.
The code for the one-sample calculation is:
import numpy as np

# Core calculation of label precisions for one test sample.
def _one_sample_positive_class_precisions(scores, truth):
  """Calculate precisions for each true class for a single sample.

  Args:
    scores: np.array of (num_classes,) giving the individual classifier scores.
    truth: np.array of (num_classes,) bools indicating which classes are true.

  Returns:
    pos_class_indices: np.array of indices of the true classes for this sample.
    pos_class_precisions: np.array of precisions corresponding to each of those
      classes.
  """
  num_classes = scores.shape[0]
  pos_class_indices = np.flatnonzero(truth > 0)
  # Only calculate precisions if there are some true classes.
  if not len(pos_class_indices):
    return pos_class_indices, np.zeros(0)
  # Retrieval list of classes for this sample, best-scoring first.
  retrieved_classes = np.argsort(scores)[::-1]
  # class_rankings[top_scoring_class_index] == 0 etc.
  class_rankings = np.zeros(num_classes, dtype=int)
  class_rankings[retrieved_classes] = range(num_classes)
  # Which of these is a true label?
  retrieved_class_true = np.zeros(num_classes, dtype=bool)
  retrieved_class_true[class_rankings[pos_class_indices]] = True
  # Num hits for every truncated retrieval list.
  retrieved_cumulative_hits = np.cumsum(retrieved_class_true)
  # Precision of retrieval list truncated at each hit, in order of pos_labels.
  precision_at_hits = (
      retrieved_cumulative_hits[class_rankings[pos_class_indices]] /
      (1 + class_rankings[pos_class_indices].astype(float)))
  return pos_class_indices, precision_at_hits
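To make my question concrete, here is a tiny example I traced through by hand (the scores and labels are made up, the function above is assumed to be defined, and the trace reflects my own understanding, so it may be off):

scores = np.array([0.9, 0.1, 0.8, 0.6])       # classifier scores for 4 classes
truth = np.array([True, False, False, True])  # classes 0 and 3 are the true labels

pos_class_indices, precision_at_hits = _one_sample_positive_class_precisions(scores, truth)
print(pos_class_indices)   # [0 3]
print(precision_at_hits)   # [1.         0.66666667]
# Class 0 is ranked 1st, so its precision is 1/1; class 3 is ranked 3rd and the
# top-3 retrieval list contains 2 true labels, so its precision is 2/3.

from sklearn.metrics import label_ranking_average_precision_score
# Averaging the per-class precisions seems to reproduce sklearn's per-sample score.
print(precision_at_hits.mean())  # 0.8333...
print(label_ranking_average_precision_score(truth.reshape(1, -1).astype(int),
                                            scores.reshape(1, -1)))  # 0.8333...

Is this interpretation of how the ranks enter the calculation correct, and how do the NumPy indexing steps (argsort, class_rankings, cumsum) achieve it?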