My aim is to determine whether my model is overfitting or underfitting. However, when I compute the learning curves (which graphically depict how performance improves as more training data is used), the standard deviation of the cross-validation score is enormous.
What I observe is that when I switch the cross-validation from Leave-One-Out Cross-Validation (LOOCV) to KFold, everything becomes interpretable and normal. With LOOCV, the standard deviation is high. I don't understand why.
I chose LOOCV because my sample is very small. I want every sample to be used as the test set once.
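One thing that may be relevant here: with LOOCV every test fold contains exactly one sample, so the accuracy of each fold can only be 0 or 1, and the spread across folds is large by construction. The following is a minimal sketch of that effect in plain Python rather than scikit-learn. The `correct` list is made up for illustration (it is not real data), and the 5-fold split simply takes consecutive chunks of four samples.

```python
from statistics import mean, pstdev

# Hypothetical per-sample outcomes: 1 = classified correctly, 0 = misclassified.
# These 20 values are invented for illustration only.
correct = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1]

# LOOCV: each test fold holds one sample, so each fold's accuracy is 0.0 or 1.0.
loo_fold_scores = [float(c) for c in correct]

# 5-fold: each test fold holds 4 consecutive samples, so each fold's
# accuracy is an average over those 4 outcomes.
k = 5
fold_size = len(correct) // k
kfold_scores = [
    sum(correct[i:i + fold_size]) / fold_size
    for i in range(0, len(correct), fold_size)
]

# Population standard deviation, matching np.std's default (ddof=0).
loo_mean, loo_std = mean(loo_fold_scores), pstdev(loo_fold_scores)
kf_mean, kf_std = mean(kfold_scores), pstdev(kfold_scores)

print("LOOCV : mean=%.2f std=%.2f" % (loo_mean, loo_std))
print("5-fold: mean=%.2f std=%.2f" % (kf_mean, kf_std))
```

In this toy case both schemes give the same mean accuracy (0.80), but the per-fold standard deviation is 0.40 for the one-sample folds and 0.10 for the four-sample folds. If the same mechanism applies to the learning curve, the wide band under LOOCV would mostly reflect the 0/1 nature of the fold scores rather than instability of the model.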
The second question is: should I compute the learning curves inside the loop or outside it?
X (the data) and y (the class labels, which are only 0 and 1) make up a 1D dataset.
The code:
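import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import LeaveOneOut, learning_curve

loo = LeaveOneOut()
lr_model = LogisticRegression()
# predictions collected across all folds, plus per-fold metric values
y_pred = []
accuracy_scores = []
f1_scores = []
precision_scores = []
recall_scores = []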
# loop through each fold in the LOOCV cross-validation
for train_index, test_index in loo.split(X):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    # fit the model on the training data
    lr_model.fit(X_train, y_train)
    # use the model to predict the labels for the test data
    y_pred_fold = lr_model.predict(X_test)
    y_pred.extend(y_pred_fold)
    # calculate evaluation metrics for this fold
    accuracy = accuracy_score(y_test, y_pred_fold)
    f1 = f1_score(y_test, y_pred_fold, zero_division=0)
    precision = precision_score(y_test, y_pred_fold, zero_division=0)
    recall = recall_score(y_test, y_pred_fold, zero_division=0)
    accuracy_scores.append(accuracy)
    f1_scores.append(f1)
    precision_scores.append(precision)
    recall_scores.append(recall)
# calculate the overall evaluation metrics
accuracy = accuracy_score(y, y_pred)
f1 = f1_score(y, y_pred)
precision = precision_score(y, y_pred)
recall = recall_score(y, y_pred)
# accuracy_score returns a fraction in [0, 1]; scale it to match the % sign
print("Overall Accuracy: %.2f%%" % (accuracy * 100))
print("Overall F1 Score: %.2f" % f1)
print("Overall Precision: %.2f" % precision)
print("Overall Recall: %.2f" % recall)
# Compute learning curve using LOOCV
train_sizes, train_scores, test_scores = learning_curve(lr_model, X, y, cv=loo, scoring='accuracy', n_jobs=-1)
# Compute mean and standard deviation of training and test scores
train_scores_mean = np.mean(train_scores, axis=1)
train_scores_std = np.std(train_scores, axis=1)
test_scores_mean = np.mean(test_scores, axis=1)
test_scores_std = np.std(test_scores, axis=1)
# Plot learning curve with shaded standard deviation regions
plt.figure()
plt.title('Learning Curve')
plt.xlabel('Training Examples')
plt.ylabel('Accuracy')
plt.grid()