When I train an SGDClassifier in scikit-learn, I can print the loss value at every iteration by setting verbose. How can I store these values in an array?

J.K.

1 Answer

Modifying the answer from this post.

import sys
import numpy as np
from io import StringIO
import matplotlib.pyplot as plt
from sklearn.linear_model import SGDClassifier
from tensorflow.keras.datasets import mnist

# Load MNIST and flatten each 28x28 image into a 784-dimensional vector.
(x_tr, y_tr), (x_te, y_te) = mnist.load_data()
x_tr, x_te = x_tr.reshape(-1, 784), x_te.reshape(-1, 784)

Intercept the output printed by the SGDClassifier by redirecting stdout to an in-memory buffer.

old_stdout = sys.stdout
sys.stdout = mystdout = StringIO()

Set the model to print its output by setting verbose to 1.

clf = SGDClassifier(verbose=1)
clf.fit(x_tr, y_tr)

Restore stdout and retrieve the captured verbose output.

sys.stdout = old_stdout
loss_history = mystdout.getvalue()
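The manual swap of sys.stdout above leaves stdout redirected if fit raises an exception. A minimal sketch of the same capture using the standard library's contextlib.redirect_stdout, which restores stdout automatically, is shown below. The names capture_stdout and noisy_fit are hypothetical: noisy_fit stands in for clf.fit(x_tr, y_tr), and the text it prints is made up for illustration. Like the approach above, this only works if the library prints through Python's sys.stdout.

```python
import io
from contextlib import redirect_stdout


def capture_stdout(func, *args, **kwargs):
    """Run func and return (result, everything it printed to sys.stdout)."""
    buffer = io.StringIO()
    # redirect_stdout restores sys.stdout on exit, even if func raises.
    with redirect_stdout(buffer):
        result = func(*args, **kwargs)
    return result, buffer.getvalue()


def noisy_fit():
    # Hypothetical stand-in for clf.fit(x_tr, y_tr); the text is made up.
    print("step 1, loss: 0.50")
    print("step 2, loss: 0.25")
    return "fitted"


result, loss_history = capture_stdout(noisy_fit)
```

With a real model, the call would presumably look like capture_stdout(clf.fit, x_tr, y_tr).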

Create a list to store the loss values

loss_list = []

Parse the printed loss values, which are stored in loss_history, and append them to the list.

for line in loss_history.split('\n'):
    # Skip lines that do not report a loss value.
    if "loss: " not in line:
        continue
    loss_list.append(float(line.split("loss: ")[-1]))
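The same parsing step can be wrapped in a reusable function. The sketch below uses a regular expression instead of split, so trailing text after the number does not break the float conversion. The name parse_losses and the sample text are assumptions for illustration only; the sample is not real scikit-learn output. Unlike the loop above, it takes the first "loss: " match on each line rather than the last.

```python
import re

# Matches "loss: " followed by a plain or scientific-notation float.
LOSS_PATTERN = re.compile(r"loss: ([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)")


def parse_losses(text):
    """Return the float after "loss: " for every line that has one."""
    losses = []
    for line in text.splitlines():
        match = LOSS_PATTERN.search(line)
        if match:
            losses.append(float(match.group(1)))
    return losses


# Made-up sample text, not real scikit-learn output.
sample = "header\nstep 1, loss: 0.50\nnoise\nstep 2, loss: 2.5e-01\n"
losses = parse_losses(sample)
```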

To plot the loss values:

plt.figure()
plt.plot(np.arange(len(loss_list)), loss_list)
plt.xlabel("Time in epochs")
plt.ylabel("Loss")
plt.show()
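One caveat for a multiclass target such as MNIST: SGDClassifier fits one binary classifier per class (one-vs-all), so the captured text should contain several consecutive loss sequences rather than a single one, and the x-axis of the plot above then counts printed values across all of them. A rough sketch for splitting the output into one list per sequence follows. It assumes each sequence begins with a line reading "-- Epoch 1"; that marker is recalled from scikit-learn's verbose output and should be checked against what your version prints. The name split_runs and the sample text are hypothetical.

```python
def split_runs(text, marker="-- Epoch 1", key="loss: "):
    """Group loss values into runs, starting a new run at each marker line."""
    runs = []
    for line in text.splitlines():
        if line.strip() == marker:
            # Exact match, so "-- Epoch 10" does not start a new run.
            runs.append([])
        elif key in line:
            if not runs:
                # Loss seen before any marker: open a first run anyway.
                runs.append([])
            runs[-1].append(float(line.split(key)[-1]))
    return runs


# Made-up sample text shaped like the assumed format, not real output.
sample = (
    "-- Epoch 1\nx, loss: 0.9\n"
    "-- Epoch 2\nx, loss: 0.7\n"
    "-- Epoch 1\nx, loss: 0.8\n"
)
runs = split_runs(sample)
```

Each inner list can then be plotted as its own curve, so the x-axis really is epochs for one classifier.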

To store the loss values in an array:

loss_list = np.array(loss_list)
afagarap