keras: how to save the training history attribute of the history object

Question

In Keras, we can assign the return value of model.fit to a history variable as follows:

 history = model.fit(X_train, y_train, 
                     batch_size=batch_size, 
                     nb_epoch=nb_epoch,
                     validation_data=(X_test, y_test))

Now, how can I save the history attribute of the history object to a file for later use (e.g. to plot acc or loss against epochs)?

If it helps, you can as well use the `CSVLogger()` callback of keras as described here: https://keras.io/callbacks/#csvlogger — swiss_knight, Apr 15 '19 at 12:30
Does anyone recommend a method to save the history object returned by `fit`? It contains useful info in `.params` attribute which I would like to keep too. Yes, I can save the `params` & `history` attributes separately or combine in say a dict, but I'm interested in a simple way to save the entire `history` object. — user3731622, Oct 18 '19 at 21:01

score 117 · Accepted Answer · edited Oct 13 '22 at 22:54

What I use is the following:

import pickle

with open('/trainHistoryDict', 'wb') as file_pi:
    pickle.dump(history.history, file_pi)

In this way I save the history as a dictionary in case I want to plot the loss or accuracy later on. Later, when you want to load the history again, you can use:

with open('/trainHistoryDict', "rb") as file_pi:
    history = pickle.load(file_pi)
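One of the comments on the question asks about keeping `.params` as well. Here is a minimal sketch of that idea, assuming the object returned by `fit` exposes `history` and `params` attributes as that comment describes. `save_history` and `load_history` are hypothetical helper names, and a `SimpleNamespace` stands in for the real object so the snippet runs without Keras:

```python
import os
import pickle
import tempfile
from types import SimpleNamespace


def save_history(history_obj, path):
    """Pickle the metrics dict together with the training params."""
    payload = {
        "history": history_obj.history,
        # getattr keeps this working if the object has no params attribute
        "params": getattr(history_obj, "params", None),
    }
    with open(path, "wb") as f:
        pickle.dump(payload, f)


def load_history(path):
    """Return the dict written by save_history."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in for the object returned by model.fit (assumption for illustration)
fake_history = SimpleNamespace(
    history={"loss": [0.9, 0.5], "val_loss": [1.0, 0.7]},
    params={"epochs": 2, "steps": 10},
)

path = os.path.join(tempfile.mkdtemp(), "trainHistoryDict.pkl")
save_history(fake_history, path)
restored = load_history(path)
```

Only plain dictionaries are pickled here, so the file can be loaded later without Keras being importable.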

Why choose pickle over json?

The comment under this answer accurately states:

[Storing the history as json] does not work anymore in tensorflow keras. I had issues with: TypeError: Object of type 'float32' is not JSON serializable.

There are ways to tell json how to encode numpy objects, which you can learn about from this other question, so there's nothing wrong with using json in this case, it's just more complicated than simply dumping to a pickle file.
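For anyone who still prefers json, here is a minimal sketch of one such way. It uses the `default` hook of `json.dumps`, a lighter alternative to passing an encoder class through `cls`. The function name `to_builtin` and the sample dictionary are made up for illustration:

```python
import json

import numpy as np


def to_builtin(obj):
    """Fallback that json calls for objects it cannot encode itself."""
    if isinstance(obj, np.generic):  # NumPy scalars such as float32
        return obj.item()
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# Stand-in for history.history holding float32 values
history_dict = {"loss": [np.float32(0.5), np.float32(0.25)]}

encoded = json.dumps(history_dict, default=to_builtin)
decoded = json.loads(encoded)
```

After the round trip the values are plain Python floats, so the loaded dictionary no longer depends on NumPy.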

score 61 · Answer 2 · edited May 14 '22 at 19:24

Another way to do this:

As history.history is a dict, you can convert it as well to a pandas DataFrame object, which can then be saved to suit your needs.

Step by step:

import pandas as pd

# assuming you stored your model.fit results in a 'history' variable:
history = model.fit(x_train, y_train, epochs=10)

# convert the history.history dict to a pandas DataFrame:     
hist_df = pd.DataFrame(history.history) 

# save to json:  
hist_json_file = 'history.json' 
with open(hist_json_file, mode='w') as f:
    hist_df.to_json(f)

# or save to csv: 
hist_csv_file = 'history.csv'
with open(hist_csv_file, mode='w') as f:
    hist_df.to_csv(f)
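Reloading is the reverse operation. A minimal sketch, assuming the CSV was written with the default `to_csv` settings as above, so the first column holds the row index (the epoch number). The sample values and the temporary path are made up for illustration:

```python
import os
import tempfile

import pandas as pd

# Stand-in for history.history
history_dict = {"loss": [0.9, 0.5, 0.3], "accuracy": [0.6, 0.8, 0.9]}

csv_path = os.path.join(tempfile.mkdtemp(), "history.csv")
pd.DataFrame(history_dict).to_csv(csv_path)

# to_csv wrote the row index as the first column, so read it back
# as the index rather than as an extra metric
hist_df = pd.read_csv(csv_path, index_col=0)

# Back to the dict-of-lists layout of history.history
restored = hist_df.to_dict(orient="list")
```

Without `index_col=0` the reloaded frame gains an extra unnamed column containing the epoch numbers.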

edited May 14 '22 at 19:24

answered Apr 29 '19 at 10:14

swiss_knight

How would you re-load it? – jtlz2 Jul 23 '21 at 12:33
you can just read it as a dataframe using pd.read_csv('history.csv') – Mohammed Nadeem Oct 31 '21 at 17:30
I used this one, which is easier for me. – Caner Erden Jan 27 '22 at 05:55
Sounds good. A .csv is more universal than .pkl. I can load it in R this way or even open it in Excel, if I simply want to have a look what's in it. – Manuel Popp Jun 29 '22 at 12:06

score 36 · Answer 3 · edited Apr 09 '21 at 08:42

The easiest way:

Saving:

import numpy as np
np.save('my_history.npy', history.history)

Loading:

history = np.load('my_history.npy', allow_pickle=True).item()

Then history is a dictionary and you can retrieve all desirable values using the keys.
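Why the `.item()` call is needed can be seen in a small round trip. This is a sketch with a made-up dictionary and a temporary path, not the output of a real training run:

```python
import os
import tempfile

import numpy as np

history_dict = {"loss": [0.9, 0.5], "val_loss": [1.0, 0.7]}  # stand-in

path = os.path.join(tempfile.mkdtemp(), "my_history.npy")
np.save(path, history_dict)

# np.save wrapped the dict in a zero-dimensional object array,
# which is why loading it requires allow_pickle=True
loaded = np.load(path, allow_pickle=True)

# .item() unwraps that single element back into the dict
restored = loaded.item()
```

Since the file contains a pickled object, it has the same portability limits as the pickle approach.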

edited Apr 09 '21 at 08:42

desertnaut

answered Apr 20 '20 at 17:46

Arman

This should be the top answer imo – Gautam Chettiar Jan 13 '23 at 10:20

score 18 · Answer 4 · answered Nov 01 '18 at 12:16

The model history can be saved into a file as follows

import json
hist = model.fit(X_train, y_train, epochs=5, batch_size=batch_size,validation_split=0.1)
with open('file.json', 'w') as f:
    json.dump(hist.history, f)

answered Nov 01 '18 at 12:16

Ashok Kumar Jayaraman

this does not work anymore in tensorflow keras. I had issues with: TypeError: Object of type 'float32' is not JSON serializable. I had to use json.dump(str(hist.history), f). – BraveDistribution Aug 16 '19 at 15:07
@BraveDistribution Keep in mind that you can specify encoders for `json` like in [this answer](https://stackoverflow.com/q/26646362/11659881). So while this exact code does not work, `json` is still viable if you specify an encoder using the `cls` argument. – Kraigolas Oct 13 '22 at 22:58

score 12 · Answer 5 · answered Dec 09 '16 at 18:39

A History object has a history field, a dictionary that holds the training metrics recorded at every training epoch. For example, history.history['loss'][99] returns the loss of your model in the 100th epoch of training. To save it, you could pickle this dictionary or simply write the individual lists from the dictionary to an appropriate file.
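The second option, writing the lists out directly, can be sketched with the standard `csv` module. The sample dictionary is made up, and an in-memory buffer is used so the snippet runs anywhere:

```python
import csv
import io

# Stand-in for history.history
history_dict = {"loss": [0.9, 0.5, 0.3], "accuracy": [0.6, 0.8, 0.9]}

keys = list(history_dict)
# Swap the buffer for open("history.csv", "w", newline="") to write a file
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["epoch"] + keys)

# zip(*lists) walks the metric lists in step, yielding one tuple per epoch
for epoch, values in enumerate(zip(*(history_dict[k] for k in keys))):
    writer.writerow([epoch, *values])

csv_text = buffer.getvalue()
```

The epoch column is zero-based here, matching the list indices, so row 99 is the 100th epoch.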

score 7 · Answer 6 · edited Oct 16 '20 at 00:17

I came across the problem that the values inside the lists in the Keras history are not JSON serializable. Therefore I wrote these two handy functions for my use case.

import json, codecs
import numpy as np

def saveHist(path, history):
    new_hist = {}
    for key, values in history.history.items():
        if isinstance(values, np.ndarray):
            # arrays become plain lists of Python numbers
            new_hist[key] = values.tolist()
        elif isinstance(values, list):
            # NumPy scalars of any precision (float32, float64, ...) become Python numbers
            new_hist[key] = [v.item() if isinstance(v, np.generic) else v for v in values]
        else:
            new_hist[key] = values

    print(new_hist)
    with codecs.open(path, 'w', encoding='utf-8') as file:
        json.dump(new_hist, file, separators=(',', ':'), sort_keys=True, indent=4)

def loadHist(path):
    with codecs.open(path, 'r', encoding='utf-8') as file:
        n = json.loads(file.read())
    return n

where saveHist just needs to get the path to where the json file should be saved, and the history object returned from the keras fit or fit_generator method.

edited Oct 16 '20 at 00:17

Chris F Carroll

answered Jan 08 '19 at 13:02

Kev1n91

Thank you for offering the code to reload. What would also have been nice would be a way to append additional history (i.e. from `model.fit()`) to the reloaded history. I'm researching that now. – Mark Cramer Feb 28 '19 at 19:43
@MarkCramer shouldn't it be something along the lines of saving all of the parameters from the original history object, reloading the history object and using it to set up the model, running fit on the reloaded model and capturing the results in a new history object, and then concatenating the info inside the new history object into the original history object? – jschabs Mar 14 '19 at 17:09
@jschabs, yes, it's like that, but unfortunately it's complicated. I've figured it out so I think I'll offer an answer. – Mark Cramer Mar 14 '19 at 23:59
gives `newchars, decodedbytes = self.decode(data, self.errors)` for me – Mubeen Khan Sep 17 '19 at 15:43

score 3 · Answer 7 · edited Jul 23 '21 at 12:24

I'm sure there are many ways to do this, but I fiddled around and came up with a version of my own.

First, a custom callback enables grabbing and updating the history at the end of every epoch. In there I also have a callback to save the model. Both of these are handy because if you crash, or shutdown, you can pick up training at the last completed epoch.

from tensorflow.keras.callbacks import Callback, ModelCheckpoint # or keras.callbacks in standalone Keras

class LossHistory(Callback):
    
    # https://stackoverflow.com/a/53653154/852795
    def on_epoch_end(self, epoch, logs = None):
        logs = logs or {} # logs may be None
        new_history = {}
        for k, v in logs.items(): # compile new history from logs
            new_history[k] = [float(v)] # convert values into lists of plain floats for json
        current_history = loadHist(history_filename) # load history from current training
        current_history = appendHist(current_history, new_history) # append the logs
        saveHist(history_filename, current_history) # save history from current training

model_checkpoint = ModelCheckpoint(model_filename, verbose = 0) # saves after every epoch by default
history_checkpoint = LossHistory()
callbacks_list = [model_checkpoint, history_checkpoint]

Second, here are some 'helper' functions to do exactly the things that they say they do. These are all called from the LossHistory() callback.

# https://stackoverflow.com/a/54092401/852795
import json, codecs, os

def saveHist(path, history):
    with codecs.open(path, 'w', encoding='utf-8') as f:
        json.dump(history, f, separators=(',', ':'), sort_keys=True, indent=4) 

def loadHist(path):
    n = {} # set history to empty
    if os.path.exists(path): # reload history if it exists
        with codecs.open(path, 'r', encoding='utf-8') as f:
            n = json.loads(f.read())
    return n

def appendHist(h1, h2):
    if h1 == {}:
        return h2
    else:
        dest = {}
        for key, value in h1.items():
            dest[key] = value + h2[key]
        return dest

After that, all you need is to set history_filename to something like data/model-history.json, as well as set model_filename to something like data/model.h5. One final tweak to make sure not to mess up your history at the end of training, assuming you stop and start, as well as stick in the callbacks, is to do this:

new_history = model.fit(X_train, y_train, 
                     batch_size = batch_size, 
                     epochs = nb_epoch, # the nb_epoch argument was renamed to epochs in Keras 2
                     validation_data=(X_test, y_test),
                     callbacks=callbacks_list)

history = appendHist(history, new_history.history)

Whenever you want, history = loadHist(history_filename) gets your history back.

The funkiness comes from the json and the lists but I wasn't able to get it to work without converting it by iterating. Anyway, I know that this works because I've been cranking on it for days now. The pickle.dump answer at https://stackoverflow.com/a/44674337/852795 might be better, but I don't know what that is. If I missed anything here or you can't get it to work, let me know.
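The appending step can be illustrated on plain dictionaries. This is a sketch of a slightly more forgiving variant than the `appendHist` above, under the assumption that a metric may be present in only one of the two runs (for example when validation is switched on later). `merge_histories` is a hypothetical name:

```python
def merge_histories(old, new):
    """Concatenate per-metric lists from two training runs.

    Metrics that appear in only one of the two runs are kept as well.
    """
    merged = {}
    for key in list(old) + [k for k in new if k not in old]:
        merged[key] = list(old.get(key, [])) + list(new.get(key, []))
    return merged


first_run = {"loss": [0.9, 0.5]}
second_run = {"loss": [0.3], "val_loss": [0.4]}
combined = merge_histories(first_run, second_run)
```

One caveat of this variant: a metric missing from the first run ends up with a shorter list than the others, so its entries no longer line up with epoch numbers.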

Thanks! Very useful! You can speed this up a tiny bit by storing the history in memory instead of loading the history from file after every epoch, however given that this load / save is a very small amount of time compared to actual training, I think its okay to keep the code as is. — ias, Dec 29 '19 at 23:39

score 1 · Answer 8 · edited Apr 23 '21 at 14:35

You can save the history attribute of the tf.keras.callbacks.History object as a .txt file:

with open("./result_model.txt",'w') as f:
    for k in history.history.keys():
        print(k,file=f)
        for i in history.history[k]:
            print(i,file=f)
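Reading such a file back takes a little parsing, because keys and values share the same one-item-per-line layout. A minimal sketch, assuming metric names never look like numbers; `parse_history_txt` is a hypothetical helper and the sample text is made up:

```python
import io


def parse_history_txt(lines):
    """Rebuild a dict of lists from 'key, then one value per line' text."""
    history = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        try:
            value = float(line)
        except ValueError:
            # Not a number, so treat the line as the name of the next metric
            current = line
            history[current] = []
        else:
            history[current].append(value)
    return history


# Text in the layout that the loop above produces
sample = io.StringIO("loss\n0.9\n0.5\nval_loss\n1.0\n0.7\n")
restored = parse_history_txt(sample)
```

For a real file, pass the open file object instead of the `StringIO` buffer.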

edited Apr 23 '21 at 14:35

Frightera

answered Apr 23 '21 at 12:56

Ankur

score 0 · Answer 9 · answered Jun 18 '20 at 17:39

The above answers are useful when saving history at the end of the training process. If you want to save the history during the training, the CSVLogger callback will be helpful.

The code below saves model checkpoints and writes the training history to a CSV file, log.csv.

import tensorflow as tf

model_cb = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path)
history_cb = tf.keras.callbacks.CSVLogger('./log.csv', separator=",", append=False)

history = model.fit(x_train, y_train, callbacks=[model_cb, history_cb])
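To get the log back into the dict-of-lists shape of `history.history`, the standard `csv` module is enough. This is a sketch that assumes the file has an `epoch` column followed by one column per metric. The sample text below is made up rather than taken from a real run:

```python
import csv
import io

# Assumed layout of log.csv: an "epoch" column, then one column per metric
log_text = "epoch,accuracy,loss\n0,0.6,0.9\n1,0.8,0.5\n"

history = {}
# For a real file use open("./log.csv", newline="") instead of io.StringIO
for row in csv.DictReader(io.StringIO(log_text)):
    for name, value in row.items():
        if name == "epoch":
            continue
        history.setdefault(name, []).append(float(value))
```

When training is resumed in a later session, passing `append=True` to `CSVLogger` keeps the rows already in the file instead of overwriting them.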

answered Jun 18 '20 at 17:39

tngotran

How does one re-load it? – jtlz2 Jul 23 '21 at 12:30
CSVLogger is not saving the history object during training but at the end of training. So, if training is interrupted, the history is lost. Any idea how to fix it? – Al_Mt Jul 02 '22 at 00:05

score 0 · Answer 10 · answered Mar 10 '22 at 23:09

Here is a callback that pickles the logs into a file. Provide the model file path when instantiating the callback object; the history file path is derived from it. For example, given the model path '/home/user/model.h5', the pickled history is written to '/home/user/model_history_pickle'. If that file already exists when the callback is created, the callback continues counting from the epoch at which it left off.


    import os
    import re
    import pickle
    #
    from tensorflow.keras.callbacks import Callback
    from tensorflow.keras import backend as K

    class PickleHistoryCallback(Callback):
        def __init__(self, path_file_model, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self.__path_file_model = path_file_model
            #
            self.__path_file_history_pickle = None
            self.__history = {}
            self.__epoch = 0
            #
            self.__setup()
        #
        def __setup(self):
            self.__path_file_history_pickle = re.sub(r'\.[^\.]*$', '_history_pickle', self.__path_file_model)
            #
            if (os.path.isfile(self.__path_file_history_pickle)):
                with open(self.__path_file_history_pickle, 'rb') as fd:
                    self.__history = pickle.load(fd)
                    # Start from last epoch
                    self.__epoch = self.__history['e'][-1]
            #
            else:
                print("Pickled history file unavailable; the following pickled history file creation will occur after the first training epoch:\n\t{}".format(
                    self.__path_file_history_pickle))
        #
        def __update_history_file(self):
            with open(self.__path_file_history_pickle, 'wb') as fd:
                pickle.dump(self.__history, fd)
        #
        def on_epoch_end(self, epoch, logs=None):
            self.__epoch += 1
            logs = logs or {}
            #
            logs['e'] = self.__epoch
            logs['lr'] = K.get_value(self.model.optimizer.lr)
            #
            for k, v in logs.items():
                self.__history.setdefault(k, []).append(v)
            #
            self.__update_history_file()

pckl_hstry_c = PickleHistoryCallback(path_file_model); list_callbacks += [pckl_hstry_c]; history = model.fit( X_train, Y_train, validation_data=(X_validation, Y_validation), verbose=0, callbacks=list_callbacks ); — QuintoViento, Mar 10 '22 at 23:19
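The path derivation can be checked on its own. A small sketch reusing the regular expression from the callback; `history_path_for` is a hypothetical name used only for this illustration:

```python
import re


def history_path_for(model_path):
    """Swap the final extension of model_path for '_history_pickle'."""
    return re.sub(r'\.[^\.]*$', '_history_pickle', model_path)


derived = history_path_for('/home/user/model.h5')

# A path without any dot is returned unchanged by this pattern
unchanged = history_path_for('model')
```

The second case is worth knowing about: with an extension-less model path, the history file would get the same path as the model itself.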
