
I am learning how to build an image classifier by creating a CNN in Keras. How do I prevent my program from starting over every time I start a new training session? My code is below, including some commented-out code for things that I have tried. Thanks!

I have tried to create checkpoints, but the accuracy and loss still seem to reset. I have also tried model.save() and model.save_weights() and then loaded them into a new model, but my accuracy and loss still seem to start from the beginning.

################################## IMPORTS ###################################
# import CIFAR10 data
from keras.datasets import cifar10

# import keras utils
import keras.utils as utils

# import Sequential model
from keras.models import Sequential

# import layers
from keras.layers import Dense, Dropout, Flatten
from keras.layers.convolutional import Conv2D, MaxPooling2D

# normalizes values in kernel
from keras.constraints import maxnorm

# import compiler optimizers
from keras.optimizers import SGD

# import keras checkpoint
from keras.callbacks import ModelCheckpoint

# import h5py
import h5py

################################ END IMPORTS #################################

# load cifar10 data
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# format trains and tests to float32 and divide by 255.0
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# change y_train and y_test to utils categorical
y_train = utils.to_categorical(y_train)
y_test = utils.to_categorical(y_test)

# create labels array
labels = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

############################ SEQUENTIAL MODEL 1 ##############################

model = Sequential()

model.add(Conv2D(filters=32, kernel_size=(3, 3), input_shape=(32, 32, 3), activation='relu', padding='same',
                 kernel_constraint=maxnorm(3)))

#### add second layer - MaxPooling2D ####
# decreases image size from 32x32 to 16x16
# pool_size: finds max value in each 2x2 section of input
model.add(MaxPooling2D(pool_size=(2, 2)))

#### flatten features ####
# converts matrix to a 1 dimensional array
model.add(Flatten())

#### add third layer - first Dense layer ####
# creates prediction network
# units: 512 neurons for first layer
# activation: relu
# kernel_constraint: maxnorm
model.add(Dense(units=512, activation='relu', kernel_constraint=maxnorm(3)))

#### add fourth layer - Dropout - randomly drops some neurons - prevents overfitting - TRAINING ONLY ####
# improves reliability
# rate: 0.5 means drop half the neurons
# only active during training
model.add(Dropout(rate=0.5))

#### add fifth layer - second Dense layer - creates 10 outputs because we have 10 categories ####
# produces output for each of the 10 categories
# units: 10 categories = 10 output units
# activation = 'softmax' because we are calculating the probabilities of each of the 10 categories (floats)
model.add(Dense(units=10, activation='softmax'))

############################## END SEQUENTIAL MODEL ##########################                 

############################## COMPILER ######################################
model.compile(optimizer=SGD(lr=0.01), loss='categorical_crossentropy', metrics=['accuracy'])

################################ END COMPILER ################################

################################ SAVE DATA ###################################

# saves the model
model.save(filepath='model.h5')

# create model checkpoint based on best accuracy
#filepath = 'model.h5'
#checkpoint = ModelCheckpoint(filepath, monitor='val_accuracy', save_best_only=True,
#                             save_weights_only=False, mode='max', period=1)

#callbacks_list = [checkpoint]

# save weights
model.save_weights('model_weights.h5')

############################### END SAVE DATA ################################
model.fit(x=x_train, y=y_train, validation_split=0.1, epochs=20, batch_size=32, shuffle=True)

  • Thanks to all of you! The solution was to create a checkpoint that saves the weights, which produces a weights file. I then added a load_weights statement to my original model. Whenever I have a checkpoint file that is satisfactory, I add its file path to load_weights and training resumes from where it left off. –  Oct 18 '19 at 15:51
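For reference, the pattern that comment describes looks roughly like this, building on the model, x_train, and y_train defined above; the checkpoint filename weights.best.h5 and the epoch count are illustrative, not part of the original question:

from keras.callbacks import ModelCheckpoint

# write the best weights seen so far to a file after each epoch
# (use monitor='val_acc' on older Keras versions that report the metric as 'acc')
checkpoint = ModelCheckpoint('weights.best.h5', monitor='val_accuracy', mode='max',
                             save_best_only=True, save_weights_only=True)

# on later runs, rebuild the same architecture and then restore the saved weights;
# skip this line on the very first run, before the checkpoint file exists
model.load_weights('weights.best.h5')

model.fit(x=x_train, y=y_train, validation_split=0.1, epochs=20,
          batch_size=32, shuffle=True, callbacks=[checkpoint])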

3 Answers


The issue in your case is that you are saving the model before training it. You must first fit the model, which does the training, and then save it. Also attaching the code with the change:

from keras.datasets import cifar10

# import keras utils
import keras.utils as utils

# import Sequential model
from keras.models import Sequential

# import layers
from keras.layers import Dense, Dropout, Flatten
from keras.layers.convolutional import Conv2D, MaxPooling2D

# normalizes values in kernel
from keras.constraints import maxnorm

# import compiler optimizers
from keras.optimizers import SGD

# import keras checkpoint
from keras.callbacks import ModelCheckpoint

# import h5py
import h5py



# load cifar10 data
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# format trains and tests to float32 and divide by 255.0
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# change y_train and y_test to utils categorical
y_train = utils.to_categorical(y_train)
y_test = utils.to_categorical(y_test)

# create labels array
labels = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']



model = Sequential()

model.add(Conv2D(filters=32, kernel_size=(3, 3), input_shape=(32, 32, 3), activation='relu', padding='same',
                 kernel_constraint=maxnorm(3)))

#### add second layer - MaxPooling2D ####
# decreases image size from 32x32 to 16x16
# pool_size: finds max value in each 2x2 section of input
model.add(MaxPooling2D(pool_size=(2, 2)))

#### flatten features ####
# converts matrix to a 1 dimensional array
model.add(Flatten())

#### add third layer - first Dense layer ####
# creates prediction network
# units: 512 neurons for first layer
# activation: relu
# kernel_constraint: maxnorm
model.add(Dense(units=512, activation='relu', kernel_constraint=maxnorm(3)))

#### add fourth layer - Dropout - randomly drops some neurons - prevents overfitting - TRAINING ONLY ####
# improves reliability
# rate: 0.5 means drop half the neurons
# only active during training
model.add(Dropout(rate=0.5))

#### add fifth layer - second Dense layer - creates 10 outputs because we have 10 categories ####
# produces output for each of the 10 categories
# units: 10 categories = 10 output units
# activation = 'softmax' because we are calculating the probabilities of each of the 10 categories (floats)
model.add(Dense(units=10, activation='softmax'))

############################## END SEQUENTIAL MODEL ##########################                 

############################## COMPILER ######################################
model.compile(optimizer=SGD(lr=0.01), loss='categorical_crossentropy', metrics=['accuracy'])


model.fit(x=x_train, y=y_train, validation_split=0.1, epochs=20, batch_size=32, shuffle=True)

################################ END COMPILER ################################

################################ SAVE DATA ###################################

# saves the trained model
model.save(filepath='model.h5')

# create model checkpoint based on best accuracy
#filepath = 'model.h5'
#checkpoint = ModelCheckpoint(filepath, monitor='val_accuracy', save_best_only=True,
#                             save_weights_only=False, mode='max', period=1)

#callbacks_list = [checkpoint]

# save weights
model.save_weights('model_weights.h5')

############################### END SAVE DATA ################################

Let me know if this works

ArunJose
  • Thanks! When I save the model after training it and then load the model again inside the same .py file, it will continue from where it left off during the previous training. But when the program exits and I rerun the .py file from the command line, it starts again from the beginning. –  Oct 17 '19 at 18:39
  • When you rerun the same .py file, you are initialising the model again by calling model = Sequential() and defining it again. Because you repeat the initialisation, training starts from the beginning. – ArunJose Oct 18 '19 at 02:41
  • Is there a way to load the model in a separate .py file without repeating the initialisation? I have tried new_model = load_model('filepath') but it seems to cause it to restart. I suspect that it is also building the model with model = Sequential() again. –  Oct 18 '19 at 13:20
  • After you load the model using load_model('filepath'), also do model.load_weights('model_weights.h5') to restore the model to the state you trained it to. – ArunJose Oct 18 '19 at 13:48
  • I created another .py file and included this code: from keras.models import load_model; new_model = load_model('model.h5'); new_model.load_weights('model_weights.h5'). I run my previous file containing the actual model; it trains and outputs an accuracy. I then run the new .py file containing the code above, expecting it to continue from where I left off, but it still restarts. –  Oct 18 '19 at 14:50
  • I figured it out! Thanks so much! –  Oct 18 '19 at 15:48
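Putting the comment thread together, a separate resume script would look roughly like this (a sketch only: it assumes model.h5 and model_weights.h5 were written by the training script above, and the epoch count is illustrative):

from keras.datasets import cifar10
import keras.utils as utils
from keras.models import load_model

# prepare the data exactly as in the training script
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train = x_train.astype('float32') / 255.0
y_train = utils.to_categorical(y_train)

# restore the trained, compiled model instead of calling Sequential() again
new_model = load_model('model.h5')
new_model.load_weights('model_weights.h5')

# training continues from the previously reached loss/accuracy
new_model.fit(x=x_train, y=y_train, validation_split=0.1, epochs=5,
              batch_size=32, shuffle=True)

# save again so the next run picks up from here
new_model.save('model.h5')
new_model.save_weights('model_weights.h5')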

You should save the model after training it and load the model using keras.models.load_model.

See the following snippet.

# import CIFAR10 data
from keras.datasets import cifar10

# import keras utils
import keras.utils as utils

# import Sequential model
from keras.models import Sequential

# import layers
from keras.layers import Dense, Dropout, Flatten
from keras.layers.convolutional import Conv2D, MaxPooling2D

# normalizes values in kernel
from keras.constraints import maxnorm

# import compiler optimizers
from keras.optimizers import SGD

# import keras checkpoint
from keras.callbacks import ModelCheckpoint

# import h5py
import h5py


from keras.models import load_model

# load cifar10 data
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# format trains and tests to float32 and divide by 255.0
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# change y_train and y_test to utils categorical
y_train = utils.to_categorical(y_train)
y_test = utils.to_categorical(y_test)

# create labels array
labels = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']


model = Sequential()

model.add(Conv2D(filters=32, kernel_size=(3, 3), input_shape=(32, 32, 3), activation='relu', padding='same',
                 kernel_constraint=maxnorm(3)))

#### add second layer - MaxPooling2D ####
# decreases image size from 32x32 to 16x16
# pool_size: finds max value in each 2x2 section of input
model.add(MaxPooling2D(pool_size=(2, 2)))

#### flatten features ####
# converts matrix to a 1 dimensional array
model.add(Flatten())

#### add third layer - first Dense layer ####
# creates prediction network
# units: 512 neurons for first layer
# activation: relu
# kernel_constraint: maxnorm
model.add(Dense(units=512, activation='relu', kernel_constraint=maxnorm(3)))

#### add fourth layer - Dropout - randomly drops some neurons - prevents overfitting - TRAINING ONLY ####
# improves reliability
# rate: 0.5 means drop half the neurons
# only active during training
model.add(Dropout(rate=0.5))

#### add fifth layer - second Dense layer - creates 10 outputs because we have 10 categories ####
# produces output for each of the 10 categories
# units: 10 categories = 10 output units
# activation = 'softmax' because we are calculating the probabilities of each of the 10 categories (floats)
model.add(Dense(units=10, activation='softmax'))

############################## END SEQUENTIAL MODEL ##########################                 

############################## COMPILER ######################################
model.compile(optimizer=SGD(lr=0.01), loss='categorical_crossentropy', metrics=['accuracy'])

################################ END COMPILER ################################

################################ SAVE DATA ###################################

# load the model saved by the previous run (this assumes model.h5 already exists)
model = load_model('model.h5')

# create model checkpoint based on best accuracy
#filepath = 'model.h5'
#checkpoint = ModelCheckpoint(filepath, monitor='val_accuracy', save_best_only=True,
#                             save_weights_only=False, mode='max', period=1)

#callbacks_list = [checkpoint]

# # save weights
# model.save_weights('model_weights.h5')

############################### END SAVE DATA ################################
model.fit(x=x_train, y=y_train, validation_split=0.1, epochs=1, batch_size=32, shuffle=True)

# saves the trained model so the next run can resume from it
model.save(filepath='model.h5')

Once you load the saved model and retrain, the loss and accuracy start from the previous stopped values.

45000/45000 [==============================] - 82s 2ms/step - loss: 1.9395 - acc: 0.3030 - val_loss: 1.7316 - val_acc: 0.3852

In the next run,

Epoch 1/1
   32/45000 [..............................] - ETA: 3:13 - loss: 1.7473
   64/45000 [..............................] - ETA: 2:15 - loss: 1.7321
   96/45000 [..............................] - ETA: 1:58 - loss: 1.6830
  128/45000 [..............................] - ETA: 1:48 - loss: 1.6729
  160/45000 [..............................] - ETA: 1:41 - loss: 1.6876

However, note that compiling the model is not necessary when you load the model from the file.
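In other words, load_model restores the architecture, the weights, and the optimizer state together, so a resumed run can call fit directly. A minimal sketch, assuming model.h5 exists and the data are prepared as above:

from keras.models import load_model

model = load_model('model.h5')   # no model.compile() needed after this
model.fit(x=x_train, y=y_train, validation_split=0.1, epochs=1,
          batch_size=32, shuffle=True)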

Achintha Ihalage

As far as I understand, your problem is how to continue training after you have closed your training session.

There are some useful sources on this; generally speaking, search for material on checkpoints and on resuming training with Keras.

Hope it resolves your issue.

Physicing