I am trying to train an LSTM model and make predictions with it using tf.keras. My code is split across two files: LSTMTraining.py, which trains the Keras model and saves it to a file, and Predict.py, which is supposed to load the saved model and use it to make predictions. For some reason, when I load the model in Predict.py, it starts training, even though I never call model.fit() in that file. Why is this happening?
I have tried saving the model in several file formats. For example, I saved the model's architecture to a JSON file (using model.to_json()) and the weights to a separate file, then loaded the two files and combined them. I have also tried saving everything into a single file (using model.save()) and loading that.
Creating and Training Model in LSTMTraining.py (Note: the log_similarity_loss was just a custom loss function I created for the model):
# Machine learning
import tensorflow as tf
from tensorflow.keras import layers
import numpy as np
# Load/save data
import pickle
import os
# Shuffling
from sklearn.utils import shuffle
# Parameters
epochs = 5
display_step = 1000
n_input = 5
wordvec_len = 5
n_hidden = 512
recurrent_dropout = 0
dropout = 0
# Load data
with open("Vectorized_Word_By_Word.txt", "rb") as data:
    vectorized_txt = pickle.load(data)
# Prepare data into format for training (x: [prev-words], y: [next-word])
x_train, y_train = [], []
for n in range(len(vectorized_txt) - n_input):
    prev_words = vectorized_txt[n:n + n_input]
    # The target is the word immediately after the n_input previous words
    next_word = vectorized_txt[n + n_input]
    x_train.append(prev_words)
    y_train.append(next_word)
x_train, y_train = np.array(x_train), np.array(y_train)
x_train, y_train = shuffle(x_train, y_train, random_state=0)
def log_similarity_loss(y_actual, y_pred):
    """Log similarity loss calculation."""
    cos_similarity = tf.keras.losses.CosineSimilarity(axis=0)(y_actual, y_pred)
    scaled_similarity = tf.add(tf.multiply(0.5, cos_similarity), 0.5)
    return -0.5*tf.math.log(scaled_similarity)
log_similarity_loss(
    [0.05, 0.01, 0.05, 1.2], [0.05, -0.01, 0.05, -1.2])
model = tf.keras.Sequential([
    layers.LSTM(n_hidden, input_shape=(n_input, wordvec_len),
                dropout=dropout, recurrent_dropout=recurrent_dropout,
                return_sequences=True),
    layers.LSTM(n_hidden, dropout=dropout,
                recurrent_dropout=recurrent_dropout),
    layers.Dense(wordvec_len)
])
model.compile(loss=log_similarity_loss,
              optimizer='adam', metrics=['cosine_proximity'])
model.fit(x_train, y_train, epochs=epochs, batch_size=12)
model.save("Keras_Model.h5", include_optimizer=True, save_format='h5')
# Save model weights and architecture
model.save_weights('model_weights.h5')
with open("model_architecture.json", "w") as json_file:
    json_file.write(model.to_json())
Loading in the model in Predict.py (Note: All the functions imported from "WordModel.py" are just functions for text processing I've written that are unrelated to Keras):
from WordModel import word_by_word, word_to_vec, vec_to_word
import gensim
import tensorflow as tf
from tensorflow.keras.models import load_model, model_from_json
with open('model_architecture.json', 'r') as json_file:
    model_json = json_file.read()
keras_model = model_from_json(model_json)
keras_model.load_weights("model_weights.h5")
I expected no output, only for the model to be loaded. Instead, running Predict.py prints the model's verbose training output:
12/1212 [..............................] - ETA: 3:32 - loss: 0.2656 - cosine_proximity: 0.0420
24/1212 [..............................] - ETA: 1:55 - loss: 0.2712 - cosine_proximity: 0.2066
36/1212 [..............................] - ETA: 1:24 - loss: 0.2703 - cosine_proximity: 0.2294
48/1212 [>.............................] - ETA: 1:08 - loss: 0.2394 - cosine_proximity: 0.2690
60/1212 [>.............................] - ETA: 58s - loss: 0.2286 - cosine_proximity: 0.2874
72/1212 [>.............................] - ETA: 52s - loss: 0.2247 - cosine_proximity: 0.2750
84/1212 [=>............................] - ETA: 47s - loss: 0.2115 - cosine_proximity: 0.2924
and so on.
Note that Predict.py does not contain any training call. I have rerun the code multiple times and made sure I was running the correct file, but the problem persists.
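One possibility worth ruling out is an import side effect. Python executes all module-level code when a file is imported, so if anything imported by Predict.py (directly or indirectly) imports LSTMTraining.py, the model.fit() call there would run. This is only an assumption, since WordModel.py is not shown here. The minimal sketch below illustrates the mechanism with two hypothetical stand-in modules (trainer_unguarded_demo and trainer_guarded_demo); the "training" is just a list append, not real Keras code.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical stand-in for a training script: the "training" line sits at
# module level, so it runs whenever the file is imported.
UNGUARDED = textwrap.dedent("""
    events = []
    events.append("training started")  # stands in for model.fit(...)
""")

# Same script, but the "training" line only runs when the file is executed
# directly (e.g. python trainer.py), not when it is imported.
GUARDED = textwrap.dedent("""
    events = []
    if __name__ == "__main__":
        events.append("training started")
""")


def import_from_source(module_name, source):
    """Write source to a temporary directory, import it, and return the module."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        Path(tmp_dir, module_name + ".py").write_text(source)
        sys.path.insert(0, tmp_dir)
        try:
            importlib.invalidate_caches()
            return importlib.import_module(module_name)
        finally:
            # Clean up so the demo leaves no trace in the interpreter state
            sys.path.remove(tmp_dir)
            sys.modules.pop(module_name, None)


unguarded = import_from_source("trainer_unguarded_demo", UNGUARDED)
guarded = import_from_source("trainer_guarded_demo", GUARDED)

print(unguarded.events)  # the module-level "training" ran on import
print(guarded.events)    # the guarded version did nothing on import
```

If this turns out to be the cause, wrapping the training and saving steps of LSTMTraining.py in an `if __name__ == "__main__":` block, or moving shared helpers into a module that does no training, would keep imports free of side effects.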
Thanks for the help!