I am trying to train an LSTM model and make predictions with it using tf.keras. My code is split across two files: LSTMTraining.py, which trains the Keras model and saves it to a file, and Predict.py, which is supposed to load the model and use it to make predictions. For some reason, when I load the model in Predict.py, it starts training, even though I never call model.fit() in that file. Why is this happening?

I have tried saving the model in several file formats. For example, I've saved the model's architecture to a JSON file (using model.to_json()) and the weights separately, then loaded both files and combined them. I've also tried saving everything together in one file (using model.save()) and loading that.

Creating and Training Model in LSTMTraining.py (Note: the log_similarity_loss was just a custom loss function I created for the model):

# Machine learning
import tensorflow as tf
from tensorflow.keras import layers  # public API; tensorflow.python.* is private
import numpy as np

# Load/save data
import pickle
import os

# Shuffling
from sklearn.utils import shuffle

# Parameters
epochs = 5
display_step = 1000
n_input = 5
wordvec_len = 5
n_hidden = 512
recurrent_dropout = 0
dropout = 0

# Load data
with open("Vectorized_Word_By_Word.txt", "rb") as data:
    vectorized_txt = pickle.load(data)

# Prepare data into format for training (x: [prev-words], y: [next-word])
x_train, y_train = [], []
for n in range(0, len(vectorized_txt) - n_input - 1):
    prev_words = vectorized_txt[n: n + n_input]
    next_word = vectorized_txt[n + n_input]  # word immediately after the window
    x_train.append(prev_words)
    y_train.append(next_word)
x_train, y_train = np.array(x_train), np.array(y_train)
x_train, y_train = shuffle(x_train, y_train, random_state=0)


def log_similarity_loss(y_actual, y_pred):
    """Log similarity loss calculation."""
    cos_similarity = tf.keras.losses.CosineSimilarity(axis=0)(y_actual, y_pred)
    scaled_similarity = tf.add(tf.multiply(0.5, cos_similarity), 0.5)
    return -0.5*tf.math.log(scaled_similarity)


log_similarity_loss(
    [0.05, 0.01, 0.05, 1.2], [0.05, -0.01, 0.05, -1.2])

model = tf.keras.Sequential([
    layers.LSTM(n_hidden, input_shape=(n_input, wordvec_len),
                dropout=dropout, recurrent_dropout=recurrent_dropout,
                return_sequences=True),
    layers.LSTM(n_hidden, dropout=dropout,
                recurrent_dropout=recurrent_dropout),
    layers.Dense(wordvec_len)
])

model.compile(loss=log_similarity_loss,
              optimizer='adam', metrics=['cosine_proximity'])

model.fit(x_train, y_train, epochs=epochs, batch_size=12)

model.save("Keras_Model.h5", include_optimizer=True, save_format='h5')

# Save model weights and architecture
model.save_weights('model_weights.h5')
with open("model_architecture.json", "w") as json_file:
    json_file.write(model.to_json())

Loading in the model in Predict.py (Note: All the functions imported from "WordModel.py" are just functions for text processing I've written that are unrelated to Keras):

from WordModel import word_by_word, word_to_vec, vec_to_word
import gensim

import tensorflow as tf
from tensorflow.keras.models import load_model, model_from_json

with open('model_architecture.json', 'r') as json_file:
    model_json = json_file.read()

keras_model = model_from_json(model_json)
keras_model.load_weights("model_weights.h5")

I expected no output, only for the model to be loaded. Instead, running Predict.py printed the model's verbose training output, like so:

  12/1212 [..............................] - ETA: 3:32 - loss: 0.2656 - cosine_proximity: 0.0420
  24/1212 [..............................] - ETA: 1:55 - loss: 0.2712 - cosine_proximity: 0.2066
  36/1212 [..............................] - ETA: 1:24 - loss: 0.2703 - cosine_proximity: 0.2294
  48/1212 [>.............................] - ETA: 1:08 - loss: 0.2394 - cosine_proximity: 0.2690
  60/1212 [>.............................] - ETA: 58s - loss: 0.2286 - cosine_proximity: 0.2874 
  72/1212 [>.............................] - ETA: 52s - loss: 0.2247 - cosine_proximity: 0.2750
  84/1212 [=>............................] - ETA: 47s - loss: 0.2115 - cosine_proximity: 0.2924 

and so on.

Note that there is no training command anywhere in my Predict.py file. I have rerun the code multiple times and made sure that I was running the correct file, but the training output still appears every time.

Thanks for the help!

ag2718
  • Are you sure you are running the right file? – Dr. Snoopy Sep 23 '19 at 00:43
  • Yes, I’ve double checked. I first ran the LSTMTraining.py file, and then ran the Predict.py file. It is showing the training message even when I run the Predict.py file. – ag2718 Sep 23 '19 at 21:18
  • It'd help to have the _full_ source code for both _LSTMTraining.py_ and _Predict.py_ (not JSON); also, how are you executing _Predict.py_ - from an IPython Kernel in a different _.py_ file, right from _Predict.py_, or from a command terminal? – OverLordGoldDragon Sep 28 '19 at 22:54
  • I've updated the post to include the full source code. I have tried running Predict.py from both the Windows command prompt and the file itself (in the VS Code editor), but neither works. – ag2718 Sep 29 '19 at 23:05
  • Tip: mention users by name to notify them - stumbled upon your question coincidentally since it's bumped when updated – OverLordGoldDragon Sep 29 '19 at 23:09
  • Works fine for me - what IDE are you using? (Spyder, VSCode, PyCharm, etc) Also, have you tried running `Predict.py` code straight from `LSTMTraining.py`, but _excluding all other lines_? (i.e. paste it in, comment out everything else, restart kernel, then run) – OverLordGoldDragon Sep 29 '19 at 23:24
  • @OverLordGoldDragon I am using VSCode. When I copied and pasted code from Predict.py into LSTMTraining.py, it did not show the training message twice (so I'm assuming it worked). However, I ran the Predict.py file separately again and it still appeared to be retraining the model. – ag2718 Sep 29 '19 at 23:39
  • Yeah, VSCode can be a pain for Python - I'm not entirely sure what the case is for you, but one likely cause is that you're running _all_ `.py` scripts when you run `Predict.py`, especially if they're in the same project/folder/etc -- carefully check that this is not the case. If that doesn't help, you're better off opening another question about debugging concurrent Python scripts in VSCode, since your code is fine as far as Python, TensorFlow, and Keras are concerned and should not produce the behavior you see. – OverLordGoldDragon Sep 29 '19 at 23:45
  • Also, I'd recommend [Spyder](https://www.spyder-ide.org/) for Python, as it's also tailored for data science / machine learning and is well-integrated with [Anaconda](https://www.anaconda.com/distribution/). – OverLordGoldDragon Sep 29 '19 at 23:47
  • @OverLordGoldDragon Thank you for all of the suggestions! I will try downloading Spyder and see if the code works. – ag2718 Sep 30 '19 at 00:23
  • Gave a more detailed answer - let me know how things go – OverLordGoldDragon Sep 30 '19 at 00:36

1 Answer

The problem is likely with your VSCode setup, which needs additional configuration to work with both Python and its packages. When you run one script, you may be running all of the scripts, which would explain the behavior you're seeing. A solution I'd recommend is switching to Spyder and installing your packages with Anaconda. Once you've installed both, search for "anaconda command prompt" or "anaconda powershell" on your PC, and in the terminal, type:

conda update conda
conda update --all
conda install numpy # optional (sort of)
conda install matplotlib # optional (sort of)
# SEE BELOW
conda install -c conda-forge keras
conda update --all # final 'cleanup' command - ensures package compatibility

If you plan on using a GPU (highly recommended), you'll need to first download CUDA - instructions here (get CUDA 10 instead of 9 in the article). Then run conda install tensorflow-gpu as in the article.

Then, in Spyder: Tools -> Preferences -> PYTHONPATH manager -> add all folders of the modules/data you plan to use, so you don't have to %cd each time or worry about relative pathing and can import directly. Lastly, make sure Anaconda & Spyder use the right Python interpreter.
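
Outside Spyder, a rough script-level stand-in for the PYTHONPATH manager step is to extend `sys.path` yourself, and `sys.executable` shows which interpreter is actually running the script. This is only a sketch: `add_module_folder` is a hypothetical helper name (not part of Spyder or Anaconda), and the folder you pass in is whichever directory holds your own modules.

```python
import sys
from pathlib import Path


def add_module_folder(folder):
    """Put `folder` at the front of sys.path (only once) and return its absolute path.

    Hypothetical helper for illustration; not a Spyder or Anaconda API.
    """
    resolved = str(Path(folder).resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved


if __name__ == "__main__":
    # Shows which Python interpreter is running this script.
    print("Interpreter:", sys.executable)
    # Example: make the current working directory importable.
    print("Added to sys.path:", add_module_folder("."))
```

If the interpreter printed is not the one from your Anaconda environment, the packages installed above will not be the ones your script sees.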

Restart Spyder, run scripts - assuming no bugs, all should be well.
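
Independent of the IDE, one way plain Python can end up "running more than one script" is through an import: if anything `Predict.py` imports (for example `WordModel.py`) itself imports the training script, all of that script's top-level code runs again, including `model.fit()`. Whether that is what happens here is an assumption, since `WordModel.py` isn't shown. The sketch below uses hypothetical stand-in files (`training_script.py` and `predict_script.py`, which only print) to show the effect and how an `if __name__ == "__main__":` guard avoids it.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Stand-in for a training script with no guard: everything runs on import.
UNGUARDED_TRAINING = 'print("training...")\n'

# Same stand-in with the training call behind a __main__ guard.
GUARDED_TRAINING = (
    "def train():\n"
    '    print("training...")\n'
    "\n"
    'if __name__ == "__main__":\n'
    "    train()\n"
)

# Stand-in for a prediction script that imports the training module.
PREDICT = (
    "import training_script\n"
    'print("predicting...")\n'
)


def run_predict(training_source):
    """Write both stand-in scripts to a temp folder, run the predict one, return its stdout."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "training_script.py").write_text(training_source)
        Path(tmp, "predict_script.py").write_text(PREDICT)
        result = subprocess.run(
            [sys.executable, "predict_script.py"],
            cwd=tmp, capture_output=True, text=True, check=True,
        )
    return result.stdout


if __name__ == "__main__":
    print("Unguarded:", run_predict(UNGUARDED_TRAINING).split())
    print("Guarded:  ", run_predict(GUARDED_TRAINING).split())
```

If this turns out to be the cause, moving the data loading, `model.fit()`, and `model.save()` calls in the training file under such a guard keeps them from running on import.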

OverLordGoldDragon