I'm having trouble understanding how to retrieve predictions from a Keras model.
I want to build a simple next-word prediction system, but I don't know how to output the full probability distribution over the vocabulary for each word.
This is my code right now:
import numpy as np
from keras.models import Sequential
from keras.layers import Embedding, Bidirectional, LSTM, Dense
from keras.optimizers import RMSprop

model = Sequential()
model.add(Embedding(vocab_size, embedding_size, input_length=55,
                    weights=[pretrained_weights]))
model.add(Bidirectional(LSTM(units=embedding_size)))
model.add(Dense(23690, activation='softmax'))  # 23690 is the total number of classes

model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(lr=0.0005),
              metrics=['accuracy'])

# fit network
model.fit(np.array(X_train), np.array(y_train), epochs=10)
score = model.evaluate(x=np.array(X_test), y=np.array(y_test), batch_size=32)
prediction = model.predict(np.array(X_test), batch_size=32)
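To make the question concrete, this is roughly how I inspect the output right now (index_to_word is a hypothetical dict I would build from my tokenizer, not something Keras provides):

print(prediction.shape)         # (num_test_samples, 23690): one probability row per input sequence
top = np.argmax(prediction[0])  # most probable class index for the first test sequence
print(top, prediction[0][top])  # index_to_word[top] would map it back to a word

So I get exactly one distribution per input sequence, which leads to my questions below.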
First question: my training set is a list of sentences, vectorized and transformed into word indices. I saw some examples online where people divide X_train and y_train like this:
X, y = sequences[:,:-1], sequences[:,-1]
y = to_categorical(y, num_classes=vocab_size)
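If I understand that snippet correctly, it uses everything but the last index of each (padded) row as the input and the last index as the one-hot target. A minimal example of what I think it does:

import numpy as np
from keras.utils import to_categorical

sequences = np.array([[10, 9, 4, 5]])          # one sequence of word indices
X, y = sequences[:, :-1], sequences[:, -1]     # X == [[10, 9, 4]], y == [5]
y = to_categorical(y, num_classes=vocab_size)  # one-hot row of length vocab_size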
Should I instead transform X_train and y_train into sliding sequences (sketched in code after the example), where for example I would have
X = [[10, 9, 4, 5]]
X_train = [[10, 9], [9, 4], [4, 5]]
y_train = [[9], [4], [5]]
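Concretely, the transformation I have in mind would be something like this sketch (two-word windows, matching the toy example above):

def sliding_pairs(seq):
    # Build overlapping two-word windows; the target is the word
    # I want the model to output for each window.
    X, y = [], []
    for i in range(len(seq) - 1):
        X.append(seq[i:i + 2])
        y.append(seq[i + 1])
    return X, y

X_train, y_train = sliding_pairs([10, 9, 4, 5])
# X_train == [[10, 9], [9, 4], [4, 5]]
# y_train == [9, 4, 5]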
Second question: right now the model returns only one prediction per input sequence. How can I get a prediction for every word instead? I want an array of output probabilities for each word position, not a single output. I read that I could use a TimeDistributed layer, but I have problems with the input, because the Embedding layer takes a 2D input while TimeDistributed expects a 3D input.
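For reference, this is the kind of TimeDistributed architecture I have been trying; return_sequences=True on the LSTM is my assumption for how to produce the 3D tensor it expects, and this is exactly the part I can't get right:

from keras.layers import TimeDistributed

model = Sequential()
model.add(Embedding(vocab_size, embedding_size, input_length=55,
                    weights=[pretrained_weights]))
# return_sequences=True keeps one output vector per timestep (3D),
# which is what TimeDistributed(Dense) needs as input
model.add(Bidirectional(LSTM(units=embedding_size, return_sequences=True)))
model.add(TimeDistributed(Dense(vocab_size, activation='softmax')))

If this is right, predict() should return an array of shape (num_samples, 55, vocab_size), i.e. one probability distribution per word position, but I'm not sure how the targets have to be shaped in that case.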
Thank you for the help!