How to make a prediction on strings with LSTM

Question

I am trying to make a model with LSTM for chess move prediction.
I have a dataset of games and for each game I have the list of moves.

#for example:
[b3, c5, Bb2, Nc6, g3, d6, Bg2, Nf6, c4, a6, Nc3, ...]

Since LSTM models don't process strings, I converted the moves with a Keras tokenizer. More specifically, I first built the set of distinct moves and then encoded each move as an integer. I then put these integers in a list like this:

[7464, 2678, 2467, 1114, 6944, 2801, 7937, 7743, 1503, 6348, 5215, ...]
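
For illustration, the encoding step described above can be sketched without Keras as a plain dictionary lookup. The helper names `build_vocab` and `encode` are hypothetical, and the IDs this sketch produces will not match the tokenizer IDs shown above. Unlike the Keras `Tokenizer`, which lowercases text by default, this sketch preserves case.

```python
def build_vocab(moves):
    """Map each distinct move to an integer ID, starting at 1 (0 is left free for padding)."""
    vocab = {}
    for move in moves:
        if move not in vocab:
            vocab[move] = len(vocab) + 1
    return vocab


def encode(moves, vocab):
    """Replace every move string by its integer ID."""
    return [vocab[move] for move in moves]


game = ["b3", "c5", "Bb2", "Nc6", "g3", "d6", "Bg2", "Nf6", "c4", "a6", "Nc3"]
vocab = build_vocab(game)
encoded = encode(game, vocab)

# The inverse mapping turns predicted IDs back into move strings.
inverse_vocab = {move_id: move for move, move_id in vocab.items()}
decoded = [inverse_vocab[move_id] for move_id in encoded]
```

The IDs are arbitrary labels: nothing about the number assigned to a move says anything about the move itself.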

From this list I created two other lists, X and Y, formed like this:

#X:
[[7464, 2678, 2467, 1114, 6944],
 [2678, 2467, 1114, 6944, 2801],
 [2467, 1114, 6944, 2801, 7937],
 [1114, 6944, 2801, 7937, 7743],
 [6944, 2801, 7937, 7743, 1503],
 [2801, 7937, 7743, 1503, 6348],
 [7937, 7743, 1503, 6348, 5215], ...]

#Y: 
[2801, 7937, 7743, 1503, 6348, 5215, ...]
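
A minimal sketch of how X and Y can be built from the encoded list with a sliding window, assuming a window length of 5 as in the example above. The function name `make_windows` is hypothetical, not taken from the original code.

```python
def make_windows(sequence, n_steps):
    """Split a sequence into (window, next element) pairs."""
    X, Y = [], []
    # The last usable window starts n_steps elements before the end,
    # because every window needs one following element as its target.
    for i in range(len(sequence) - n_steps):
        X.append(sequence[i:i + n_steps])
        Y.append(sequence[i + n_steps])
    return X, Y


encoded = [7464, 2678, 2467, 1114, 6944, 2801, 7937, 7743, 1503, 6348, 5215]
X, Y = make_windows(encoded, n_steps=5)
```

With the 11 integers shown, this yields 6 windows. The first window is `[7464, 2678, 2467, 1114, 6944]` and its target is `2801`.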

Once I had created X and Y, I converted them into NumPy arrays and reshaped X as follows:

n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
        
#X:
[[[7464]
  [2678]
  [2467]
  [1114]
  [6944]]

 [[2678]
  [2467]
  [1114]
  [6944]
  [2801]] ...]

#Y:
[2801 7937 7743 1503 6348 5215 ...]

Before moving on to the model, one caveat: since this is a first run, I fit the model on just one batch to check that everything works.

I then passed X and Y, prepared as follows, to the model below for fitting:

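import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

n_steps = 5  # window length, inferred from the 5-element rows of X

# np.int was removed in NumPy 1.24; the built-in int works here
X = np.asarray(X).astype(int)
Y = np.asarray(Y).astype(int)

model = tf.keras.Sequential()
model.add(layers.LSTM(50, activation='relu', input_shape=(n_steps, n_features)))
model.add(layers.Dense(1))

model.compile(optimizer=tf.keras.optimizers.Adam(0.01), loss=tf.keras.losses.MeanSquaredError(), metrics=['accuracy'])
model.fit(X, Y, epochs=200, verbose=1)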

I then try to make a prediction with the first 5 elements of the match I used for training, but the prediction is wrong. Besides returning a move that is not possible to make, it returns a different move from the one it learned. Since I trained on only one match, I would expect it to give me a known result.

#A code example for predicting:
test_data = np.array([7464, 2678, 2467 ,1114 ,6944])
test_data = test_data.reshape((1, n_steps, n_features))
predictNextNumber = model.predict(test_data, verbose=1)
print(predictNextNumber)
predictNextNumber = int(predictNextNumber[0][0])  # take the scalar out of the (1, 1) array
text = tokenizer.sequences_to_texts([[predictNextNumber]])
print(text)

#Output:
1/1 [==============================] - 0s 493ms/step
[[1061.2649]]
['g1=q#']

Can anyone tell me where I am going wrong?

I have recently considered one possibility. Could it be that the model I am using treats the moves as numbers to be predicted, and is therefore looking for a numerical relationship between the input integers and the target? If so, I would need a model that learns the positions of the numbers in the sequence rather than a mathematical correlation between them.
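
The distinction raised here can be pictured with a small sketch that contrasts the two views of an integer-encoded move. Everything in it is illustrative: the four-move vocabulary, the regression value and the class scores are made up, not produced by the model above. In Keras, the classification view is commonly expressed with an `Embedding` input layer, a `Dense(vocab_size, activation='softmax')` output and the `sparse_categorical_crossentropy` loss, though whether that resolves the behaviour described above is not verified here.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical vocabulary: the IDs are labels, not quantities.
id_to_move = {0: "b3", 1: "c5", 2: "Bb2", 3: "Nc6"}

# Regression view: one real number is truncated to an ID, so the
# decoded move depends on which move happens to hold that number.
regression_output = 2.9  # made-up value
regressed_id = int(regression_output)

# Classification view: one score per vocabulary entry, and the
# prediction is the entry with the highest probability.
class_scores = [0.1, 0.3, 0.2, 2.5]  # made-up values
probs = softmax(class_scores)
predicted_id = max(range(len(probs)), key=probs.__getitem__)

print(id_to_move[regressed_id], id_to_move[predicted_id])
```

In the regression view, an output of 2.9 is "close" to both IDs 2 and 3 even though the corresponding moves are unrelated. In the classification view, each move gets its own probability and no ordering between IDs is implied.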

One last question: once everything works with one match, how do I pass at least half of the matches in the dataset to the model to get a more consistent fit?
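
One possible way to prepare several matches, sketched under the assumption that each match is already encoded as its own list of integers: window every match separately and concatenate the results, so that no window spans two different matches. The names `make_windows` and `windows_from_games` are hypothetical, and the toy data is made up.

```python
def make_windows(sequence, n_steps):
    """Split one sequence into (window, next element) pairs."""
    X, Y = [], []
    for i in range(len(sequence) - n_steps):
        X.append(sequence[i:i + n_steps])
        Y.append(sequence[i + n_steps])
    return X, Y


def windows_from_games(games, n_steps):
    """Window each game on its own, then pool all pairs together."""
    X, Y = [], []
    for game in games:
        game_X, game_Y = make_windows(game, n_steps)
        X.extend(game_X)
        Y.extend(game_Y)
    return X, Y


# Toy data: three already-encoded games of different lengths.
games = [
    [1, 2, 3, 4, 5, 6, 7],
    [8, 9, 10, 11, 12, 13],
    [1, 2, 3],  # shorter than n_steps + 1, so it yields no windows
]
X, Y = windows_from_games(games, n_steps=5)
```

Passing only part of the dataset then amounts to slicing `games` (for example `games[:len(games) // 2]`) before calling `windows_from_games`. The pooled X and Y can be converted to arrays and reshaped in the same way as for a single match.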

I haven't done any chess NN stuff but I am pretty sure your model representation isn't optimal. It does not tell the network anything about the global game state. Also using an MSE loss will probably not meet your optimization goal. Have you checked [this](https://stackoverflow.com/questions/753954/how-to-program-a-neural-network-for-chess) ? — lwi, Jan 02 '22 at 20:50
thank you for your reply @lwi . I have not checked this, I see it now, thank you! How can I pass the state to the neural network? And most importantly, I don't know if I'm saying the wrong thing, chess is just for contextualisation. Any sequence of moves (in any context and not focused on chess) should be analysed in the same way, right? Anyway, as I said my concern was just to have a model that is not so detailed, how can I elaborate it? — ValerioGoretti, Jan 02 '22 at 21:22
