Usually, when using a neural network, I do the normalization like this:

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
train_X = scaler.fit_transform(train_X)
test_X = scaler.transform(test_X)
That is, I fit the scaler after the split, so that no information leaks from the test set into the training set. But I am having doubts about this when using an LSTM.
Imagine that the last sequence in my train set for an LSTM is X = [x6, x7, x8], Y = [x9].
Then the first sequence in my test set should be X = [x7, x8, x9], Y = [x10].
So, does it make sense to normalize the data after splitting if I end up mixing values from the two sets in the X of the test set? Or should I normalize the entire dataset beforehand with
scaler = StandardScaler()
data = scaler.fit_transform(data)
and then do the split?
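To make the setup concrete, here is a minimal sketch of what I mean, using a hypothetical 1-D series x0..x9 with window length 3 (the `make_windows` helper and the split point are just illustrative, not part of my real pipeline). One possible middle ground is to fit the scaler on the training portion only, but transform the whole series before windowing, so overlapping windows are at least scaled consistently:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical 1-D series x0 .. x9, window length 3 as in the example above.
data = np.arange(10, dtype=float).reshape(-1, 1)
split = 8  # first 8 points are "train", the rest "test"

# Fit the scaler on the training portion only, then transform the
# full series once, before building the sliding windows.
scaler = StandardScaler()
scaler.fit(data[:split])
scaled = scaler.transform(data)

def make_windows(series, window=3):
    """Build (X, y) pairs: X = series[i:i+window], y = series[i+window]."""
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window, 0])
        y.append(series[i + window, 0])
    return np.array(X), np.array(y)

X, y = make_windows(scaled)
# The last window is X = [x6, x7, x8] -> y = x9; any test window that
# reuses x7, x8 gets values from the same train-fitted scaler.
```

This avoids a test window mixing values produced by two different scalers, while still keeping the test set out of the scaler's fit, but I am not sure it is the standard practice.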