I have a dataset containing activity data for 1000 users. Because activity patterns differ from one user to another, I want to pass the user attribute to the LSTM RNN model as well, so that the model can learn each user's behavior better. A snippet of my dataset is shown below:
https://i.stack.imgur.com/poL31.jpg
I tried one-hot encoding and binary encoding of the categorical information, but the model did not produce good results. However, applying the LSTM RNN model to a single user's data (excluding the user variable) produces good results.
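To make the encoding concrete, here is a minimal sketch of one way a one-hot user vector can be repeated across time steps and concatenated with the numerical features. The helper name `add_user_one_hot`, the array names, and the toy sizes are assumptions for illustration, not the actual preprocessing code.

```python
import numpy as np

def add_user_one_hot(windows, user_ids, n_users):
    """Append a one-hot user vector to every time step of each window.

    windows:  float array, shape (n_samples, timesteps, n_numeric)
    user_ids: int array, shape (n_samples,), values in [0, n_users)
    returns:  array, shape (n_samples, timesteps, n_numeric + n_users)
    """
    n_samples, timesteps, _ = windows.shape
    # Row i of the identity matrix is the one-hot vector for user i.
    one_hot = np.eye(n_users, dtype=windows.dtype)[user_ids]  # (n_samples, n_users)
    # Repeat the same user vector at every time step of the window.
    tiled = np.repeat(one_hot[:, np.newaxis, :], timesteps, axis=1)
    return np.concatenate([windows, tiled], axis=-1)

# Toy example (hypothetical sizes): 4 windows, 5 time steps,
# 21 numerical features, 3 users.
windows = np.zeros((4, 5, 21), dtype=np.float32)
user_ids = np.array([0, 2, 1, 2])
x = add_user_one_hot(windows, user_ids, n_users=3)
print(x.shape)  # (4, 5, 24)
```

With 1000 users, a full one-hot encoding built this way would add 1000 mostly-zero columns to every time step, which the autoencoder then also has to reconstruct.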
A snippet of my LSTM autoencoder model for anomaly detection is shown below:
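# Imports assumed to come from tf.keras; adjust if you use standalone Keras.
from tensorflow.keras import regularizers
from tensorflow.keras.layers import Input, LSTM, RepeatVector, TimeDistributed, Dense
from tensorflow.keras.models import Model

# Encoder: compress each (timesteps, n_features) window into a hidden_dim vector.
inputs = Input(shape=(timesteps, n_features))
L1 = LSTM(encoding_dim, activation='relu', return_sequences=True,
          kernel_regularizer=regularizers.l2(0.00))(inputs)
L2 = LSTM(hidden_dim, activation='relu', return_sequences=False)(L1)
# Decoder: repeat the encoding for every time step and reconstruct the input.
L3 = RepeatVector(timesteps)(L2)
L4 = LSTM(hidden_dim, activation='relu', return_sequences=True)(L3)
L5 = LSTM(encoding_dim, activation='relu', return_sequences=True)(L4)
output = TimeDistributed(Dense(n_features))(L5)
lstm_model = Model(inputs=inputs, outputs=output)
lstm_model.summary()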
So far I have tried:
n_features = 22  # number of features: 1 categorical (one-hot encoded) + 21 numerical
encoding_dim = 16
hidden_dim = 8
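Since the autoencoder is used for anomaly detection, windows are typically scored by their reconstruction error. The following is a minimal sketch of that scoring step, assuming mean squared error per window and a percentile threshold. The function names, the 99th percentile, and the synthetic data are assumptions for illustration, not settings from my experiment.

```python
import numpy as np

def reconstruction_error(x, x_hat):
    """Mean squared error per window, averaged over time steps and features."""
    return np.mean((x - x_hat) ** 2, axis=(1, 2))

def flag_anomalies(train_errors, test_errors, percentile=99.0):
    """Flag windows whose error exceeds a percentile of the training errors."""
    threshold = np.percentile(train_errors, percentile)
    return test_errors > threshold, threshold

# Synthetic stand-in for model output; in practice x_hat would come from
# something like lstm_model.predict(x).
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 10, 22))               # (n_samples, timesteps, n_features)
x_hat = x + rng.normal(scale=0.1, size=x.shape)  # near-perfect reconstruction
x_hat[0] += 5.0                                  # make window 0 badly reconstructed

errors = reconstruction_error(x, x_hat)
flags, threshold = flag_anomalies(errors[1:], errors)
print(flags[0])  # True
```

Computing this error separately per user is one way to see whether the poor results come from a few users or from all of them.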
How can I better handle a categorical attribute, i.e. the user variable, with this model?