I've written an LSTM network with Keras (code below):
import pandas as pd
import numpy as np
from sklearn.utils import shuffle
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import LSTM, LeakyReLU, Dropout, Flatten, Dense
from keras import optimizers
import keras_metrics
# f1 is a custom metric function defined elsewhere in my code

df = pd.read_csv("../data/training_data.csv")
# Group by and pivot the data
group_index = df.groupby('group').cumcount()
data = (df.set_index(['group', group_index])
.unstack(fill_value=0).stack())
# get NumPy arrays for the data and the labels
# for the labels we take the first value in each group, since it is the same for all rows of the group
target = np.array(data['label'].groupby(level=0).apply(lambda x: [x.values[0]]).tolist())
data = data.loc[:, data.columns != 'label']
data = np.array(data.groupby(level=0).apply(lambda x: x.values.tolist()).tolist())
# shuffle the training set
data, target = shuffle(data, target)
# split the data into train and test sets
x_train, x_test, y_train, y_test = train_test_split(data, target, test_size=0.2, random_state=4)
# ADAM Optimizer with learning rate decay
opt = optimizers.Adam(lr=0.0001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0001)
# build the model
model = Sequential()
# data has shape (num_groups, timesteps, num_features)
num_features = data.shape[2]
num_samples = data.shape[1]  # number of timesteps per group
model.add(LSTM(8, batch_input_shape=(None, num_samples, num_features), return_sequences=True, activation='sigmoid'))
model.add(LeakyReLU(alpha=.001))
model.add(Dropout(0.2))
model.add(LSTM(4, return_sequences=True, activation='sigmoid'))
model.add(LeakyReLU(alpha=.001))
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer=opt,
              metrics=['accuracy', keras_metrics.precision(), keras_metrics.recall(), f1])
model.summary()
# Training, getting the results history for plotting
history = model.fit(x_train, y_train, epochs=3000, validation_data=(x_test, y_test))
The monitored metrics are loss, accuracy, precision, recall, and F1 score.
I've noticed that the validation loss starts to climb at around 300 epochs, so I figured the model was overfitting. However, validation recall is still climbing and precision is slightly improving.
Why is that? Is my model overfitting?
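For reference, this is roughly how I look at the training history afterwards. The 'val_recall' / 'val_precision' key names here are just what I assume ends up in history.history for these metrics; I check history.history.keys() for the actual names first.

import matplotlib.pyplot as plt

# plot validation loss next to validation recall/precision over the epochs
plt.plot(history.history['loss'], label='train loss')
plt.plot(history.history['val_loss'], label='val loss')
plt.plot(history.history['val_recall'], label='val recall')
plt.plot(history.history['val_precision'], label='val precision')
plt.xlabel('epoch')
plt.legend()
plt.show()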