
I have a dataset where I need to predict a target that is either 0 or 1.
It is also useful for me if the prediction comes back near 0 (like 0.20) or near 1 (like 0.89), and so on.

My model structure is this:

from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, LSTM, Dense, Dropout, BatchNormalization
from keras import regularizers

model = Sequential()

model.add(Conv1D(filters=32, kernel_size=2, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=1, strides=1))
model.add(LSTM(128, return_sequences=True, recurrent_dropout=0.2, activation='relu'))
model.add(Dense(128, activation='relu',
                kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4),
                bias_regularizer=regularizers.l2(1e-4),
                activity_regularizer=regularizers.l2(1e-5)))
model.add(Dropout(0.4))

model.add(Conv1D(filters=32, kernel_size=2, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=1, strides=1))
model.add(LSTM(64, return_sequences=True, activation='relu'))
model.add(Dense(64, activation='relu',
                kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4),
                bias_regularizer=regularizers.l2(1e-4),
                activity_regularizer=regularizers.l2(1e-5)))
model.add(Dropout(0.4))

model.add(Conv1D(filters=32, kernel_size=2, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=1, strides=1))
model.add(LSTM(32, return_sequences=True, recurrent_dropout=0.2, activation='relu'))
model.add(Dense(32, activation='relu',
                kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4),
                bias_regularizer=regularizers.l2(1e-4),
                activity_regularizer=regularizers.l2(1e-5)))
model.add(Dropout(0.4))

model.add(BatchNormalization())

model.add(Dense(1, activation='linear'))

model.compile(optimizer='rmsprop', loss='mse', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=1000, batch_size=16, verbose=1, validation_split=0.1, callbacks=callback)

Summary of model is here: https://pastebin.com/Ba6ErEzj

The training output looks like this:

Epoch 58/1000
277/277 [==============================] - 1s 5ms/step - loss: 0.2510 - accuracy: 0.4937 - val_loss: 0.2523 - val_accuracy: 0.4878
Epoch 59/1000
277/277 [==============================] - 1s 5ms/step - loss: 0.2515 - accuracy: 0.4941 - val_loss: 0.2504 - val_accuracy: 0.5122

How can I improve this? An accuracy around 0.50 on a 0-or-1 output is useless.
This is my Colab code.

Mariano
  • If you need to predict *the target, that it is 0 or 1*, then you are doing **classification**, not regression. Try changing to `model.add(Dense(1, activation='sigmoid'))` and loss `'binary_crossentropy'`. – Frightera Mar 11 '21 at 08:35
  • @Frightera already tried that, but nothing improved. Accuracy is the same, around 0.50. – Mariano Mar 11 '21 at 08:37
  • You have L1-L2 Regularizers, Dropout, and BatchNorm at the same time, so you might be underfitting the data because of the regularizations. – Frightera Mar 11 '21 at 08:40
  • Where is it better to set L1-L2 regularization? At the last layer only? – Mariano Mar 11 '21 at 08:44
  • If you conclude that you are overfitting, you can use it. But first try *dropouts*. – Frightera Mar 11 '21 at 08:54
  • I've removed BatchNorm, the kernel/bias and other regularizers, keeping only Dropout after every Dense, but it is the same: 0.5. – Mariano Mar 11 '21 at 09:12

1 Answer


To wrap up the suggestions (some already offered in the comments), with some justification...

Mistakes. You are in a binary classification setting, so:

  • Using MSE is wrong; you should use loss='binary_crossentropy'
  • In your last single-node layer, you should use activation='sigmoid'.
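To make these two points concrete, here is a minimal pure-Python sketch (plain functions, not Keras; the names are illustrative) of what a sigmoid head paired with binary cross-entropy actually computes, and why it matches the "near 0 / near 1" requirement in the question:

```python
import math

def sigmoid(z):
    # Squashes any real-valued logit into (0, 1), so the output can be
    # read directly as P(y = 1) -- e.g. 0.20 is "near 0", 0.89 "near 1".
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(y_true, p):
    # Unlike MSE, this loss punishes confident wrong predictions very
    # hard, which is what pushes outputs toward 0 or 1 during training.
    eps = 1e-7  # clip to avoid log(0)
    p = min(max(p, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

p = sigmoid(2.0)                            # about 0.88, i.e. "near 1"
loss_if_label_1 = binary_crossentropy(1, p)  # small: prediction agrees
loss_if_label_0 = binary_crossentropy(0, p)  # large: prediction disagrees
```

In the Keras model this pairing is simply `Dense(1, activation='sigmoid')` together with `loss='binary_crossentropy'`.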

Best practices. Things like dropout, batch normalization, and kernel & bias regularizers are used for regularization, i.e. (roughly speaking) to avoid overfitting. They should not be used by default, and doing so is well known to prevent learning (as seems to be the case here):

  • Remove all dropout layers
  • Remove all batch normalization layers
  • Remove all kernel, bias, and activity regularizers.

You can consider adding some of these back step by step later, but only if you see signs of overfitting.
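Putting the above together, a stripped-down baseline could look like the sketch below (the input shape and layer sizes are illustrative placeholders, not taken from the question; adapt them to your actual `X_train.shape[1:]`):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, LSTM, Dense

# Bare-bones baseline: no dropout, no batch norm, no regularizers.
model = Sequential([
    Conv1D(filters=32, kernel_size=2, padding='same', activation='relu',
           input_shape=(30, 1)),   # placeholder: (timesteps, features)
    MaxPooling1D(pool_size=2),
    LSTM(64),                      # last recurrent layer: no return_sequences
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid')  # single sigmoid unit for binary output
])
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])
```

Only once a bare model like this learns something above chance is it worth reintroducing regularization, one piece at a time.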

General advice. Nowadays, the usual first choice of optimizer is Adam, so change to optimizer='adam' as a first step.

That said, at the end of the day, everything depends on your data (both their quantity & quality) and the particular problem to be addressed. Experimentation is king (but keeping in mind the general principles stated above).

desertnaut
  • Thanks for your full explanation. I've tried to fix my problems, but accuracy is the same. I've edited my original post with a Colab link. There is something I'm doing wrong. – Mariano Mar 11 '21 at 11:20
  • @RogerAI what do yo mean "*near 0*" and "*near 1*"? If you are interested in predicting *exact* values, then this is a *regression* problem, and [accuracy is meaningless](https://stackoverflow.com/questions/48775305/what-function-defines-accuracy-in-keras-when-the-loss-is-mean-squared-error-mse). – desertnaut Mar 11 '21 at 11:26
  • Yes, I need to predict the exact value, but it is tolerable to make a regression problem, and get back a prediction like 0.90 0.80 etc. – Mariano Mar 11 '21 at 11:29
  • @RogerAI then, as said, accuracy is meaningless, thus you cannot say that "*accuracy around 0.50 on 0 or 1 output is useless*". You should stick to MSE for assessing the performance. – desertnaut Mar 11 '21 at 11:31
  • @RogerAI This is kinda misleading: *I need to predict the target, that it is 0 or 1*. – Frightera Mar 11 '21 at 11:35
  • @desertnaut, you are right. I've changed the loss to mean_squared_error, but nothing better. I've shared a Colab; I think with it the issue is easier to catch. – Mariano Mar 11 '21 at 11:36
  • @RogerAI again, what do you mean "*nothing better*"? An "accuracy" of ~ 0.5? I have been trying to explain that this **does not mean anything** (good or bad) in a *regression* problem (i.e. MSE loss), and you should not use the accuracy at all - it is useless/meaningless/inappropriate as a metric. – desertnaut Mar 11 '21 at 11:38
  • @desertnaut thank you, I've read your post: "accuracy is meaningless". There is something I still don't understand about how to solve my problem: is it regression or classification, what do you think? – Mariano Mar 11 '21 at 11:58
  • @RogerAI cannot say, neither is this the appropriate forum. I suggest posting a question at [Data Science SE](https://datascience.stackexchange.com/help/on-topic); no need to post code, but be as *specific* and *detailed* you can about the *conceptual* description of the problem. – desertnaut Mar 11 '21 at 12:04
  • @RogerAI when I suggested to be *detailed* and *specific* about the *conceptual* part of the problem (i.e. is it better to approach it as a regression or a classification one?), I certainly did not mean [this](https://datascience.stackexchange.com/questions/90516/keras-model-not-produce-a-good-prediction); and what is "*this mother*"? :( – desertnaut Mar 11 '21 at 12:48
  • @desertnaut sorry, wrong word. I meant "this model". – Mariano Mar 11 '21 at 13:21
  • The issue was the batch_size :) it was too low for fitting as well. – Mariano Mar 12 '21 at 08:28