I am trying to run some example deep learning Python 3 code on Databricks with a GPU. The code is from https://www.tensorflow.org/tutorials/keras/text_classification_with_hub#evaluate_the_model
I got these results:
training loss: 0.0762 - training accuracy: 0.9929
validation loss: 0.5734 - validation accuracy: 0.8628
The example said
"This fairly naive approach achieves an accuracy of about 87%. With more advanced approaches, the model should get closer to 95%."
I want to find out how to improve the accuracy.
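For reference, my setup is essentially the tutorial's. The sketch below is roughly what I run; the tfds splits, batch size of 512, and 10 epochs are the tutorial's defaults, so the exact values may differ slightly from my actual run:

import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_datasets as tfds

# IMDB reviews split into train / validation / test sets, as in the tutorial
train_data, validation_data, test_data = tfds.load(
    name="imdb_reviews",
    split=('train[:60%]', 'train[60%:]', 'test'),
    as_supervised=True)

# Pretrained text embedding from TF Hub, fine-tuned during training
embedding = "https://tfhub.dev/google/tf2-preview/gnews-swivel-20dim/1"
hub_layer = hub.KerasLayer(embedding, input_shape=[],
                           dtype=tf.string, trainable=True)

# Baseline model from the tutorial (before my regularization changes)
model = tf.keras.Sequential([
    hub_layer,
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(1)])

# The last layer outputs logits, so the loss is computed from logits
model.compile(optimizer='adam',
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])

history = model.fit(train_data.shuffle(10000).batch(512),
                    epochs=10,
                    validation_data=validation_data.batch(512),
                    verbose=1)

results = model.evaluate(test_data.batch(512), verbose=2)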
From the results (training accuracy 0.99 vs. validation accuracy 0.86), I think the model is overfitting, so I tried adding L1/L2 regularization and dropout:
embedding = "https://tfhub.dev/google/tf2-preview/gnews-swivel-20dim/1"
hub_layer = hub.KerasLayer(embedding, input_shape=[],
dtype=tf.string, trainable=True)
tf.keras.regularizers.l1_l2(l1=0.04, l2=0.01) # L1 + L2 penalties
model = tf.keras.Sequential()
model.add(hub_layer)
model.add(tf.keras.layers.Dense(8, activation='relu'))
model.add(tf.keras.layers.Dropout(rate=0.3))
model.add(tf.keras.layers.Dense(1))
I have tried different dropout rates (0.2, 0.3, 0.5, 0.7) and L1/L2 regularization strengths (0.01, 0.02, 0.04), roughly in the loop sketched below.
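The sweep itself was just a nested loop, roughly like this minimal sketch; build_model is a hypothetical helper wrapping the model definition above, it reuses embedding, train_data, and validation_data from the earlier snippet, and it applies the same strength to both l1 and l2, which is a simplification of what I actually varied:

# Hypothetical helper that rebuilds the model above with the given settings;
# embedding, train_data and validation_data come from the earlier snippet
def build_model(dropout_rate, reg_strength):
    reg = tf.keras.regularizers.l1_l2(l1=reg_strength, l2=reg_strength)
    m = tf.keras.Sequential([
        hub.KerasLayer(embedding, input_shape=[],
                       dtype=tf.string, trainable=True),
        tf.keras.layers.Dense(8, activation='relu', kernel_regularizer=reg),
        tf.keras.layers.Dropout(rate=dropout_rate),
        tf.keras.layers.Dense(1)])
    m.compile(optimizer='adam',
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])
    return m

for dropout_rate in (0.2, 0.3, 0.5, 0.7):
    for reg_strength in (0.01, 0.02, 0.04):
        model = build_model(dropout_rate, reg_strength)
        history = model.fit(train_data.shuffle(10000).batch(512),
                            epochs=10,
                            validation_data=validation_data.batch(512),
                            verbose=0)
        # Compare the last-epoch validation metrics across settings
        print(dropout_rate, reg_strength,
              history.history['val_loss'][-1],
              history.history['val_accuracy'][-1])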
I have also reduced the units in the first hidden layer from 16 to 8, and I have tried the suggestions from "Reducing (Versus Delaying) Overfitting in Neural Network" and "How to reduce overfitting in neural networks?".
But there is no improvement. How can I reduce the overfitting?
thanks