I am building a hashtag recommendation model for Twitter media posts. It takes the tweet text as input, maps it to 300-dimensional word embeddings, and classifies it into one of 198 hashtags. When I run the model, I get 0.9998 accuracy from the very first epoch, and it never changes afterwards. What is wrong with my model?

import numpy as np
import pickle
from keras.layers.normalization import BatchNormalization
from keras.models import Sequential, load_model
from keras.layers import Dense, Dropout, Activation,LSTM, Embedding
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras import regularizers, initializers
package="2018_pickle"
with open(path1,"rb") as f:
    maxLen,l_h2i,l_w2i=pickle.load(f)
with open(path2,"rb") as f:
    X_train,X_test,X_train_indices,X_test_indices=pickle.load(f)
with open(path3,"rb") as f:
    Y_train,Y_test,Y_train_oh,Y_test_oh=pickle.load(f)
with open(path4,"rb") as f:
    emd_matrix=pickle.load(f)


if __name__ == '__main__':
    modelname="model_1"
    train=False
    vocab_size = len(emd_matrix)
    emd_dim=emd_matrix.shape[1]
    if train:
        model = Sequential()
        model.add(Embedding(vocab_size , emd_dim, weights=[emd_matrix]
                            ,input_length=maxLen,trainable=False))
        model.add(LSTM(256,return_sequences=True,activation="relu",
                       kernel_regularizer=regularizers.l2(0.01),
                       kernel_initializer=initializers.glorot_normal(seed=None)))
        model.add(LSTM(256,return_sequences=True,activation="relu",
                       kernel_regularizer=regularizers.l2(0.01),
                       kernel_initializer=initializers.glorot_normal(seed=None)))
        model.add(LSTM(256,return_sequences=False,activation="relu",
                       kernel_regularizer=regularizers.l2(0.01),
                       kernel_initializer=initializers.glorot_normal(seed=None)))
        model.add(Dense(198,activation='softmax'))
        model.compile(loss='binary_crossentropy', optimizer='adam',
                      metrics=['accuracy'])
        checkpoint = ModelCheckpoint(filepath, monitor="loss",
                                     verbose=1, save_best_only=True, mode='min')
        reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5,
                                      patience=2, min_lr=0.000001)
        history=model.fit(X_train_indices, Y_train_oh, batch_size=2048,
                          epochs=5, validation_split=0.1, shuffle=True,
                          callbacks=[checkpoint, reduce_lr])


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_10 (Embedding)     (None, 54, 300)           22592100  
_________________________________________________________________
lstm_18 (LSTM)               (None, 54, 256)           570368    
_________________________________________________________________
lstm_19 (LSTM)               (None, 54, 256)           525312    
_________________________________________________________________
lstm_20 (LSTM)               (None, 256)               525312    
_________________________________________________________________
dense_7 (Dense)              (None, 198)               50886     
=================================================================
Total params: 24,263,978
Trainable params: 1,671,878
Non-trainable params: 22,592,100
_________________________________________________________________
ramin karimian

1 Answer

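Most probably, this is due to the mistaken use of loss='binary_crossentropy' in a multi-class classification problem (see "Keras binary_crossentropy vs categorical_crossentropy performance?" for more details). When the loss is binary_crossentropy, Keras resolves metrics=['accuracy'] to binary accuracy, which scores each of the 198 outputs separately. Since nearly all entries of a one-hot target are 0, a model that outputs values near zero everywhere already scores close to 1, whether or not the predicted class is right.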

You should change your model compilation to

model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])
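
To see why the reported number can be misleading, here is a minimal pure-Python sketch comparing the two ways of counting accuracy. The helpers binary_accuracy and categorical_accuracy below are simplified re-implementations written for illustration, not the Keras functions themselves, and the four labels and the uniform "model output" are made up for the example.

```python
NUM_CLASSES = 198  # number of hashtag classes in the question


def one_hot(index, num_classes=NUM_CLASSES):
    """Return a one-hot list with a 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def binary_accuracy(y_true, y_pred, threshold=0.5):
    """Element-wise accuracy: every one of the 198 outputs is scored separately."""
    matches = 0
    total = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            matches += int((p > threshold) == (t > 0.5))
            total += 1
    return matches / total


def categorical_accuracy(y_true, y_pred):
    """Sample-wise accuracy: a sample counts only if the argmax class is right."""
    correct = 0
    for true_row, pred_row in zip(y_true, y_pred):
        true_class = true_row.index(max(true_row))
        pred_class = pred_row.index(max(pred_row))
        correct += int(true_class == pred_class)
    return correct / len(y_true)


# Hypothetical data: four samples with made-up class labels.
y_true = [one_hot(c) for c in (3, 17, 42, 197)]
# An uninformative "model": a uniform softmax output for every sample.
y_pred = [[1.0 / NUM_CLASSES] * NUM_CLASSES for _ in y_true]

bin_acc = binary_accuracy(y_true, y_pred)
cat_acc = categorical_accuracy(y_true, y_pred)
print("binary accuracy:     ", round(bin_acc, 4))
print("categorical accuracy:", round(cat_acc, 4))
```

In this toy setup the element-wise score is 197/198 even though no sample is classified correctly, which is why a near-perfect accuracy that never moves is a hint that the metric, not the model, is at fault.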
desertnaut
  • Thanks for your answer. I made that change, but now there is another problem: the accuracy is below 0.001. @desertnaut – ramin karimian Sep 04 '19 at 16:17