
I have 8 classes that I want to predict from input text. Here is my code for preprocessing the data:

from keras.preprocessing.text import Tokenizer
from keras.preprocessing import sequence

# build a vocabulary restricted to the 1,000 most frequent words
num_max = 1000
tok = Tokenizer(num_words=num_max)
tok.fit_on_texts(x_train)
mat_texts = tok.texts_to_matrix(x_train, mode='count')

# convert each text to a sequence of word indices
max_len = 100
cnn_texts_seq = tok.texts_to_sequences(x_train)
print(cnn_texts_seq[0])

[12, 4, 303]

# padding the sequences
cnn_texts_mat = sequence.pad_sequences(cnn_texts_seq,maxlen=max_len)
print(cnn_texts_mat[0])
print(cnn_texts_mat.shape)

[  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0  12   4 303]

(301390, 100)

Below is the structure of my model which contains an embedding layer:

from keras.models import Sequential
from keras.layers import Embedding, Dropout, Dense
from keras.optimizers import SGD

max_features = 20000
max_features = cnn_texts_mat.shape[1]  # note: this overrides the 20000 above with the sequence length (100)
maxlen = 100
embedding_size = 128


model = Sequential()
model.add(Embedding(max_features, embedding_size, input_length=maxlen))
model.add(Dropout(0.2))

model.add(Dense(5000, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(600, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(units=y_train.shape[1], activation='softmax'))

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='binary_crossentropy',
              optimizer=sgd)

Below is the model summary:

model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_5 (Embedding)      (None, 100, 128)          12800     
_________________________________________________________________
dropout_13 (Dropout)         (None, 100, 128)          0         
_________________________________________________________________
dense_13 (Dense)             (None, 100, 5000)         645000    
_________________________________________________________________
dropout_14 (Dropout)         (None, 100, 5000)         0         
_________________________________________________________________
dense_14 (Dense)             (None, 100, 600)          3000600   
_________________________________________________________________
dropout_15 (Dropout)         (None, 100, 600)          0         
_________________________________________________________________
dense_15 (Dense)             (None, 100, 8)            4808      
=================================================================
Total params: 3,663,208
Trainable params: 3,663,208
Non-trainable params: 0

After this, I get the following error when I try to fit the model:

model.fit(x=cnn_texts_mat, y=y_train, epochs=2, batch_size=100)

ValueError                                Traceback (most recent call last)
<ipython-input-41-4b9da9914e7e> in <module>
----> 1 model.fit(x=cnn_texts_mat, y=y_train, epochs=2, batch_size=100)

~/.local/lib/python3.5/site-packages/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, **kwargs)
    950             sample_weight=sample_weight,
    951             class_weight=class_weight,
--> 952             batch_size=batch_size)
    953         # Prepare validation data.
    954         do_validation = False

~/.local/lib/python3.5/site-packages/keras/engine/training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_array_lengths, batch_size)
    787                 feed_output_shapes,
    788                 check_batch_axis=False,  # Don't enforce the batch size.
--> 789                 exception_prefix='target')
    790 
    791             # Generate sample-wise weight values given the `sample_weight` and

~/.local/lib/python3.5/site-packages/keras/engine/training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
    126                         ': expected ' + names[i] + ' to have ' +
    127                         str(len(shape)) + ' dimensions, but got array '
--> 128                         'with shape ' + str(data_shape))
    129                 if not check_batch_axis:
    130                     data_shape = data_shape[1:]

ValueError: Error when checking target: expected dense_15 to have 3 dimensions, but got array with shape (301390, 8)

1 Answer

Look at the output shape of the last layer in the model summary: it is (None, 100, 8). That is not what you are looking for. The label for each sample has a shape of (8,), not (100, 8). Why did this happen? A Dense layer is applied on the last axis of its input, so since the output of the Embedding layer is 3D, the outputs of all the following Dense layers are 3D as well.
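As a quick illustration (a standalone sketch, not taken from your code), you can confirm this behavior by checking the output shape of a Dense layer fed with the 3D output of an Embedding layer:

from keras.models import Sequential
from keras.layers import Embedding, Dense

# toy model: a Dense layer after an Embedding keeps the time dimension
demo = Sequential()
demo.add(Embedding(100, 128, input_length=100))  # output: (None, 100, 128)
demo.add(Dense(8, activation='softmax'))         # output: (None, 100, 8), still 3D
print(demo.output_shape)                         # (None, 100, 8)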

How to resolve this? One approach is to use a Flatten layer somewhere in your model (possibly right after the embedding layer). That way you would have a 2D output of shape (None, 8), which is what you want and is consistent with the shape of the labels; see the sketch after this paragraph.
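For example, a minimal sketch of your model with a Flatten layer added right after the embedding (keeping your layer sizes; the switch to categorical_crossentropy is my assumption, based on your 8-class softmax output and one-hot labels):

from keras.models import Sequential
from keras.layers import Embedding, Flatten, Dropout, Dense
from keras.optimizers import SGD

model = Sequential()
model.add(Embedding(max_features, embedding_size, input_length=maxlen))
model.add(Flatten())                   # (None, 100 * 128) = (None, 12800)
model.add(Dropout(0.2))
model.add(Dense(5000, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(600, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(units=y_train.shape[1], activation='softmax'))  # (None, 8)

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
# assuming one-hot labels: categorical_crossentropy fits a multi-class softmax
model.compile(loss='categorical_crossentropy', optimizer=sgd)

With Flatten placed before the first Dense layer, that layer now sees a 12800-dimensional vector and alone contributes roughly 64 million parameters, which is exactly the size issue the next paragraph warns about.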

However, note that you may end up with a very big model (i.e. too many parameters) that would be highly prone to overfitting. Either reduce the number of units in the Dense layers, or alternatively use Conv1D and MaxPooling1D layers, or even RNN layers, to process the embeddings and reduce the dimensionality of the resulting tensors (which will probably also improve the accuracy of the model); a rough sketch follows.
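If you go the convolutional route instead, a rough sketch might look like the following (the filter count and kernel size are placeholder choices, and I use GlobalMaxPooling1D to collapse the time axis entirely; MaxPooling1D followed by Flatten would work as well):

from keras.models import Sequential
from keras.layers import Embedding, Dropout, Conv1D, GlobalMaxPooling1D, Dense

model = Sequential()
model.add(Embedding(max_features, embedding_size, input_length=maxlen))
model.add(Dropout(0.2))
model.add(Conv1D(128, 5, activation='relu'))  # (None, 96, 128)
model.add(GlobalMaxPooling1D())               # (None, 128): the time axis is gone
model.add(Dense(600, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(units=y_train.shape[1], activation='softmax'))  # (None, 8)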
