How to connect an embedding layer with input shape (3, 50) to an LSTM?

An array of shape (3, 50) is fed to the 'item' input (layer_i_inp) and then into layer_i_emb: three time steps, each storing an array of 50 product identifiers.
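
For illustration (made-up values, assuming around 20 thousand distinct products), one training sample could look like this:

import numpy as np

# 3 past orders (time steps), each a basket of 50 product ids
sample = np.random.randint(0, 20000, size=(3, 50))
print(sample.shape)  # (3, 50)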

I also tried connecting them before the Reshape, and that didn't work either. Embedding adds a dimension, and LSTM does not accept the extra dimension. It's daunting that you apparently have to drop down to raw tf and manipulate the tensors by hand.

layer_i_inp = Input(shape = (3,50), name = 'item')
layer_i_emb = Embedding(output_dim = EMBEDDING_DIM*2,
                        input_dim = us_it_count[0]+1,
                        input_length = (3,50),
                        name = 'item_embedding')(layer_i_inp) 

layer_i_emb = Reshape([3,50, EMBEDDING_DIM*2])(layer_i_emb)

layer_i_emb = LSTM(MAX_FEATURES, dropout = 0.4, recurrent_dropout = 0.4, return_sequences = True)(layer_i_emb)
layer_i_emb = LSTM(MAX_FEATURES, dropout = 0.4, recurrent_dropout = 0.4, return_sequences = True)(layer_i_emb)
layer_i_emb = LSTM(MAX_FEATURES, dropout = 0.4, recurrent_dropout = 0.4)(layer_i_emb)

layer_i_emb = Flatten()(layer_i_emb)

1 Answer


The problem is that the Embedding layer is outputting a 3D tensor (excluding the batch dimension), but an LSTM layer needs a 2D input (excluding the batch dimension). Here are a couple of options you can try:
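
A quick sketch of the mismatch (random data, just to check shapes):

import tensorflow as tf

x = tf.random.uniform((8, 3, 50), maxval=120, dtype=tf.int32)  # (batch, orders, ids)
emb = tf.keras.layers.Embedding(121, 5)(x)
print(emb.shape)  # (8, 3, 50, 5) -> 3 dims excluding batch; LSTM wants 2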

Option 1

import tensorflow as tf

samples = 100
orders = 3
product_ids_per_order = 50
max_product_id = 120

data = tf.random.uniform((samples, orders, product_ids_per_order), maxval=max_product_id, dtype=tf.int32)
Y = tf.random.uniform((samples,), maxval=2, dtype=tf.int32)

EMBEDDING_DIM = 5

item_input = tf.keras.layers.Input(shape = (orders, product_ids_per_order), name = 'item')
embedding_layer = tf.keras.layers.Embedding(
                        max_product_id + 1,
                        output_dim = EMBEDDING_DIM,
                        input_length = product_ids_per_order,
                        name = 'item_embedding')

# Map each time step with 50 product ids to an embedding vector of size 5
outputs = []
for i in range(orders):
  tensor = embedding_layer(item_input[:, i, :])
  layer_i_emb = tf.keras.layers.LSTM(32, dropout = 0.4, recurrent_dropout = 0.4, return_sequences = True)(tensor)
  layer_i_emb = tf.keras.layers.LSTM(32, dropout = 0.4, recurrent_dropout = 0.4, return_sequences = True)(layer_i_emb)
  layer_i_emb = tf.keras.layers.LSTM(32, dropout = 0.4, recurrent_dropout = 0.4)(layer_i_emb)
  outputs.append(layer_i_emb)
  
output = tf.keras.layers.Concatenate(axis=1)(outputs)  # (batch, 3 * 32)
output = tf.keras.layers.Dense(1, activation='sigmoid')(output)
model = tf.keras.Model(item_input, output)
model.compile(optimizer='adam', loss=tf.keras.losses.BinaryCrossentropy())
model.fit(data, Y)
4/4 [==============================] - 15s 1s/step - loss: 0.6926
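
Here the shared embedding layer maps each order's 50 ids to 50 vectors, the LSTM stack summarizes each order separately (sequence length 50), and the three 32-dim summaries are concatenated before the final Dense. A quick sanity check on the fitted model above:

preds = model.predict(data[:2])
print(preds.shape)  # (2, 1)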

Option 2

import tensorflow as tf

samples = 100
orders = 3
product_ids_per_order = 50
max_product_id = 120

EMBEDDING_DIM = 5

item_input = tf.keras.layers.Input(shape = (orders, product_ids_per_order), name = 'item')
embedding_layer = tf.keras.layers.Embedding(
                        max_product_id + 1,
                        output_dim = EMBEDDING_DIM,
                        input_length = product_ids_per_order,
                        name = 'item_embedding')

# Map each time step with 50 product ids to an embedding vector of size 5
inputs = []
for i in range(orders):
  tensor = embedding_layer(item_input[:, i, :])
  tensor = tf.keras.layers.Reshape([product_ids_per_order*EMBEDDING_DIM])(tensor)
  tensor = tf.expand_dims(tensor, axis=1)
  inputs.append(tensor)

embedding_inputs = tf.keras.layers.Concatenate(axis=1)(inputs)
layer_i_emb = tf.keras.layers.LSTM(32, dropout = 0.4, recurrent_dropout = 0.4, return_sequences = True)(embedding_inputs)
layer_i_emb = tf.keras.layers.LSTM(32, dropout = 0.4, recurrent_dropout = 0.4, return_sequences = True)(layer_i_emb)
layer_i_emb = tf.keras.layers.LSTM(32, dropout = 0.4, recurrent_dropout = 0.4)(layer_i_emb)
output = tf.keras.layers.Dense(1, activation='sigmoid')(layer_i_emb)
model = tf.keras.Model(item_input, output)
print(model.summary())
Model: "model_11"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 item (InputLayer)              [(None, 3, 50)]      0           []                               
                                                                                                  
 tf.__operators__.getitem_41 (S  (None, 50)          0           ['item[0][0]']                   
 licingOpLambda)                                                                                  
                                                                                                  
 tf.__operators__.getitem_42 (S  (None, 50)          0           ['item[0][0]']                   
 licingOpLambda)                                                                                  
                                                                                                  
 tf.__operators__.getitem_43 (S  (None, 50)          0           ['item[0][0]']                   
 licingOpLambda)                                                                                  
                                                                                                  
 item_embedding (Embedding)     (None, 50, 5)        605         ['tf.__operators__.getitem_41[0][
                                                                 0]',                             
                                                                  'tf.__operators__.getitem_42[0][
                                                                 0]',                             
                                                                  'tf.__operators__.getitem_43[0][
                                                                 0]']                             
                                                                                                  
 reshape_10 (Reshape)           (None, 250)          0           ['item_embedding[0][0]']         
                                                                                                  
 reshape_11 (Reshape)           (None, 250)          0           ['item_embedding[1][0]']         
                                                                                                  
 reshape_12 (Reshape)           (None, 250)          0           ['item_embedding[2][0]']         
                                                                                                  
 tf.expand_dims_9 (TFOpLambda)  (None, 1, 250)       0           ['reshape_10[0][0]']             
                                                                                                  
 tf.expand_dims_10 (TFOpLambda)  (None, 1, 250)      0           ['reshape_11[0][0]']             
                                                                                                  
 tf.expand_dims_11 (TFOpLambda)  (None, 1, 250)      0           ['reshape_12[0][0]']             
                                                                                                  
 concatenate_13 (Concatenate)   (None, 3, 250)       0           ['tf.expand_dims_9[0][0]',       
                                                                  'tf.expand_dims_10[0][0]',      
                                                                  'tf.expand_dims_11[0][0]']      
                                                                                                  
 lstm_34 (LSTM)                 (None, 3, 32)        36224       ['concatenate_13[0][0]']         
                                                                                                  
 lstm_35 (LSTM)                 (None, 3, 32)        8320        ['lstm_34[0][0]']                
                                                                                                  
 lstm_36 (LSTM)                 (None, 32)           8320        ['lstm_35[0][0]']                
                                                                                                  
 dense_11 (Dense)               (None, 1)            33          ['lstm_36[0][0]']                
                                                                                                  
==================================================================================================
Total params: 53,502
Trainable params: 53,502
Non-trainable params: 0
__________________________________________________________________________________________________
None
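
For completeness, the same idea as Option 2 can be written without the Python loop, since Embedding also accepts higher-dimensional integer inputs; a minimal sketch, using the same toy dimensions as above:

import tensorflow as tf

orders, ids_per_order, max_product_id, EMBEDDING_DIM = 3, 50, 120, 5

item_input = tf.keras.layers.Input(shape=(orders, ids_per_order), name='item')
# Embedding a 3D input yields (batch, 3, 50, 5)
x = tf.keras.layers.Embedding(max_product_id + 1, EMBEDDING_DIM)(item_input)
# Merge the last two axes into one flat vector per order: (batch, 3, 250)
x = tf.keras.layers.Reshape((orders, ids_per_order * EMBEDDING_DIM))(x)
x = tf.keras.layers.LSTM(32, dropout=0.4, recurrent_dropout=0.4)(x)
output = tf.keras.layers.Dense(1, activation='sigmoid')(x)
model = tf.keras.Model(item_input, output)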
  • Some of this is already in the code in the colab, and it's not clear what to do with it. The tensor has this shape because each of the three time steps holds a basket of goods (50 product ids), and the ids need to go through an embedding since there are 20 thousand of them. That is, I feed 50 product ids per time step, and the embedding is there so the network does not treat products 3 and 4 as more similar than products 3, 1 and 50 just because of their id values. – Alihan Urumov Nov 23 '21 at 13:57
  • I do not understand exactly what you want to do... What kind of input do you want to feed to your first LSTM layer? And what does your data look like? Can you add some examples to your question? – AloneTogether Nov 23 '21 at 14:11
  • Yes, you understood the inputs correctly: the input is [(None, 3, 50)], i.e. three orders, each containing 50 products. How do I explain this to the network? If I feed the raw ids, the network assumes 4 < 5 and 5 < 6. Embeddings exist so that the ids are treated as objects rather than numbers, i.e. each basket holds embeddings of ids like [1, 120, 98, ..., 47]. – Alihan Urumov Nov 24 '21 at 10:42
  • And we have three such arrays (three orders, i.e. three time steps, so the network understands what the user bought before). So I feed the LSTM layer, which should understand that first came 50 embeddings, then 50 more, then 50 more. How do I implement that, so the network treats these as product ids rather than numerical values, and the LSTM layer receives the data for each order separately (3 orders in one pass)? – Alihan Urumov Nov 24 '21 at 10:42
  • Updated answer. – AloneTogether Nov 24 '21 at 12:28