
I'm currently working on relation classification. I have trained a word embedding matrix and would like to use it in a TensorFlow model. However, some words in my dataset are unknown. I load the embeddings in the same way as proposed in "Using a pre-trained word embedding (word2vec or Glove) in TensorFlow".

I would like to know whether TensorFlow can automatically use a null (all-zero) vector to represent unknown words. Currently, I add an extra column (a null vector) to the word embedding matrix for such words, but I would like to update the word embedding matrix during training without modifying the column for unknown words.

Moreover, I also use this column to pad my sentences.

Is there a way to do this automatically in TensorFlow?

XogoX
    I guess the best thing is to create one null vector for the padding token and one for the unknown words, with trainable=False for the padding vector and trainable=True for the unknown-word vector, and finally use tf.concat(0, [word_embedding, unknown_vector, padding_vector]) and adjust the indices. – XogoX Oct 30 '16 at 15:46
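The idea in the comment above can be sketched roughly as follows. This is a minimal illustration, not a tested recipe for the original model: it assumes TensorFlow 2.x with eager execution, where the call is written tf.concat(values, axis) rather than the older tf.concat(0, values) form. The sizes, the random "pre-trained" matrix, the dummy loss, the hand-written SGD step, and the names UNK_ID, PAD_ID and lookup are all hypothetical. Words are stored as rows here, so the extra vectors are appended along axis 0.

```python
import numpy as np
import tensorflow as tf

# Hypothetical sizes and values, for illustration only.
vocab_size, embed_dim = 5, 4
pretrained = np.random.RandomState(0).normal(
    size=(vocab_size, embed_dim)).astype("float32")

# Trainable block holding the pre-trained vectors of the known words.
word_embedding = tf.Variable(pretrained, trainable=True)
# Trainable vector for unknown words, initialised to zero.
unknown_vector = tf.Variable(tf.zeros([1, embed_dim]), trainable=True)
# Frozen zero vector used for padding.
padding_vector = tf.Variable(tf.zeros([1, embed_dim]), trainable=False)

UNK_ID = vocab_size       # index of the unknown-word row after concatenation
PAD_ID = vocab_size + 1   # index of the padding row after concatenation


def lookup(token_ids):
    # Rebuild the full table on every call so that gradients flow back
    # to the individual variables.
    table = tf.concat([word_embedding, unknown_vector, padding_vector], axis=0)
    return tf.nn.embedding_lookup(table, token_ids)


# One padded sentence: two known words, one unknown word, two padding tokens.
token_ids = tf.constant([[0, 3, UNK_ID, PAD_ID, PAD_ID]])
trainable = [word_embedding, unknown_vector]  # padding_vector is left out

with tf.GradientTape() as tape:
    vectors = lookup(token_ids)
    loss = tf.reduce_sum((vectors - 1.0) ** 2)  # dummy loss

grads = tape.gradient(loss, trainable)

# Plain SGD step written by hand; the sparse gradients are densified first.
for var, grad in zip(trainable, grads):
    var.assign_sub(0.1 * tf.convert_to_tensor(grad))
```

After the update step, the padding vector is still all zeros, while the unknown-word vector and the rows of the known words that appeared in the batch have changed. If the unknown-word vector should also stay fixed, as asked in the question, it could be created with trainable=False and left out of the list of updated variables in the same way.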

0 Answers