
I'm using TensorFlow to train a word2vec skip-gram model. The computation graph is built in the code below:

# training data
self.dataset = tf.data.experimental.make_csv_dataset(file_name, batch_size=self.batch_size, column_names=['input', 'output'], header=False, num_epochs=self.epochs)
self.datum = self.dataset.make_one_shot_iterator().get_next()
self.inputs, self.labels = self.datum['input'], self.datum['output']

# embedding layer
self.embedding_g = tf.Variable(tf.random_uniform((self.n_vocab, self.n_embedding), -1, 1))
self.embed = tf.nn.embedding_lookup(self.embedding_g, self.inputs)

# softmax layer
self.softmax_w_g = tf.Variable(tf.truncated_normal((self.n_context, self.n_embedding)))
self.softmax_b_g = tf.Variable(tf.zeros(self.n_context))

# Calculate the loss using negative sampling
self.labels = tf.reshape(self.labels, [-1, 1])
self.loss = tf.nn.sampled_softmax_loss(
                weights=self.softmax_w_g,
                biases=self.softmax_b_g,
                labels=self.labels,
                inputs=self.embed,
                num_sampled=self.n_sampled,
                num_classes=self.n_context)

self.cost = tf.reduce_mean(self.loss)
self.optimizer = tf.train.AdamOptimizer().minimize(self.cost)
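
For reference, the per-epoch loss values in the plot below are collected with a training loop roughly like this sketch; self.sess and self.batches_per_epoch belong to other parts of my class and are only shown here for illustration:

# minimal sketch of the training loop (TF 1.x graph mode)
self.sess = tf.Session()
self.sess.run(tf.global_variables_initializer())

epoch_losses = []
for epoch in range(self.epochs):
    total_loss = 0.0
    for _ in range(self.batches_per_epoch):
        # each run advances the one-shot iterator by one batch
        _, batch_cost = self.sess.run([self.optimizer, self.cost])
        total_loss += batch_cost
    epoch_losses.append(total_loss / self.batches_per_epoch)
    print('epoch {}: mean loss {:.4f}'.format(epoch + 1, epoch_losses[-1]))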

But after about 25 epochs, the loss values begin to increase. Is there any reason for this?

[Plot: loss in each epoch]

Thuc Hung
  • Did you find the issue? I'm having a similar pattern, wondering if this is just normal due to the nature of sampling in the loss function. – gergf Jan 27 '23 at 09:35

0 Answers