I am trying to define a log loss function for a multi-class classification problem as:
    self.loss = tf.losses.log_loss(
        labels=self.sampled_actions,
        predictions= [self.probability[i][self.sampled_actions[i]] for i in range(tf.shape(self.sampled_actions)[0])],
        weights=self.discounted_rewards)
Here, self.sampled_actions is a 1D tensor of values 0/1/2 (e.g. [0,1,2,1,0,2]) that indicates which action is the ground truth for each sample. self.probability is defined as:
    h = tf.layers.dense(
        self.observations,
        units=hidden_layer_size,
        activation=tf.nn.relu,
        kernel_initializer=tf.contrib.layers.xavier_initializer())
    self.probability = tf.layers.dense(
        h,
        units=3,
        activation=tf.sigmoid,
        kernel_initializer=tf.contrib.layers.xavier_initializer())
This gives the probabilities of the three actions (0, 1, 2) for each observation in the input.
However, when I run this program, I get the error:
    Traceback (most recent call last):
      File "spaceinvaders.py", line 68, in <module>
        hidden_layer_size, learning_rate, checkpoints_dir='checkpoints')
      File "/home/elfarouk/Desktop/opengym/policy_network_space_invaders.py", line 49, in __init__
        predictions= [self.probability[i][self.sampled_actions[i]] for i in range(tf.shape(self.sampled_actions)[0])],
    TypeError: range() integer end argument expected, got Tensor.
Is there a way to specify that my prediction in the loss function should be dependent on the sampled_actions?
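
The error arises because range() needs a Python integer, while tf.shape(self.sampled_actions)[0] is a Tensor whose value is only known when the graph runs. One possible way to make the prediction depend on sampled_actions is to do the per-row indexing with tensor ops instead of a Python loop. The sketch below is a minimal illustration, not a verified fix for the full training code: the helper name select_action_probabilities is hypothetical, and it assumes the probabilities have shape [batch, 3] and the actions are an integer tensor of shape [batch].

```python
import tensorflow as tf


def select_action_probabilities(probability, sampled_actions):
    """Return probability[i, sampled_actions[i]] for every row i.

    Hypothetical helper. Assumes `probability` has shape [batch, num_actions]
    and `sampled_actions` is an integer tensor of shape [batch].
    """
    sampled_actions = tf.cast(sampled_actions, tf.int32)
    # The batch size is a Tensor, so build the row indices with tf.range
    # rather than Python's range().
    batch_size = tf.shape(sampled_actions)[0]
    row_indices = tf.range(batch_size)
    # Pair each row index with its sampled action: [[0, a0], [1, a1], ...]
    indices = tf.stack([row_indices, sampled_actions], axis=1)
    # Gather one probability per row.
    return tf.gather_nd(probability, indices)
```

The result could then be passed in place of the list comprehension, e.g. predictions=select_action_probabilities(self.probability, self.sampled_actions). An equivalent formulation multiplies self.probability by tf.one_hot(self.sampled_actions, depth=3) and sums over axis 1. Whether labels=self.sampled_actions is still the right target for tf.losses.log_loss after that change is a separate question that this sketch does not address.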