I'm confused about how to apply cross-entropy loss to my time series model, where the output has shape [batch_size, classes, time_steps] and the target has shape [batch_size, time_steps, classes]. I'm trying to make the model determine the confidence of each of the 16 classes at each time step. With the following approach, I get a large loss and the model doesn't seem to be learning:
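import torch

batch_size = 256
time_steps = 224
classes = 16
# Model output: [batch_size, classes, time_steps]
y_est = torch.randn((batch_size, classes, time_steps))
# Target starts as [batch_size, time_steps, classes], then is reshaped with view()
y_true = torch.randn((batch_size, time_steps, classes)).view(batch_size, classes, -1)
loss = torch.nn.functional.cross_entropy(y_est, y_true)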
Do you think I've made a mistake here?
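One thing that may be worth checking is whether `view` really lines the class axis up the way `cross_entropy` expects. The sketch below is a minimal check, not a definitive fix, and it rests on assumptions that are not in the code above: small sizes for readability, softmax-normalised random targets standing in for per-class probabilities, and PyTorch 1.10 or later, which accepts probability targets.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
batch_size, time_steps, classes = 4, 5, 16  # small hypothetical sizes

# Logits with the class axis at dim 1, as cross_entropy expects
y_est = torch.randn(batch_size, classes, time_steps)

# Assumed soft targets: a probability distribution over classes per time step
y_true = torch.softmax(torch.randn(batch_size, time_steps, classes), dim=-1)

# view() keeps the memory order and only changes the shape
viewed = y_true.view(batch_size, classes, -1)
# permute() swaps the axes, so permuted[b, c, t] == y_true[b, t, c]
permuted = y_true.permute(0, 2, 1).contiguous()

# Same shape, but the values are arranged differently
same = torch.equal(viewed, permuted)

# Loss with probability targets shaped like the logits
loss = F.cross_entropy(y_est, permuted)

# Alternative if each time step has one correct class: index targets
# of shape [batch_size, time_steps]
idx_loss = F.cross_entropy(y_est, y_true.argmax(dim=-1))
```

If `same` comes out False, the `view` call is mixing time steps and classes together rather than transposing them, which could explain a loss that never goes down.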