I am training an RNN model that takes a demand matrix as input and outputs bus routes. The routes are then fed into a simulation, which returns a metric of how well they performed. Since there is no target value for the bus routes, how do I backpropagate the simulation result?
To illustrate the question with simple Python code:
"""
The model is an RNN that takes a (400, 24, 24) matrix as input.
Dimension 0 is time, dimension 1 is the departure bus stop, and dimension 2 is the arrival bus stop. Each value is the number of passengers who departed from a given bus stop, heading for a given arrival bus stop, at a given time.
The output is a (64, 24) matrix, which is reshaped to (8, 8, 24).
Dimension 0 is the sequence index, dimension 1 is the bus index (there are 8 buses), and dimension 2 is the softmaxed classifier output over the 24 bus stops. From the output, a sequence of 8 bus stops is picked for each bus.
These sequences are then used to generate the bus paths, which are evaluated in a simulation.
"""
model.train()
optimizer.zero_grad()
out = model(demand)  # out is (64, 24); demand is (400, 24, 24)
demand, performance = simulation(out)  # assume performance is a float
# here out has a grad_fn but performance does not
loss = SOME_NUMBER - performance
loss = torch.tensor(loss, dtype=torch.float32)  # torch.FloatTensor(loss) raises a TypeError for a plain float
# here I need to backpropagate, and this is the confusing part
# simply calling loss.backward() does nothing useful because loss has no grad_fn
# out.backward() requires (64, 24) gradients computed somehow from 1 metric, and causes complete divergence within a few steps
optimizer.step()
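
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the setup. `RoutePlanner` and `toy_simulation` are hypothetical stand-ins made up for illustration (the real model and simulation are more involved), the hidden size and the toy metric are arbitrary, and `100.0` stands in for SOME_NUMBER. It only shows the shapes described above and the point where the autograd graph gets cut.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

N_STEPS, N_STOPS, N_BUSES, ROUTE_LEN = 400, 24, 8, 8


class RoutePlanner(nn.Module):
    """Hypothetical stand-in for the real model: a GRU over time plus a linear head."""

    def __init__(self, hidden=32):
        super().__init__()
        self.rnn = nn.GRU(N_STOPS * N_STOPS, hidden)
        self.head = nn.Linear(hidden, ROUTE_LEN * N_BUSES * N_STOPS)

    def forward(self, demand):
        # (400, 24, 24) -> (400, 1, 576), i.e. (seq_len, batch, features)
        x = demand.reshape(N_STEPS, 1, -1)
        _, h = self.rnn(x)  # final hidden state, shape (1, 1, hidden)
        logits = self.head(h.reshape(-1)).reshape(ROUTE_LEN * N_BUSES, N_STOPS)
        return torch.softmax(logits, dim=-1)  # (64, 24)


def toy_simulation(out):
    """Hypothetical stand-in for the simulation: returns routes and a plain float."""
    # (64, 24) -> (8, 8, 24), then pick one stop per (sequence index, bus)
    routes = out.reshape(ROUTE_LEN, N_BUSES, N_STOPS).argmax(dim=-1)
    # Arbitrary toy metric: how many distinct stops are served at all
    performance = float(routes.unique().numel())
    return routes, performance


model = RoutePlanner()
demand = torch.randint(0, 5, (N_STEPS, N_STOPS, N_STOPS)).float()

out = model(demand)
routes, performance = toy_simulation(out)
loss = torch.tensor(100.0 - performance)  # 100.0 stands in for SOME_NUMBER

print(out.shape, out.grad_fn)            # out is still attached to the graph
print(routes.shape)                      # one stop index per (sequence index, bus)
print(loss.requires_grad, loss.grad_fn)  # loss is detached from the graph
```

Running it shows that `out` carries a `grad_fn` while `loss` does not: both the `argmax` that picks the stops and the conversion of the metric to a Python float leave the autograd graph, so nothing connects `loss` back to the model parameters.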