I want to develop a GRU-based model for variable-length input data, so I think I should use a loop in forward() and break out of it once all of the sequences have been processed. Will this affect the PyTorch computation graph? Does it disturb the gradients and the learning?
For example:
def forward(self, x):
    state = self.initial_state
    out = []
    for i in range(x.size(0)):
        state = self.rnn(x[i], state)  # one GRU step per time index
        out.append(state)
        if condition:  # some data-dependent stopping test
            break
    return out, state
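To make this concrete, here is a fuller self-contained sketch of what I have in mind. I'm assuming nn.GRUCell here; the class name EarlyStopGRU, the lengths argument, and the length-based stopping test are placeholders I made up for illustration, and my real stopping condition may be different:

import torch
import torch.nn as nn

class EarlyStopGRU(nn.Module):
    # A GRUCell unrolled in a Python loop with a data-dependent break.
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.rnn = nn.GRUCell(input_size, hidden_size)
        # learnable initial hidden state, repeated over the batch in forward()
        self.initial_state = nn.Parameter(torch.zeros(1, hidden_size))

    def forward(self, x, lengths):
        # x: (seq_len, batch, input_size); lengths: (batch,) true sequence lengths
        state = self.initial_state.repeat(x.size(1), 1)
        out = []
        for i in range(x.size(0)):
            state = self.rnn(x[i], state)  # one GRU step per time index
            out.append(state)
            if i + 1 >= lengths.max():     # stop once the longest sequence ends
                break
        return torch.stack(out), state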
I searched but couldn't find any related information, and I don't know whether this approach is correct or not.
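As a sanity check, I tried verifying that gradients still flow through the truncated loop, using the EarlyStopGRU sketch above:

model = EarlyStopGRU(input_size=8, hidden_size=16)
x = torch.randn(20, 4, 8)                    # (seq_len, batch, features)
lengths = torch.tensor([5, 9, 3, 7])         # per-sequence lengths
out, last = model(x, lengths)
last.sum().backward()                        # backprop through the 9 executed steps
print(out.shape)                             # torch.Size([9, 4, 16])
print(model.rnn.weight_ih.grad is not None)  # True, so gradients reach the cell

If I understand dynamic graphs correctly, the break just means the graph only records the iterations that actually ran, so backward() only traverses those steps, but I'm not sure whether that reasoning is right.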