My goal is to build a model that predicts the next character. I have built a model, and here is my training loop:
import torch.nn as nn
import torch.optim as optim

model = Model(input_size=30, hidden_size=256, output_size=len(dataset.vocab))
EPOCH = 10
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
init_states = None
for epoch in range(EPOCH):
    loss_overall = 0.0
    for i, (inputs, targets) in enumerate(dataloader):
        optimizer.zero_grad()
        pred = model.forward(inputs)
        loss = criterion(pred, targets)
        loss.backward()
        optimizer.step()
        loss_overall += loss.item()  # accumulate the loss over the epoch
As you can see, I return only the model's predictions, not cell_state and hidden_state.
So the alternative is: pred, cell_state, hidden_state = model.forward(inputs)
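To make that alternative concrete, here is a minimal sketch of what returning the states could look like. It assumes the model wraps nn.LSTM; the CharModel class, the vocabulary size of 50, and the random dummy tensors are all hypothetical stand-ins for illustration, not the actual Model or dataset from the loop above.

```python
import torch
import torch.nn as nn

class CharModel(nn.Module):
    """Hypothetical stand-in for Model; assumes it wraps nn.LSTM."""
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x, states=None):
        # states=None makes nn.LSTM start from zero hidden and cell states
        out, (hidden_state, cell_state) = self.lstm(x, states)
        pred = self.fc(out[:, -1, :])  # logits for the character after the last step
        return pred, cell_state, hidden_state

torch.manual_seed(0)
model = CharModel(input_size=30, hidden_size=256, output_size=50)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Dummy data: 3 consecutive chunks, each of shape (batch=4, seq_len=10, features=30)
chunks = [(torch.randn(4, 10, 30), torch.randint(0, 50, (4,))) for _ in range(3)]

states = None
for inputs, targets in chunks:
    optimizer.zero_grad()
    pred, cell_state, hidden_state = model(inputs, states)
    loss = criterion(pred, targets)
    loss.backward()
    optimizer.step()
    # Detach so the next backward pass does not reach into this chunk's graph
    states = (hidden_state.detach(), cell_state.detach())
```

In this sketch the returned states are only useful because they are passed back in on the next call. That presupposes that consecutive batches are consecutive pieces of the same text (no shuffling, constant batch size); if every batch is an independent sample, there is nothing to carry over.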
My question is: should I do this for a character-prediction task? Why or why not? And more generally, when should I return the hidden and cell states?