I am learning seq2seq models using https://github.com/keon/seq2seq. I have successfully run the original project, and now I want to train a translation model on my own data.
With my data, the following code works for the first batch, but an out-of-memory error
is raised at loss.backward() for the second batch, even though the second batch is smaller than the first one.
src, trg = src.cuda().T, trg.cuda().T   # move to GPU and transpose to (seq_len, batch)
optimizer.zero_grad()
output = model(src, trg)                # decoder outputs, one distribution per target step
# skip the first target token and flatten before computing the loss
loss = F.nll_loss(output[1:].view(-1, vocab_size),
                  trg[1:].contiguous().view(-1),
                  ignore_index=pad)
loss.backward(retain_graph=True)
torch.nn.utils.clip_grad_norm_(model.parameters(), grad_clip)
optimizer.step()
total_loss += loss.data.item()
torch.cuda.empty_cache()
With batch size 16, the code above runs fine for the first batch, and the out-of-memory
error is raised at loss.backward() for the second batch.
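To make "first batch / second batch" concrete: the snippet above is the body of a per-batch training loop, roughly like the sketch below (train_iter and the batch unpacking are my own placeholders; the loop body is the code shown above).

import torch
import torch.nn.functional as F

model.train()
total_loss = 0
for b, batch in enumerate(train_iter):     # train_iter is a placeholder for my batch iterator
    src, trg = batch                       # each batch is a source/target tensor pair
    src, trg = src.cuda().T, trg.cuda().T
    optimizer.zero_grad()
    output = model(src, trg)
    loss = F.nll_loss(output[1:].view(-1, vocab_size),
                      trg[1:].contiguous().view(-1),
                      ignore_index=pad)
    loss.backward(retain_graph=True)       # the out-of-memory error is raised here when b == 1
    torch.nn.utils.clip_grad_norm_(model.parameters(), grad_clip)
    optimizer.step()
    total_loss += loss.data.item()
    torch.cuda.empty_cache()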
The GPU memory used during the first batch, per statement, is:

2233 MB  src, trg = src.cuda().T, trg.cuda().T
2331 MB  output = model(src, trg)
4772 MB  loss.backward(retain_graph=True)
6471 MB  optimizer.step()
5312 MB  torch.cuda.empty_cache()
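For reference, here is a minimal sketch of one way to log such per-statement numbers; the log_mem helper is my own, not from the repo, and torch.cuda.memory_allocated() only counts tensor allocations, so it can read lower than what nvidia-smi reports.

import torch

def log_mem(tag):
    # print the GPU memory currently allocated by tensors, in MB (hypothetical helper)
    print(f"{torch.cuda.memory_allocated() / 1024**2:7.0f} MB  {tag}")

# called after each statement of the loop body, e.g.:
output = model(src, trg)
log_mem("output = model(src, trg)")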
Even with batch size 1, the first batch runs fine and the out-of-memory error is still
raised at loss.backward() for the second batch.
Any suggestions are appreciated!