The following code ran without error on a PyTorch nightly build (1.5.0.dev20200206); however, once I installed the stable 1.5 build, the RNN forward method defined below started throwing an error:
def forward(self, sequence):
    print('Sequence shape:', sequence.shape)
    # clone() first so view() operates on a fresh copy of the storage
    sequence = sequence.clone().view(len(sequence), 1, -1)
    print("flattened shape: ", sequence.shape)
    lstm_out, hidden = self.lstm(sequence, self.hidden)
    print(lstm_out.shape)
    # project the last LSTM output into the output space
    out_space = self.hidden2out(lstm_out[:, -1])
    # carry the hidden state over to the next forward call
    self.hidden = hidden
    print("hiddens")
    print(hidden[0].shape)
    print(hidden[1].shape)
    print(" out_space: ", out_space.shape)
    out_scores = torch.sigmoid(out_space)
    print("out_scores: ", out_scores.shape)
    out = out_scores.squeeze()
    print(out.shape)
    return out
I added the clone() call to prevent in-place memory modification from view(), and made the variable assignments explicitly not in place.
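For context, here is a minimal standalone sketch (not from my model) of the in-place behavior I was trying to rule out: view() returns a tensor that shares storage with its source, while clone() copies the storage first.

import torch

x = torch.zeros(2, 3)
v = x.view(-1)          # v shares storage with x
v[0] = 1.0              # writing through v also modifies x in place
print(x[0, 0])          # tensor(1.)

c = x.clone().view(-1)  # clone() copies the storage first
c[1] = 5.0              # x is unchanged
print(x[0, 1])          # tensor(0.)

However, even with the clone(), I still get the following error: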
Sequence shape: torch.Size([200, 19, 62])
flattened shape: torch.Size([200, 1, 1178])
torch.Size([200, 1, 8])
hiddens
torch.Size([1, 1, 8])
torch.Size([1, 1, 8])
out_space: torch.Size([200, 1])
out_scores: torch.Size([200, 1])
torch.Size([200])
Warning: Error detected in AddmmBackward. Traceback of forward call that caused the error:
  File "main.py", line 240, in <module>
    main_loop(args)
  File "main.py", line 115, in main_loop
    train.run(args)
  File "/data/learnedbloomfilter/python/classifier/train.py", line 519, in run
    args.log_every,
  File "/data/learnedbloomfilter/python/classifier/train.py", line 88, in train
    predictions = model(features)
  File "/data/miniconda3/envs/lbf/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/learnedbloomfilter/python/classifier/embedding_lstm.py", line 65, in forward
    sequence, self.hidden
  File "/data/miniconda3/envs/lbf/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/miniconda3/envs/lbf/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 570, in forward
    self.dropout, self.training, self.bidirectional, self.batch_first)
(print_stack at /opt/conda/conda-bld/pytorch_1587428190859/work/torch/csrc/autograd/python_anomaly_mode.cpp:60)
Traceback (most recent call last):
  File "main.py", line 240, in <module>
    main_loop(args)
  File "main.py", line 115, in main_loop
    train.run(args)
  File "/data/learnedbloomfilter/python/classifier/train.py", line 519, in run
    args.log_every,
  File "/data/learnedbloomfilter/python/classifier/train.py", line 97, in train
    loss.backward(retain_graph=True)
  File "/data/miniconda3/envs/lbf/lib/python3.7/site-packages/torch/tensor.py", line 198, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/data/miniconda3/envs/lbf/lib/python3.7/site-packages/torch/autograd/__init__.py", line 100, in backward
    allow_unreachable=True) # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [8, 32]], which is output 0 of TBackward, is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
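For reference, here is the training step as I reconstruct it from the traceback above; only predictions = model(features) (train.py line 88) and loss.backward(retain_graph=True) (train.py line 97) are verbatim, and the loss and optimizer lines are my paraphrase of the surrounding code.

for features, labels in batches:           # hypothetical batch iteration
    predictions = model(features)          # train.py line 88 (verbatim)
    loss = criterion(predictions, labels)  # paraphrase: some loss computation
    loss.backward(retain_graph=True)       # train.py line 97 (verbatim)
    optimizer.step()                       # paraphrase: in-place weight update
    optimizer.zero_grad()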
I have isolated the error to forward(), but I can't find the intermediate tensor [torch.FloatTensor [8, 32]] that seems to be causing the problem: none of the tensor shapes in my own forward method match, so it must come from inside the LSTM's forward(). I am running on CPU only, not CUDA.
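My best guess (unconfirmed) is that [8, 32] is the transpose of the LSTM's weight_hh_l0: with hidden_size=8, nn.LSTM stores weight_hh_l0 with shape (4*hidden_size, hidden_size) = (32, 8), and the TBackward in the error message points at a transposed tensor. A quick way to check the candidate shapes (assuming the model instance is named model):

# Print every LSTM parameter shape to see which one transposes to [8, 32]
for name, param in model.lstm.named_parameters():
    print(name, tuple(param.shape))
# With hidden_size=8 this should show weight_hh_l0 as (32, 8),
# whose transpose (8, 32) matches the tensor named in the RuntimeError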
For the rest of the RNN code, see this gist: https://gist.github.com/yaatehr/aac21cae05b24101f2369c97cfecb47b
Thanks!