
I'm implementing some RL in PyTorch and had to write my own mse_loss function (which I found on Stackoverflow ;) ). The loss function is:

import torch

def mse_loss(input_, target_):
    return torch.sum(
        (input_ - target_) * (input_ - target_)) / input_.data.nelement()
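For what it's worth, on well-behaved inputs this custom function should agree with the built-in MSE loss (a minimal sanity check I put together; it assumes the default `'mean'` reduction of `torch.nn.functional.mse_loss`):

```python
import torch
import torch.nn.functional as F

def mse_loss(input_, target_):
    # sum of squared differences divided by the element count = mean squared error
    return torch.sum((input_ - target_) * (input_ - target_)) / input_.nelement()

a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([1.5, 2.0, 2.0])
print(mse_loss(a, b).item())    # should match the built-in value below
print(F.mse_loss(a, b).item())  # default reduction='mean'
```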

Now, in my training loop, the first input is something like:

tensor([-1.7610e+10]), tensor([-6.5097e+10])

With this input I'll get the error:

Unable to get repr for <class 'torch.Tensor'>

Computing a = (input_ - target_) works fine, while b = a * a (or, equivalently, b = torch.pow(a, 2)) fails with the error mentioned above.

Does anyone know a fix for this?

Thanks a lot!

Update: I just tried using torch.nn.functional.mse_loss, which results in the same error.

bene
  • Isn't repr a builtin function? (https://docs.python.org/3/reference/datamodel.html?highlight=__repr__#object.__repr__) – hello_world Jun 26 '18 at 05:44
  • Seems like.. But why would it try to call this method? The tensor is of type float32 and this error occurs not only in debug mode but also when running "normal".. – bene Jun 26 '18 at 08:49
  • That is very weird PyTorch behavior. Maybe update? – hello_world Jun 27 '18 at 14:42

2 Answers


I had the same error when I used the code below:

criterion = torch.nn.CrossEntropyLoss().cuda()
output=output.cuda()
target=target.cuda()
loss=criterion(output, target)

but I finally found my mistake: output is like tensor([[0.5746, 0.4254]]) and target is like tensor([2]); the index 2 is out of range for output, which has only two classes (valid targets are 0 and 1).

When I don't use the GPU, the error message is:

RuntimeError: Assertion `cur_target >= 0 && cur_target < n_classes' failed.  at /opt/conda/conda-bld/pytorch-nightly_1547458468907/work/aten/src/THNN/generic/ClassNLLCriterion.c:93
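The mismatch can be reproduced with a minimal sketch (hypothetical values of my own; on CPU the out-of-range target raises immediately, which is what makes the bug visible):

```python
import torch

criterion = torch.nn.CrossEntropyLoss()
output = torch.tensor([[0.5746, 0.4254]])  # logits for 2 classes -> valid targets: 0 or 1

try:
    criterion(output, torch.tensor([2]))   # class index 2 is out of range
except (RuntimeError, IndexError) as e:
    print("bad target:", e)

loss = criterion(output, torch.tensor([1]))  # an in-range target works
print(loss.item())
```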
Steve H
  • I had the same error in a different context. The problem was also that I was trying to access a tensor with an index larger than the tensor length. – GR4 Aug 16 '19 at 10:11

Are you using a GPU ?

I had a similar problem (but I was using gather operations), and when I moved my tensors to the CPU I got a proper error message. I fixed the error, switched back to the GPU, and it was fine. Maybe PyTorch has trouble outputting the correct error when it comes from inside the GPU.
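A hypothetical helper along these lines (the name and shape are my own, not from the answer) re-runs the failing op on detached CPU copies to surface a readable error:

```python
import torch

def debug_on_cpu(fn, *tensors):
    # Run fn on detached CPU copies; CPU kernels usually raise
    # readable errors where GPU ones can be deferred or opaque.
    return fn(*[t.detach().cpu() for t in tensors])

# example: a valid gather, run on CPU copies of the inputs
src = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
idx = torch.tensor([[0, 0], [1, 0]])
out = debug_on_cpu(lambda s, i: s.gather(1, i), src, idx)
print(out)
```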

NonoG
  • No, unfortunately I'm running my code on the cpu. Also, there is, right now, no code to "move" it to a cpu - could this be a problem? – bene Jun 27 '18 at 15:44