
I would like to apply math operations dynamically between two loss functions, nn.Modules, or Python objects. It could also be treated as the problem of generating dynamic graphs in PyTorch.

For example, in the snippet below I would like to add two loss functions:

nn.L1Loss() + nn.CosineEmbeddingLoss()

If I do this, it gives me an error:

----> 1 nn.L1Loss() + nn.CosineEmbeddingLoss()
TypeError: unsupported operand type(s) for +: 'L1Loss' and 'CosineEmbeddingLoss'

I also tried creating a wrapper with a forward function and torch operations, as below, but it doesn't work either. Here x and y can be any loss functions, and op can be any math operation such as addition or subtraction.

class Execute_Op(nn.Module):
    def __init__(self):
        super().__init__()
        
    def forward(self, x, y, op):
        if op == 'add':
            return torch.add(x, y)
        elif op == 'subtract':
            return torch.subtract(x, y)

exec_op = Execute_Op()
exec_op(nn.L1Loss(), nn.CosineEmbeddingLoss(), 'add')

It gives an error like the one below:

Execute_Op.forward(self, x, y, op)
      5 def forward(self, x, y, op):
      6     if op == 'add':
----> 7         return torch.add(x, y)
      8     elif op == 'subtract':
      9         return torch.subtract(x, y)

TypeError: add(): argument 'input' (position 1) must be Tensor, not L1Loss

I am aware of the functional APIs and the usual way of passing ground-truth and predicted values to a loss function. But in that case, I cannot combine loss functions dynamically at run time.

I am not sure exactly how to implement this, so any help is appreciated. If there is a Pythonic or PyTorch way to do this, that would be great.

Edit:

  • I would like to call this function/class recursively.
Dan
  • What precisely do you mean by "I cannot combine loss functions dynamically at runtime"? You can certainly return the sum or difference of two losses depending on some graph-independent argument. That won't break the computation graph. – jodag Sep 04 '22 at 14:47
  • @jodag I would like to dynamically pick two loss functions and the math operator applied between them at run time. Please refer to the `Execute_Op` class definition for an example. – Dan Sep 04 '22 at 14:51
  • I think you may be confusing a callable and a tensor. `obj1 = nn.L1Loss()` returns a callable class, i.e. `obj1` is not a tensor; it needs to be called at runtime with additional information passed to it, and it returns a tensor at that time. For example, if `loss1 = obj1(x, y)` then `loss1` is a tensor and can be added to or subtracted from other tensors. Similarly, if `obj2 = nn.CosineEmbeddingLoss()` then I can do `loss2 = obj2(x1, x2, y)`. You could then add or subtract `loss1` and `loss2` at runtime based on `op`. For example `loss = loss1 + loss2 if op == 'sum' else loss1 - loss2`. – jodag Sep 04 '22 at 15:14
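
The distinction in that last comment can be made concrete with a small runnable sketch (the tensor shapes here are made up purely for illustration):

import torch
import torch.nn as nn

obj1 = nn.L1Loss()                 # a callable module, not a tensor
obj2 = nn.CosineEmbeddingLoss()    # also a callable module

x = torch.randn(4, 8)
y = torch.randn(4, 8)
loss1 = obj1(x, y)                 # calling it returns a tensor

x1, x2 = torch.randn(4, 8), torch.randn(4, 8)
t = torch.ones(4)                  # CosineEmbeddingLoss target: values 1 or -1
loss2 = obj2(x1, x2, t)            # also a tensor

op = 'sum'
loss = loss1 + loss2 if op == 'sum' else loss1 - loss2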

1 Answer


The problem here is that loss functions don't have a standard calling convention: different loss functions take different arguments with different semantics. For example, nn.L1Loss() takes two arguments, an input and a target of the same shape. nn.CosineEmbeddingLoss(), on the other hand, takes three arguments: two inputs of the same shape and a target of a different shape.

This means that the calling convention for ExecuteOp would depend on the specific loss functions chosen. For this reason, I don't think it's really a good idea to make it generic with respect to the loss functions.

A simple solution, without the additional level of generality, would be:

class ExecuteOp(nn.Module):
    def __init__(self):
        super().__init__()
        self.l1_loss = nn.L1Loss()
        self.emb_loss = nn.CosineEmbeddingLoss()

    def forward(self, op, input_l1, target_l1, input1_emb, input2_emb, target_emb):
        assert op in ('add', 'subtract')
        # Evaluate each loss to a tensor first, then combine the tensors.
        loss1 = self.l1_loss(input_l1, target_l1)
        loss2 = self.emb_loss(input1_emb, input2_emb, target_emb)
        if op == 'add':
            return loss1 + loss2
        return loss1 - loss2

Then you could use this as follows:

exec_op = ExecuteOp()

...

# at training/val time, assuming inputs and targets are provided by your model and dataloader
loss = exec_op('add', input_l1, target_l1, input1_emb, input2_emb, target_emb)

Consider how you would even call exec_op at runtime if it were general w.r.t. loss functions. What arguments would you provide it, and in what order?
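
Regarding the edit about calling this recursively: once each loss has been evaluated to a tensor, nesting combinations is straightforward. Here is a hypothetical sketch (the `combine` helper and its nested-tuple expression format are my own illustration, not part of PyTorch):

import torch

# An expression is either a loss tensor or a nested tuple
# of the form (op, left_expr, right_expr).
def combine(expr):
    if isinstance(expr, torch.Tensor):
        return expr
    op, left, right = expr
    a, b = combine(left), combine(right)
    if op == 'add':
        return a + b
    elif op == 'subtract':
        return a - b
    raise ValueError(f'unknown op: {op}')

# Usage, assuming loss1, loss2, loss3 are tensors returned by loss modules:
# total = combine(('add', loss1, ('subtract', loss2, loss3)))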

jodag