
Running PyTorch 0.4.1 with Python 3.6, I ran into this problem:
I cannot torch.save my learning rate scheduler, because Python won't pickle a lambda function:

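import torch
from torch.optim.lr_scheduler import LambdaLR

lambda1 = lambda epoch: epoch // 30
scheduler = LambdaLR(optimizer, lr_lambda=lambda1)  # optimizer: an existing torch.optim optimizer
torch.save(scheduler.state_dict(), 'scheduler.pth.tar')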

results in an error:

PicklingError: Can't pickle <function <lambda> at 0x7f7583fe92f0>:
attribute lookup <lambda> on __main__ failed

How can I save my scheduler?


I know that the scheduler can be saved if I use a proper named function instead of a lambda for lambda1, but I need a lambda because I want to control the function at the point where it is defined (for instance, I want to be able to change the fixed 30 in the denominator).
How can I do this and still be able to save the scheduler?

Umang Gupta
Shai
  • You can use dill to save the pickle instead of `torch.save`. `lambda` functions can't be pickled: https://bugs.python.org/issue19272 – Umang Gupta Oct 12 '18 at 03:13
  • Also see https://stackoverflow.com/questions/25348532/can-python-pickle-lambda-functions – Umang Gupta Oct 12 '18 at 03:14
  • @UmangGupta indeed using `dill` can solve the issue, but I'd rather stick to pytorch's save method – Shai Oct 12 '18 at 03:53

2 Answers


If one wishes to stay with the default behavior of torch.save and torch.load, the lambda function can be replaced with a callable class, for example:

class LRPolicy(object):
    def __init__(self, rate=30):
        self.rate = rate

    def __call__(self, epoch):
        return epoch // self.rate

The scheduler is now

scheduler = LambdaLR(optimizer, lr_lambda=LRPolicy(rate=30))

Now the scheduler can be saved with torch.save and loaded with torch.load without changing the pickling module.
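
To see why the class works where a lambda does not, here is a minimal sketch using only the standard pickle module, which torch.save uses by default. The helpers make_policy and can_pickle are made-up names for this illustration and are not part of PyTorch; the nested-function case mirrors the "function that returns a function" idea from the comments.

```python
import pickle


class LRPolicy(object):
    """Picklable replacement for `lambda epoch: epoch // rate`."""

    def __init__(self, rate=30):
        self.rate = rate

    def __call__(self, epoch):
        return epoch // self.rate


def make_policy(rate):
    # Hypothetical factory (not from the answer): returns a nested function.
    def policy(epoch):
        return epoch // rate
    return policy


def can_pickle(obj):
    # Hypothetical helper: True if the standard pickler accepts obj.
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


lambda_ok = can_pickle(lambda epoch: epoch // 30)   # lambda: rejected
closure_ok = can_pickle(make_policy(30))            # nested function: rejected

# The class instance round-trips, and its rate survives as ordinary state.
restored = pickle.loads(pickle.dumps(LRPolicy(rate=10)))
```

The instance pickles because only its class name and its attribute dict (here, rate) need to be stored, so the denominator stays configurable at construction time.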

Shai
  • You could also write a function that returns a function, eh? – Umang Gupta Oct 15 '18 at 22:13
  • @UmangGupta the returned function would then be a `local` of the function creating it - python won't `pickle` it. I tried. – Shai Oct 16 '18 at 05:01
  • This is hands down the correct answer. Can anybody explain why this one works? – Furkan Küçük May 10 '20 at 10:00
  • @FurkanKüçük, because when it gets loaded it can find a definition of that class. It will fail to load if this class has not been imported, say, in another script where you just load the dict, but not the class. – stason Aug 09 '20 at 01:43
  • Correct me if I'm wrong but this is not a good permanent solution. As @stason mentioned, if the file defining the `LRPolicy` class is renamed, moved or deleted, `torch.load` will fail with an `AttributeError`, making the checkpoint unusable. – Janosh Sep 09 '22 at 23:13

You can use dill as the pickle module instead of the default pickle.

import dill
torch.save(scheduler.state_dict(), 'scheduler.pth.tar', pickle_module=dill)

and to load

import dill
state_dict = torch.load('scheduler.pth.tar', pickle_module=dill)  # avoid shadowing the builtin `object`

See the documentation of torch.save for more options.
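
On the difference between the two: the standard pickle module stores a function by reference, that is, only its module and qualified name, and looks that name up again on load. A lambda has no name that can be looked up. dill can serialize the function's code itself, which is why it accepts lambdas. The sketch below uses only the standard library to show the by-reference behaviour; step_policy is a made-up name for illustration.

```python
import pickle
import sys


def step_policy(epoch):
    return epoch // 30


payload = pickle.dumps(step_policy)

# The payload holds the function's name, not its body.
name_in_payload = b"step_policy" in payload

# Loading resolves the name again, so it fails once the definition is gone.
defining_module = sys.modules[step_policy.__module__]
delattr(defining_module, "step_policy")
try:
    pickle.loads(payload)
    load_error = None
except AttributeError as exc:
    load_error = exc
```

The same lookup applies to classes, which is why a checkpoint written with the default pickler can only be loaded where the original definition is still importable.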

Umang Gupta
  • Could you include a sentence on what differentiates `dill` from `pickle` in this case? Also, the docs link seems broken. – Janosh Sep 09 '22 at 23:01