I am training a model using transfer learning based on ResNet-152. Following the PyTorch tutorial, I have no problem saving a trained model and loading it for inference. However, loading the model is slow. I don't know if I did it correctly; here is my code:
To save the trained model as state dict:
torch.save(model.state_dict(), 'model.pkl')
To load it for inference:
model = models.resnet152()
num_ftrs = model.fc.in_features
model.fc = nn.Linear(num_ftrs, len(classes))
st = torch.load('model.pkl', map_location='cuda:0' if torch.cuda.is_available() else 'cpu')
model.load_state_dict(st)
model.eval()
I timed the code and found that the first line, model = models.resnet152(), takes the longest. On CPU, it takes 10 seconds to test one image. So my thinking is that this might not be the proper way to load it?
If I save the entire model instead of the state_dict, like this:
torch.save(model, 'model_entire.pkl')
and test it like this:
model = torch.load('model_entire.pkl')
model.eval()
on the same machine it takes only 5 seconds to test one image.
So my question is: is this the proper way to load a state_dict for inference?