To expand a bit on the previous answers: the PyTorch documentation gives two different guidelines on how to save a model, depending on what you want to do with it when you load it again.
- If you want to load the model for inference (i.e., to run predictions), then the documentation recommends saving just the model's `state_dict`:

```python
torch.save(model.state_dict(), PATH)
```
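To illustrate the inference case end to end, here is a minimal sketch using a toy `nn.Linear` model and a hypothetical file name `model.pt` (the PyTorch docs' recommended pattern is: re-create the model class, load the `state_dict`, then call `eval()`):

```python
import torch
import torch.nn as nn

# Toy model standing in for your real architecture
model = nn.Linear(4, 2)
torch.save(model.state_dict(), "model.pt")

# To load for inference: instantiate the same architecture first,
# then restore the saved weights into it
loaded = nn.Linear(4, 2)
loaded.load_state_dict(torch.load("model.pt"))
loaded.eval()  # put dropout/batch-norm layers into evaluation mode
```

Remember that `load_state_dict` takes a dictionary, not a path, which is why the `torch.load` call is nested inside it.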
- If you want to load the model to resume training, then the documentation recommends saving a checkpoint with a bit more state, so that you can properly resume:

```python
torch.save({
    'epoch': epoch,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'loss': loss,
    ...
}, PATH)
```
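The checkpoint above is restored symmetrically: rebuild the model and optimizer, then load each piece back from the saved dictionary. A minimal sketch, using a toy model, placeholder values for `epoch` and `loss`, and a hypothetical file name `checkpoint.pt`:

```python
import torch
import torch.nn as nn

# Toy model and optimizer standing in for your real training setup
model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

torch.save({
    'epoch': 5,            # placeholder epoch counter
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'loss': 0.25,          # placeholder loss value
}, "checkpoint.pt")

# Later: re-create model and optimizer, then restore their states
model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
checkpoint = torch.load("checkpoint.pt")
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
start_epoch = checkpoint['epoch'] + 1  # resume from the next epoch
model.train()  # back to training mode before resuming
```

Restoring the optimizer state matters for optimizers with internal buffers (momentum, Adam's moment estimates); without it, training effectively restarts the optimizer from scratch.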
In terms of moving those saved models into S3, the `modelstore` open source library can help with that. Under the hood, it calls those same `save()` functions, creates a zip archive of the resulting files, and then stores the models under a structured prefix in an S3 bucket. In practice, using it looks like this:
```python
import os

from modelstore import ModelStore

model_store = ModelStore.from_aws_s3(os.environ["AWS_BUCKET_NAME"])

model, optim = train()  # Your training code

# The upload function takes a domain string to organise and version your models
model_store.pytorch.upload("my-model-domain", model=model, optimizer=optim)
```