I want to deploy a new model to an existing AWS SageMaker endpoint. The model is trained by a separate pipeline and stored as a model.tar.gz in S3, and the SageMaker endpoint config points to that object as its model data URL. The endpoint is provisioned with AWS CDK. Within the training pipeline, I now want to let data scientists optionally upload their newly trained model to the endpoint for testing. I don't want to create a new model or a new endpoint config, and I don't want to change the infrastructure (AWS CDK) code. The problem: SageMaker doesn't reload the model, and I can't figure out how to convince it to.
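For context, the endpoint is wired up roughly like the sketch below, using CDK's low-level Cfn constructs. This is a minimal sketch; the construct IDs, role, image URI, S3 path, and instance type are placeholders, not my actual values:

```
from aws_cdk import Stack, aws_iam as iam, aws_sagemaker as sagemaker
from constructs import Construct


class InferenceStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Placeholder execution role for the SageMaker model.
        role = iam.Role(
            self, "ModelRole",
            assumed_by=iam.ServicePrincipal("sagemaker.amazonaws.com"),
        )

        # model_data_url points at the tarball that the training pipeline overwrites.
        model = sagemaker.CfnModel(
            self, "Model",
            execution_role_arn=role.role_arn,
            primary_container=sagemaker.CfnModel.ContainerDefinitionProperty(
                image="<inference-image-uri>",                        # placeholder
                model_data_url="s3://my-bucket/models/model.tar.gz",  # placeholder
            ),
        )

        endpoint_config = sagemaker.CfnEndpointConfig(
            self, "EndpointConfig",
            production_variants=[
                sagemaker.CfnEndpointConfig.ProductionVariantProperty(
                    model_name=model.attr_model_name,
                    variant_name="main",
                    initial_instance_count=1,
                    initial_variant_weight=1.0,
                    instance_type="ml.m5.large",                      # placeholder
                )
            ],
        )

        sagemaker.CfnEndpoint(
            self, "Endpoint",
            endpoint_config_name=endpoint_config.attr_endpoint_config_name,
        )
```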
The model is uploaded to the S3 location that the SageMaker endpoint config uses as the `model_data_url`, so it should pick up the new model. But it doesn't load it. I know that SageMaker caches models inside the container, but I don't know how to force a fresh load.
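The upload step in the pipeline is essentially an in-place overwrite of that key. A minimal sketch, with bucket and key as placeholders for my actual location:

```
import boto3

s3 = boto3.client("s3")

# Overwrite the tarball at the exact key that the endpoint config's
# model_data_url points to (bucket and key are placeholders).
s3.upload_file(
    Filename="model.tar.gz",
    Bucket="my-bucket",
    Key="models/model.tar.gz",
)
```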
This documentation suggests storing the model tarball under a different name in the same S3 folder and altering the invocation code to select it via the `TargetModel` parameter. That is not possible for my application, and I don't want SageMaker to fall back to an old model whenever the `TargetModel` parameter is absent.
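If I read the docs correctly, that pattern amounts to a multi-model-style invocation like the sketch below; the endpoint name, tarball name, and payload are placeholders:

```
import boto3

runtime = boto3.client("sagemaker-runtime")

# The documented pattern: the caller must name the new tarball explicitly
# via TargetModel, which my clients cannot do.
response = runtime.invoke_endpoint(
    EndpointName="my-endpoint",           # placeholder
    TargetModel="model-v2.tar.gz",        # new tarball name in the same S3 folder
    ContentType="application/json",
    Body=b'{"instances": [[1.0, 2.0]]}',  # placeholder payload
)
```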
Here is what I am currently doing after uploading the model to S3. Even though the endpoint transitions into the Updating state, it does not force a model reload:
```
from typing import Any, Dict

import boto3


def update_sm_endpoint(endpoint_name: str) -> Dict[str, Any]:
    """Forces the SageMaker endpoint to reload the model from S3."""
    sm = boto3.client("sagemaker")
    # Re-assert the weight of the only production variant; this triggers
    # an endpoint update (status goes to Updating) without a new config.
    return sm.update_endpoint_weights_and_capacities(
        EndpointName=endpoint_name,
        DesiredWeightsAndCapacities=[
            {"VariantName": "main", "DesiredWeight": 1},
        ],
    )
```
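For completeness, this is roughly how I drive it ("my-endpoint" is a placeholder). The update completes, but the container keeps serving the previously loaded model:

```
sm = boto3.client("sagemaker")

update_sm_endpoint("my-endpoint")

# Wait until the update finishes; status goes Updating -> InService,
# yet the old model is still served afterwards.
sm.get_waiter("endpoint_in_service").wait(EndpointName="my-endpoint")
print(sm.describe_endpoint(EndpointName="my-endpoint")["EndpointStatus"])
```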
Any ideas?