How can the company implement the testing model with the LEAST amount of operational overhead?
Update the ProductionVariant data type with the new version of the model by using the CreateEndpointConfig operation with the InitialVariantWeight parameter set to 0. Specify the TargetVariant parameter for InvokeEndpoint calls for users who subscribed to the preview feature. When the new version of the model is ready for release, gradually increase InitialVariantWeight until all users have the updated version.
Configure two SageMaker hosted endpoints that serve the different versions of the model. Create an Application Load Balancer (ALB) to route traffic to both endpoints based on the TargetVariant query string parameter. Reconfigure the app to send the TargetVariant query string parameter for users who subscribed to the preview feature. When the new version of the model is ready for release, change the ALB’s routing algorithm to weighted until all users have the updated version.
Update the DesiredWeightsAndCapacity data type with the new version of the model by using the UpdateEndpointWeightsAndCapacities operation with the DesiredWeight parameter set to 0. Specify the TargetVariant parameter for InvokeEndpoint calls for users who subscribed to the preview feature. When the new version of the model is ready for release, gradually increase DesiredWeight until all users have the updated version.
Configure two SageMaker hosted endpoints that serve the different versions of the model. Create an Amazon Route 53 record that is configured with a simple routing policy and that points to the current version of the model. Configure the mobile app to use the endpoint URL for users who subscribed to the preview feature and to use the Route 53 record for other users. When the new version of the model is ready for release, add a new model version endpoint to Route 53, and switch the policy to weighted until all users have the updated version.
Explanations:
Setting InitialVariantWeight through CreateEndpointConfig does place both model versions behind one endpoint, but the weight is fixed in the endpoint configuration. Each step of a gradual rollout would require creating a new endpoint configuration and applying it with UpdateEndpoint, which adds more operational overhead than adjusting weights on the live endpoint.
Configuring two SageMaker endpoints and routing traffic through an ALB introduces unnecessary operational overhead. The company would have to maintain two endpoints, the load balancer, and its routing rules, all to replicate traffic splitting that a single endpoint with multiple production variants already provides.
Calling UpdateEndpointWeightsAndCapacities adjusts variant weights on an existing endpoint without creating a new endpoint configuration, so it carries the least operational overhead. With DesiredWeight set to 0, the new version receives no regular traffic, while InvokeEndpoint calls that specify TargetVariant still reach it for users who subscribed to the preview feature. Gradually increasing DesiredWeight then shifts the remaining users to the new version.
Using Route 53 complicates the rollout. It requires managing DNS records alongside two separate endpoints, and routing changes take effect slowly because resolvers cache DNS records until their TTL expires. This option is less efficient than using SageMaker’s built-in capabilities for traffic management.
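The UpdateEndpointWeightsAndCapacities approach can be sketched with boto3 roughly as follows. This is a minimal illustration, not a definitive implementation: the endpoint name `recs-endpoint`, the variant names `current` and `preview`, and the helper functions are all hypothetical. The helpers only build request parameters, so they run without AWS credentials; the commented lines show how they might be passed to the real `update_endpoint_weights_and_capacities` and `invoke_endpoint` calls.

```python
from typing import Any, Dict


def build_weight_update(endpoint_name: str, weights: Dict[str, float]) -> Dict[str, Any]:
    """Build keyword arguments for the SageMaker
    UpdateEndpointWeightsAndCapacities operation.

    Weights are relative: a variant receives weight / sum(all weights)
    of the traffic that does not name a TargetVariant.
    """
    return {
        "EndpointName": endpoint_name,
        "DesiredWeightsAndCapacities": [
            {"VariantName": name, "DesiredWeight": weight}
            for name, weight in weights.items()
        ],
    }


def build_invoke_request(
    endpoint_name: str,
    payload: bytes,
    is_preview_user: bool,
    preview_variant: str,
) -> Dict[str, Any]:
    """Build keyword arguments for InvokeEndpoint.

    Preview subscribers are pinned to the new variant with TargetVariant,
    which applies even while that variant's weight is 0.
    """
    request: Dict[str, Any] = {
        "EndpointName": endpoint_name,
        "ContentType": "application/json",  # assumed payload format
        "Body": payload,
    }
    if is_preview_user:
        request["TargetVariant"] = preview_variant
    return request


# Hypothetical usage (requires boto3 and AWS credentials):
#
#   import boto3
#   sagemaker = boto3.client("sagemaker")
#   runtime = boto3.client("sagemaker-runtime")
#
#   # Preview phase: the new variant gets no regular traffic.
#   sagemaker.update_endpoint_weights_and_capacities(
#       **build_weight_update("recs-endpoint", {"current": 1.0, "preview": 0.0}))
#
#   # Preview subscribers reach the new variant explicitly.
#   runtime.invoke_endpoint(
#       **build_invoke_request("recs-endpoint", b"{}", True, "preview"))
#
#   # Release phase: repeat with increasing weights, e.g. 0.9/0.1, 0.5/0.5, 0.0/1.0.
```

Because each rollout step is a single API call against the live endpoint, there is no additional endpoint, load balancer, or DNS record to maintain.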