Which solution will meet these requirements with the LEAST operational overhead?
A. Create a new SageMaker endpoint for the new model. Configure an Application Load Balancer (ALB) to distribute traffic between the old model and the new model.
B. Modify the existing endpoint to use SageMaker production variants to distribute traffic between the old model and the new model.
C. Modify the existing endpoint to use SageMaker batch transform to distribute traffic between the old model and the new model.
D. Create a new SageMaker endpoint for the new model. Configure a Network Load Balancer (NLB) to distribute traffic between the old model and the new model.
Explanations:
Creating a second SageMaker endpoint and placing an Application Load Balancer (ALB) in front of both adds infrastructure that must be provisioned and managed. It duplicates what SageMaker already provides natively: weighted traffic splitting across production variants on a single endpoint.
Modifying the existing endpoint to use SageMaker production variants hosts the old and new models behind the same endpoint and splits requests between them according to per-variant weights. This is SageMaker’s built-in mechanism for A/B testing, so the new model can be evaluated on real production traffic without any additional infrastructure, which makes it the option with the least operational overhead.
SageMaker batch transform does not distribute live traffic at all. It runs offline inference jobs over large datasets rather than serving real-time requests, so it cannot be used to evaluate a new model against production traffic.
Creating a second SageMaker endpoint and placing a Network Load Balancer (NLB) in front of both has the same drawback as the ALB option: extra infrastructure to operate, with none of SageMaker’s built-in traffic splitting between production variants.
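
The production-variant approach described above can be sketched with boto3 roughly as follows. This is a minimal illustration, not a reference implementation: the variant names, model names, instance type, and the 10% default share are hypothetical placeholders, and both models are assumed to be registered in SageMaker already.

```python
def build_variants(old_model, new_model, new_share=0.1,
                   instance_type="ml.m5.large"):
    """Build a ProductionVariants list that splits traffic by weight.

    All names and defaults here are illustrative assumptions.
    """
    if not 0.0 < new_share < 1.0:
        raise ValueError("new_share must be strictly between 0 and 1")

    def variant(name, model, weight):
        return {
            "VariantName": name,
            "ModelName": model,
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
            # Weights are relative: each variant receives
            # weight / sum(all weights) of the requests.
            "InitialVariantWeight": weight,
        }

    return [
        variant("current", old_model, 1.0 - new_share),
        variant("candidate", new_model, new_share),
    ]


def apply_to_endpoint(endpoint_name, config_name, variants):
    """Create a new endpoint config and point the existing endpoint at it."""
    # Imported lazily so build_variants can be used without AWS access.
    import boto3

    sm = boto3.client("sagemaker")
    sm.create_endpoint_config(
        EndpointConfigName=config_name,
        ProductionVariants=variants,
    )
    # The endpoint name, and therefore what callers invoke, stays the same.
    sm.update_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=config_name,
    )
```

Calling `apply_to_endpoint("my-endpoint", "my-endpoint-ab-test", build_variants("old-model", "new-model"))` (all names hypothetical) would send roughly 10% of requests to the new model. Clients keep invoking the same endpoint, which is why no separate load balancer is needed.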