Which solution satisfies these requirements with MINIMAL effort?
Build and host multiple models in Amazon SageMaker. Create multiple Amazon SageMaker endpoints, one for each model. Programmatically control invoking different models for inference at the application layer.
Build and host multiple models in Amazon SageMaker. Create an Amazon SageMaker endpoint configuration with multiple production variants. Programmatically control the portion of the inferences served by the multiple models by updating the endpoint configuration.
Build and host multiple models in Amazon SageMaker Neo to take into account different types of medical devices. Programmatically control which model is invoked for inference based on the medical device type.
Build and host multiple models in Amazon SageMaker. Create a single endpoint that accesses multiple models. Use Amazon SageMaker batch transform to control invoking the different models through the single endpoint.
Explanations:
Hosting each model on its own Amazon SageMaker endpoint works, but every endpoint must be deployed, monitored, and scaled separately, which adds operational overhead. Routing at the application layer also means the traffic-splitting logic must be built and maintained in application code, so adjusting the serving proportions over time is harder than it needs to be.
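To illustrate the extra work that application-layer routing implies, here is a minimal sketch of a weighted router across separate endpoints. The endpoint names, the weights, and the helper names `pick_endpoint` and `invoke` are hypothetical; only `invoke_endpoint` on the `sagemaker-runtime` client is a real boto3 call, and it is imported lazily so the routing helper runs without AWS access.

```python
import random


def pick_endpoint(endpoint_weights, rng=random):
    """Pick one endpoint name at random, in proportion to its weight."""
    names = list(endpoint_weights)
    weights = [endpoint_weights[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def invoke(endpoint_weights, payload, content_type="application/json"):
    """Send one request to a randomly chosen endpoint.

    Requires boto3 and AWS credentials; not executed in this sketch.
    """
    import boto3  # lazy import so pick_endpoint works without boto3 installed

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=pick_endpoint(endpoint_weights),
        ContentType=content_type,
        Body=payload,
    )
    return response["Body"].read()


# Hypothetical endpoints, one per model version.
ENDPOINT_WEIGHTS = {"model-v1-endpoint": 0.9, "model-v2-endpoint": 0.1}
chosen = pick_endpoint(ENDPOINT_WEIGHTS, random.Random(0))
```

Every piece of this (the weights, the random choice, the per-endpoint monitoring) has to live in and be maintained by the application, which is the overhead this option incurs.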
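An Amazon SageMaker endpoint configuration with multiple production variants hosts all the models behind a single endpoint, and SageMaker distributes requests among the variants according to their weights. The weights can be adjusted programmatically, either by updating the endpoint configuration or, for an endpoint already in service, through the UpdateEndpointWeightsAndCapacities API, which minimizes the effort of long-term monitoring and testing of different model versions.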
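The production-variant approach can be sketched with boto3 as follows. This is an illustration under assumptions, not a definitive setup: the model names, instance type, endpoint name, and helper function names are hypothetical, and the models are assumed to already exist in SageMaker. The boto3 calls used (`create_endpoint_config`, `create_endpoint`, `update_endpoint_weights_and_capacities`) are imported lazily so the pure-Python helpers run without AWS access.

```python
def build_production_variants(models, instance_type="ml.m5.large"):
    """Build the ProductionVariants list for create_endpoint_config.

    `models` maps an existing SageMaker model name to a relative traffic weight.
    """
    return [
        {
            "VariantName": f"variant-{name}",
            "ModelName": name,
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
            "InitialVariantWeight": float(weight),
        }
        for name, weight in models.items()
    ]


def traffic_shares(variants):
    """Expected share of requests per variant: weight / sum of all weights."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


def desired_weights(new_weights):
    """Build the DesiredWeightsAndCapacities list for a weight update."""
    return [
        {"VariantName": name, "DesiredWeight": float(weight)}
        for name, weight in new_weights.items()
    ]


def create_ab_endpoint(endpoint_name, variants):
    """Create the endpoint config and endpoint (needs boto3 and AWS credentials)."""
    import boto3  # lazy import so the helpers above work without boto3

    sm = boto3.client("sagemaker")
    config_name = f"{endpoint_name}-config"
    sm.create_endpoint_config(
        EndpointConfigName=config_name, ProductionVariants=variants
    )
    sm.create_endpoint(EndpointName=endpoint_name, EndpointConfigName=config_name)


def shift_traffic(endpoint_name, new_weights):
    """Change the traffic split on a live endpoint without redeploying it."""
    import boto3

    sm = boto3.client("sagemaker")
    sm.update_endpoint_weights_and_capacities(
        EndpointName=endpoint_name,
        DesiredWeightsAndCapacities=desired_weights(new_weights),
    )


# Hypothetical models: send roughly 90% of traffic to v1 and 10% to v2.
variants = build_production_variants({"model-v1": 9, "model-v2": 1})
shares = traffic_shares(variants)
```

With this layout, shifting the split later is a single call such as `shift_traffic("my-endpoint", {"variant-model-v1": 1, "variant-model-v2": 1})`, where `my-endpoint` is again a placeholder name.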
Amazon SageMaker Neo compiles and optimizes models for specific target devices and platforms; it is not a way to host or manage multiple versions of a model in parallel. This option does not meet the requirement of controlling inference proportions between different versions.
Amazon SageMaker batch transform runs offline jobs over a dataset rather than serving real-time inference, and it does not route requests through a persistent endpoint. It therefore offers no dynamic control over the proportion of inferences each model serves, making it unsuitable for the stated need of long-term version comparison.
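For contrast, here is a minimal sketch of what a batch transform request looks like. It names a model and S3 locations directly and has no endpoint or traffic weight anywhere in it. The job name, model name, S3 URIs, instance type, and helper names are hypothetical; `create_transform_job` is the real boto3 call, imported lazily and not executed here.

```python
def build_transform_job_request(job_name, model_name, input_s3_uri, output_s3_uri):
    """Build keyword arguments for the SageMaker create_transform_job call."""
    return {
        "TransformJobName": job_name,
        "ModelName": model_name,  # a model, not an endpoint
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": input_s3_uri}
            },
        },
        "TransformOutput": {"S3OutputPath": output_s3_uri},
        "TransformResources": {"InstanceType": "ml.m5.large", "InstanceCount": 1},
    }


def run_transform_job(request):
    """Submit the job (needs boto3 and AWS credentials)."""
    import boto3  # lazy import so the builder above works without boto3

    boto3.client("sagemaker").create_transform_job(**request)


# Hypothetical names and S3 paths, for illustration only.
request = build_transform_job_request(
    "score-batch-001",
    "model-v1",
    "s3://example-bucket/input/",
    "s3://example-bucket/output/",
)
```

Each job scores a whole dataset with one model and then ends, so there is no live request stream whose proportions could be tuned.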