Which solution will meet these requirements with the LEAST operational overhead?
Create a service-linked role for Amazon Elastic Container Service (Amazon ECS) with access to the S3 bucket. Create an ECS cluster that is based on an AWS Deep Learning Containers image. Write the code to perform the feature engineering. Train a logistic regression model for predicting the price, pointing to the bucket with the dataset. Wait for the training job to complete. Perform the inferences.
Create an Amazon SageMaker notebook with a new IAM role that is associated with the notebook. Pull the dataset from the S3 bucket. Explore different combinations of feature engineering transformations, regression algorithms, and hyperparameters. Compare all the results in the notebook, and deploy the most accurate configuration in an endpoint for predictions.
Create an IAM role with access to Amazon S3, Amazon SageMaker, and AWS Lambda. Create a training job with the SageMaker built-in XGBoost model pointing to the bucket with the dataset. Specify the price as the target feature. Wait for the job to complete. Load the model artifact to a Lambda function for inference on prices of new houses.
Create an IAM role for Amazon SageMaker with access to the S3 bucket. Create a SageMaker AutoML job with SageMaker Autopilot pointing to the bucket with the dataset. Specify the price as the target attribute. Wait for the job to complete. Deploy the best model for predictions.
Explanations:
This solution uses Amazon ECS and AWS Deep Learning Containers, so you must manage the cluster yourself and write custom code for feature engineering, model training, and inference. Logistic regression is also a classification algorithm, which makes it a poor fit for predicting a continuous value such as price. The added infrastructure and custom code do not meet the requirement of least operational overhead.
Although SageMaker notebooks are useful for exploring data and building models, this option requires you to try feature engineering transformations, regression algorithms, and hyperparameters by hand and to compare the results yourself. Nothing in that workflow is automated, so it carries more operational overhead than a managed AutoML job.
Using a Lambda function for inference with a manually configured XGBoost training job adds operational overhead. The built-in XGBoost algorithm does not perform feature engineering or model selection for you, so that work stays manual, and you must package and load the model artifact into Lambda yourself. Lambda's memory and execution-time limits can also make it a poor fit for machine learning inference.
SageMaker Autopilot automates the entire process, from feature engineering to model selection and hyperparameter tuning. It is a fully managed service that trains and ranks candidate models, so the best one can be deployed with minimal intervention. This makes it the option with the least operational overhead.
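To make the Autopilot option concrete, here is a minimal Python sketch of the request that would start such a job through the SageMaker `create_auto_ml_job` API. The job name, bucket paths, and role ARN are hypothetical placeholders, and the helper functions `build_autopilot_request` and `start_autopilot_job` are illustrative names rather than part of any AWS library. The choice of `MSE` as the objective metric is also an assumption.

```python
def build_autopilot_request(job_name, train_s3_uri, output_s3_uri, role_arn,
                            target="price"):
    """Build the keyword arguments for SageMaker's create_auto_ml_job call."""
    return {
        "AutoMLJobName": job_name,
        "InputDataConfig": [
            {
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,
                    }
                },
                # Column that Autopilot should learn to predict.
                "TargetAttributeName": target,
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        # Price is a continuous value, so this is a regression problem.
        "ProblemType": "Regression",
        # Assumed objective; the API requires one when ProblemType is set.
        "AutoMLJobObjective": {"MetricName": "MSE"},
        "RoleArn": role_arn,
    }


def start_autopilot_job(sagemaker_client, request):
    """Submit the job using a boto3 SageMaker client supplied by the caller."""
    return sagemaker_client.create_auto_ml_job(**request)


# Hypothetical placeholder values, for illustration only.
request = build_autopilot_request(
    job_name="house-price-autopilot",
    train_s3_uri="s3://example-bucket/housing/train/",
    output_s3_uri="s3://example-bucket/housing/output/",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
)

# With AWS credentials configured, the job could be submitted like this:
#   import boto3
#   start_autopilot_job(boto3.client("sagemaker"), request)
```

The request is built separately from the API call so it can be inspected without AWS access. Everything else, including feature engineering, algorithm selection, and tuning, happens inside the managed job.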