Which combination of steps is the MOST operationally efficient way for the data scientist to maintain the model's accuracy?

A data scientist at a financial services company used Amazon SageMaker to train and deploy a model that predicts loan defaults. The model analyzes new loan applications and predicts the risk of loan default. To train the model, the data scientist manually extracted loan data from a database. The data scientist performed the model training and deployment steps in a Jupyter notebook that is hosted on SageMaker Studio notebooks. The model’s prediction accuracy is decreasing over time.

Which combination of steps is the MOST operationally efficient way for the data scientist to maintain the model’s accuracy?

(Choose two.)

Use SageMaker Pipelines to create an automated workflow that extracts fresh data, trains the model, and deploys a new version of the model.

Configure SageMaker Model Monitor with an accuracy threshold to check for model drift. Initiate an Amazon CloudWatch alarm when the threshold is exceeded. Connect the workflow in SageMaker Pipelines with the CloudWatch alarm to automatically initiate retraining.

Store the model predictions in Amazon S3. Create a daily SageMaker Processing job that reads the predictions from Amazon S3, checks for changes in model prediction accuracy, and sends an email notification if a significant change is detected.

Rerun the steps in the Jupyter notebook that is hosted on SageMaker Studio notebooks to retrain the model and redeploy a new version of the model.

Export the training and deployment code from the SageMaker Studio notebooks into a Python script. Package the script into an Amazon Elastic Container Service (Amazon ECS) task that an AWS Lambda function can initiate.

Explanations (in the same order as the options above):

Using SageMaker Pipelines to automate the entire workflow, including data extraction, model training, and deployment, ensures continuous model maintenance and improved operational efficiency. It avoids manual intervention.

Configuring SageMaker Model Monitor with an accuracy threshold allows for detection of model drift. Linking it with CloudWatch and SageMaker Pipelines automates the retraining process when the model’s accuracy drops below the threshold.

While this option checks prediction accuracy, a daily SageMaker Processing job that reads predictions from Amazon S3 and only sends an email notification is less efficient than an automated workflow with SageMaker Pipelines. Someone still has to read the alert and then retrain and redeploy the model manually.

Manually rerunning the steps in a Jupyter notebook to retrain and redeploy the model is not efficient for continuous monitoring and retraining. It involves manual intervention and doesn’t scale well.

Exporting the code to an ECS task initiated by AWS Lambda adds unnecessary complexity for model retraining and deployment. SageMaker Pipelines offers a more streamlined and efficient solution.
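The link between the CloudWatch alarm and the SageMaker Pipelines workflow described in the first two explanations can be sketched as follows. This is a minimal illustration, not the only way to connect them: it assumes an Amazon EventBridge rule forwards the alarm's state-change event to an AWS Lambda function, which then starts the retraining pipeline through the boto3 SageMaker client call `start_pipeline_execution`. The pipeline name and the function names `is_alarm_event` and `handle_alarm` are hypothetical and do not come from the question.

```python
import uuid

# Hypothetical pipeline name; the question does not name one.
PIPELINE_NAME = "loan-default-retraining"


def is_alarm_event(event):
    """Return True if the EventBridge event reports a CloudWatch alarm entering ALARM."""
    if event.get("detail-type") != "CloudWatch Alarm State Change":
        return False
    return event.get("detail", {}).get("state", {}).get("value") == "ALARM"


def handle_alarm(event, sagemaker_client, pipeline_name=PIPELINE_NAME):
    """Start one pipeline execution when the alarm fires; otherwise do nothing.

    Returns the execution ARN, or None if the event is not an ALARM transition.
    """
    if not is_alarm_event(event):
        return None
    alarm_name = event["detail"].get("alarmName", "unknown alarm")
    response = sagemaker_client.start_pipeline_execution(
        PipelineName=pipeline_name,
        PipelineExecutionDescription="Retraining triggered by " + alarm_name,
        # Reusing the event id as the token keeps a retried delivery of the
        # same event from starting a second execution.
        ClientRequestToken=event.get("id") or str(uuid.uuid4()),
    )
    return response["PipelineExecutionArn"]


def lambda_handler(event, context):
    """Lambda entry point (assumed deployment target)."""
    import boto3  # imported here so the helpers above can be tested without AWS

    return handle_alarm(event, boto3.client("sagemaker"))
```

Passing the client in as an argument keeps the decision logic testable with a stub. In a real setup the alarm would be defined on a metric that Model Monitor publishes to CloudWatch, and the pipeline itself would contain the data extraction, training, and deployment steps.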
