Which approach will meet these requirements with the LEAST development effort?

A machine learning (ML) specialist at a manufacturing company uses Amazon SageMaker DeepAR to forecast input materials and energy requirements for the company.Most of the data in the training dataset is missing values for the target variable.The company stores the training dataset as JSON files.The ML specialist develop a solution by using Amazon SageMaker DeepAR to account for the missing values in the training dataset.

Which approach will meet these requirements with the LEAST development effort?

Impute the missing values by using the linear regression method. Use the entire dataset and the imputed values to train the DeepAR model.

Replace the missing values with not a number (NaN). Use the entire dataset and the encoded missing values to train the DeepAR model.

Impute the missing values by using a forward fill. Use the entire dataset and the imputed values to train the DeepAR model.

Impute the missing values by using the mean value. Use the entire dataset and the imputed values to train the DeepAR model.

Explanations:

Using linear regression to impute missing values adds unnecessary complexity and development effort. DeepAR can handle missing values natively, so pre-imputation is not needed.

DeepAR can inherently deal with missing values in the target variable by treating them as NaN. This minimizes development effort, as it does not require any data imputation.

Forward fill is a method of imputation that can introduce bias by carrying forward previous values. It is not necessary with DeepAR, which can manage missing values directly.

Imputing missing values with the mean can distort the data distribution and may lead to inaccuracies. Additionally, DeepAR can handle missing values, making this approach unnecessary and more complex.

Learn & move to cloud

Which approach will meet these requirements with the LEAST development effort?

Explanations:

Leave a Reply Cancel reply