Which action is recommended to provide the HIGHEST accuracy model for the company's test and validation data?

A web-based company wants to improve the conversion rate of its landing page. Using a large historical dataset of customer visits, the company has repeatedly trained a multi-class deep learning network on Amazon SageMaker. However, the model overfits: it reaches 90% prediction accuracy on the training data but only 70% on the test data. The company needs to improve the model's generalization before deploying it to production, in order to maximize conversions of visits to purchases.

Which action is recommended to provide the HIGHEST accuracy model for the company’s test and validation data?

Increase the randomization of training data in the mini-batches used in training

Allocate a higher proportion of the overall data to the training dataset

Apply L1 or L2 regularization and dropouts to the training

Reduce the number of layers and units (or neurons) from the deep learning network

Explanations:

Increasing the randomization of training data in the mini-batches can make training more stable and may help the model generalize slightly better. However, it does not directly address overfitting, and it is unlikely to close the 20-point gap between training and test accuracy.
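As a rough illustration of what this option means in practice, the sketch below reshuffles the sample order at the start of every epoch before slicing it into mini-batches. The helper name `shuffled_minibatches` and its parameters are hypothetical; they are not part of Amazon SageMaker or of any deep learning framework.

```python
import random


def shuffled_minibatches(samples, batch_size, epochs, seed=0):
    """Yield (epoch, batch) pairs, reshuffling the sample order each epoch.

    Hypothetical helper for illustration only; real frameworks expose
    shuffling through their own data-loading utilities.
    """
    rng = random.Random(seed)
    order = list(range(len(samples)))
    for epoch in range(epochs):
        rng.shuffle(order)  # draw a new random order for this epoch
        for start in range(0, len(order), batch_size):
            batch = [samples[i] for i in order[start:start + batch_size]]
            yield epoch, batch


if __name__ == "__main__":
    data = list(range(10))  # stand-in for training examples
    for epoch, batch in shuffled_minibatches(data, batch_size=4, epochs=2):
        print(epoch, batch)
```

Shuffling changes only the order in which examples are seen. The model's capacity and the loss function stay the same, which is why this option alone is not expected to remove the overfitting.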

Allocating a higher proportion of the data to training gives the model more examples to learn from, but it does not address overfitting, which is the main concern here, and it leaves less data for testing and validation.

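Applying L1 or L2 regularization and dropout constrains the model’s effective complexity: the L1 or L2 penalty discourages large weights, and dropout randomly deactivates units during training so that the network cannot rely too heavily on any single one. This helps prevent overfitting and improves generalization, leading to better performance on test data.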
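To make the two techniques concrete, here is a minimal pure-Python sketch of an L1 penalty, an L2 penalty, and inverted dropout. The function names, the coefficient `lam`, and the sample numbers are assumptions chosen for illustration. In a real SageMaker training job these would normally come from the deep learning framework in use rather than from hand-written code.

```python
import random


def l1_penalty(weights, lam):
    """L1 term added to the loss: lam * sum(|w|)."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """L2 term added to the loss: lam * sum(w^2)."""
    return lam * sum(w * w for w in weights)


def dropout(activations, rate, rng, training=True):
    """Inverted dropout.

    During training, zero each unit with probability `rate` and rescale
    the survivors by 1 / (1 - rate). At inference, return the input as is.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


if __name__ == "__main__":
    weights = [0.5, -1.5, 2.0]  # made-up weights
    data_loss = 0.30            # made-up unregularized loss
    total_loss = data_loss + l2_penalty(weights, lam=0.01)
    print("regularized loss:", total_loss)

    rng = random.Random(0)
    print("train:", dropout([1.0, 2.0, 3.0, 4.0], rate=0.5, rng=rng))
    print("infer:", dropout([1.0, 2.0, 3.0, 4.0], rate=0.5, rng=rng, training=False))
```

The penalty grows with the size of the weights, so minimizing the regularized loss pushes the network toward smaller weights, while dropout is active only during training and is switched off when the model is evaluated on test data.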

Reducing the number of layers and neurons might reduce overfitting, but it also risks underfitting and losing the model’s ability to learn complex patterns, which could lower accuracy on both training and test data.
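The capacity trade-off described here can be made tangible by counting trainable parameters. The sketch below assumes a plain fully connected network with one bias per unit. The layer sizes are invented for illustration and do not come from the scenario.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases in a fully connected network.

    `layer_sizes` lists the input width followed by each layer's width,
    for example [100, 64, 10]. Each layer contributes
    (inputs + 1 bias) * outputs parameters.
    """
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


if __name__ == "__main__":
    larger = [100, 256, 256, 10]  # hypothetical original architecture
    smaller = [100, 64, 10]       # hypothetical reduced architecture
    print("larger :", dense_param_count(larger))
    print("smaller:", dense_param_count(smaller))
```

Removing layers and units cuts the parameter count sharply, which limits how much the network can memorize but also how much structure it can represent. That is the underfitting risk this explanation points to.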
