Which methods can the Data Scientist use to improve the model performance and satisfy the Marketing team’s needs?

By: study aws cloud

On: January 11, 2025

Tagged: Machine Learning Specialty


A Data Scientist is building a model to predict customer churn using a dataset of 100 continuous numerical features. The Marketing team has not provided any insight about which features are relevant for churn prediction. The Marketing team wants to interpret the model and see the direct impact of relevant features on the model outcome. While training a logistic regression model, the Data Scientist observes that there is a wide gap between the training and validation set accuracy.

Which methods can the Data Scientist use to improve the model performance and satisfy the Marketing team’s needs?

(Choose two.)

Add L1 regularization to the classifier

Add features to the dataset

Perform recursive feature elimination

Perform t-distributed stochastic neighbor embedding (t-SNE)

Perform linear discriminant analysis

Explanations:

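Adding L1 regularization (Lasso) reduces overfitting by penalizing the absolute size of the coefficients, which drives the coefficients of irrelevant features to exactly zero. This improves generalization and narrows the gap between training and validation accuracy, and the remaining non-zero coefficients still show the direct impact of each relevant feature.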
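As a rough illustration of the L1 option, here is a minimal sketch using scikit-learn. The synthetic dataset, the split, and the regularization strength C=0.05 are assumptions made for the example and are not part of the original question; in practice C would be tuned on the validation set.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the churn data (assumption): 100 continuous
# features, of which only 5 carry signal.
X, y = make_classification(
    n_samples=500, n_features=100, n_informative=5, n_redundant=0,
    n_clusters_per_class=1, random_state=0,
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Scale first so the L1 penalty treats every feature on the same footing.
# Smaller C means a stronger penalty; 0.05 is an illustrative value.
l1_model = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="l1", solver="liblinear", C=0.05),
)
l1_model.fit(X_train, y_train)

# Coefficients pushed to exactly zero correspond to dropped features.
coefs = l1_model.named_steps["logisticregression"].coef_.ravel()
kept = np.flatnonzero(coefs)

train_acc = l1_model.score(X_train, y_train)
val_acc = l1_model.score(X_val, y_val)
print(f"features kept: {len(kept)} of {coefs.size}")
print(f"train accuracy: {train_acc:.3f}, validation accuracy: {val_acc:.3f}")
```

The indices in kept identify the features the model still uses, and the sign and size of each remaining coefficient can be reported to the Marketing team.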

Adding more features can worsen overfitting, especially in the absence of insights from the Marketing team. The model already overfits with 100 features, so more features would increase complexity without improving validation performance.

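Recursive feature elimination (RFE) repeatedly fits the model and removes the least important features until only the most relevant ones remain. Dropping irrelevant or redundant features reduces overfitting, and the final model keeps the original features, so their coefficients remain directly interpretable.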
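A minimal sketch of RFE wrapped around logistic regression, again using scikit-learn. The synthetic data, the target of 10 features, and the step size of 5 are assumptions chosen for illustration; scikit-learn's RFECV can pick the number of features by cross-validation instead.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the churn data (assumption).
X, y = make_classification(
    n_samples=500, n_features=100, n_informative=5, n_redundant=0,
    n_clusters_per_class=1, random_state=0,
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# RFE refits the estimator and drops the 5 features with the smallest
# absolute coefficients on each round until 10 features remain.
rfe_model = make_pipeline(
    StandardScaler(),
    RFE(
        estimator=LogisticRegression(max_iter=1000),
        n_features_to_select=10,
        step=5,
    ),
)
rfe_model.fit(X_train, y_train)

selector = rfe_model.named_steps["rfe"]
selected = np.flatnonzero(selector.support_)       # indices of kept features
final_coefs = selector.estimator_.coef_.ravel()    # one coefficient per kept feature

rfe_val_acc = rfe_model.score(X_val, y_val)
print("selected feature indices:", selected.tolist())
print(f"validation accuracy: {rfe_val_acc:.3f}")
```

Each entry of final_coefs lines up with the feature index at the same position in selected, which gives a short list of features and weights to review.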

t-SNE is a nonlinear dimensionality reduction technique used mainly for visualization. Its embedded coordinates have no direct relationship to the original features, so it does not address overfitting in the classifier or show which features are relevant.

Linear Discriminant Analysis (LDA) is a dimensionality reduction technique used for classification. It projects the data onto linear combinations of all the features, so it does not perform feature selection and is not specifically designed to address overfitting in logistic regression.
