Which techniques should the company use for feature selection?
(Choose three.)
Data scaling with standardization and normalization
Correlation plot with heat maps
Data binning
Univariate selection
Feature importance with a tree-based classifier
Data augmentation
Explanations:
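A correlation plot rendered as a heat map visualizes the pairwise relationships among the features and between each feature and the target variable (sales price). Features that correlate strongly with the target are candidates to keep, while pairs of features that correlate strongly with each other are largely redundant, so one of each pair can often be dropped.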
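As a rough illustration of the correlation heat map idea, the sketch below builds a small synthetic housing table and plots its correlation matrix. The data, the column names (`area`, `rooms`, `noise`, `price`), and the coefficients are all made up for illustration and are not taken from the question's dataset.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n = 200

# Synthetic, hypothetical housing data (assumption: not the real dataset).
area = rng.normal(150, 30, n)                   # living area
rooms = area / 30 + rng.normal(0, 0.5, n)       # deliberately correlated with area
noise = rng.normal(0, 1, n)                     # unrelated to everything else
price = 2000 * area + rng.normal(0, 20000, n)   # target driven mostly by area

names = ["area", "rooms", "noise", "price"]
data = np.column_stack([area, rooms, noise, price])

# Pearson correlation between every pair of columns.
corr = np.corrcoef(data, rowvar=False)

# Draw the matrix as a heat map with a fixed -1..1 color scale.
fig, ax = plt.subplots()
im = ax.imshow(corr, cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(names)))
ax.set_xticklabels(names)
ax.set_yticks(range(len(names)))
ax.set_yticklabels(names)
fig.colorbar(im, ax=ax)

# Correlation of each feature with the target (last column).
target_corr = dict(zip(names[:-1], corr[:-1, -1]))
```

In this sketch `target_corr` holds each feature's correlation with `price`, and the `area`/`rooms` cell of the heat map shows the kind of feature-to-feature redundancy that might justify dropping one of the two.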
Univariate selection applies a statistical test (for example, an F-test or a chi-squared test) to each feature individually and keeps the features that show the strongest relationship with the target variable.
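One possible way to try univariate selection is scikit-learn's `SelectKBest` with the `f_regression` score function, which suits a continuous target such as a sales price. The sketch below uses synthetic data in which, by construction, only columns 0 and 3 influence the target; the data and the choice of `k=2` are assumptions made for illustration.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression

rng = np.random.default_rng(0)
n = 300

# Synthetic data (assumption): five features, only columns 0 and 3 matter.
X = rng.normal(size=(n, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=0.5, size=n)

# Score each feature on its own with an F-test and keep the top two.
selector = SelectKBest(score_func=f_regression, k=2)
X_selected = selector.fit_transform(X, y)

# Column indices of the features that were kept.
selected = selector.get_support(indices=True)
```

For a categorical target, a score function such as `f_classif` or `chi2` could be swapped in; each feature is still scored independently of the others.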
Feature importance with a tree-based classifier (e.g., decision trees, random forests) is a widely used technique that ranks features by how much each one contributes to the model's predictions, typically measured as the total impurity reduction from the splits that use the feature.
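A minimal sketch of tree-based feature importance follows. Because the target in this question is a continuous sales price, the sketch swaps in scikit-learn's `RandomForestRegressor` rather than a classifier; `RandomForestClassifier` exposes the same `feature_importances_` attribute. The data and coefficients are synthetic assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 300

# Synthetic data (assumption): column 1 dominates, column 2 matters a little,
# columns 0 and 3 are pure noise.
X = rng.normal(size=(n, 4))
y = 5.0 * X[:, 1] + 1.0 * X[:, 2] + rng.normal(scale=0.5, size=n)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, y)

# Impurity-based importances, normalized so that they sum to 1.
importances = model.feature_importances_

# Feature indices ordered from most to least important.
ranking = np.argsort(importances)[::-1]
```

Features near the bottom of `ranking` are candidates for removal, although impurity-based importances can be biased toward high-cardinality features, so it is worth cross-checking against the other two techniques.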
Data scaling with standardization and normalization is not a feature selection technique. Scaling rescales feature values to a common range or distribution so that features are comparable, but it does not indicate which features are important.
Data binning is a technique for transforming continuous variables into categorical ones, but it is not a method of feature selection. It may simplify data but does not directly address selecting the most important features.
Data augmentation is a technique primarily used to increase the size of training data, typically in image and text datasets. It does not help in selecting features that are most relevant for the target variable.