Which algorithm will meet these requirements?
K-nearest neighbors (k-NN) with dimension reduction
Linear learner with early stopping
K-means
Principal component analysis (PCA) with the algorithm mode set to random
Explanations:
K-nearest neighbors (k-NN) is effective for classification tasks and can identify the training points most similar to a given test point. Combining it with dimension reduction (e.g., PCA) reduces memory costs while retaining the essential features, making it suitable for the company’s requirements.
The linear learner with early stopping may train efficiently, but it does not inherently support finding similar data points, which is a key requirement. Additionally, early stopping shortens training time rather than shrinking the feature space, so it does not meaningfully reduce memory usage for a large dataset with many features.
K-means is a clustering algorithm, not a classification algorithm. While it groups data points, it does not assign them to predefined categories, nor does it directly support finding similar labeled examples in a supervised learning context.
Principal component analysis (PCA) is a dimensionality reduction technique, not a classification algorithm. While it can help reduce memory costs, it does not classify or find similar data points on its own. Setting the algorithm mode to random only changes how the components are computed (an approximation suited to very large datasets); it does not make PCA suitable for the classification task.