Which algorithms are best suited to this scenario?
(Choose two.)
Latent Dirichlet allocation (LDA)
Random forest classifier
Neural topic modeling (NTM)
Linear support vector machine
Linear regression
Explanations:
Latent Dirichlet Allocation (LDA) is an unsupervised statistical model for topic modeling. It represents each document as a mixture of topics and each topic as a distribution over words, so it can identify abstract topics in large text collections without labels. It is well suited to discovering topics in audit documents.
A Random Forest classifier is a supervised learning algorithm used for classification or regression. It needs labeled training data and does not discover topics in unlabeled text.
Neural Topic Modeling (NTM) is another unsupervised approach to topic discovery. It uses a neural network to learn latent topics from the text without labels, making it suitable for the task.
A Linear Support Vector Machine (SVM) is a supervised classifier, not an unsupervised algorithm for topic modeling. It would require labeled documents for training, and no labels are available here.
Linear Regression is a supervised model that predicts a continuous numeric value from input features. It is not used for topic modeling and cannot extract abstract topics from text.
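To make the supervised versus unsupervised distinction concrete, here is a minimal sketch of LDA fitted by collapsed Gibbs sampling, written with the Python standard library only. It is an illustration rather than a production implementation: the function name `lda_gibbs`, the hyperparameter values, and the four toy documents are all assumptions made for this example. Note that the function receives only tokenized text and a topic count, never labels.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on tokenized documents (lists of words)."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    # Count tables: tokens per (document, topic), per (topic, word), per topic.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start by assigning every token to a random topic.
    assignments = []
    for d, doc in enumerate(docs):
        doc_assign = []
        for word in doc:
            t = rng.randrange(num_topics)
            doc_assign.append(t)
            doc_topic[d][t] += 1
            topic_word[t][word] += 1
            topic_total[t] += 1
        assignments.append(doc_assign)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                # Remove this token's current assignment from the counts.
                t = assignments[d][i]
                doc_topic[d][t] -= 1
                topic_word[t][word] -= 1
                topic_total[t] -= 1

                # Resample its topic given all other assignments.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][word] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(num_topics)
                ]
                t = rng.choices(range(num_topics), weights=weights)[0]

                assignments[d][i] = t
                doc_topic[d][t] += 1
                topic_word[t][word] += 1
                topic_total[t] += 1

    # Smoothed topic proportions for each document (each row sums to 1).
    doc_mix = [
        [(c + alpha) / (len(doc) + alpha * num_topics) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    # Up to three most frequent words assigned to each topic.
    top_words = [
        [w for w, c in tw.most_common(3) if c > 0] for tw in topic_word
    ]
    return doc_mix, top_words


# Hypothetical toy corpus; no labels are supplied anywhere.
docs = [
    "invoice payment vendor invoice approval".split(),
    "vendor payment invoice ledger".split(),
    "password access login firewall access".split(),
    "firewall login password breach".split(),
]
doc_mix, top_words = lda_gibbs(docs, num_topics=2)
for k, words in enumerate(top_words):
    print(f"topic {k}: {words}")
for d, mix in enumerate(doc_mix):
    print(f"doc {d}: {[round(p, 2) for p in mix]}")
```

A random forest, linear SVM, or linear regression would additionally need a target value for every document before it could be trained, which is why those options do not fit a topic-discovery task.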