If I were interviewing you for a data mining position, what are the top 10 algorithms I would expect you to know, in order of priority?
- Linear regression
- Logistic regression
- k-means
- SVMs
- Random Forests
- Matrix Factorization/SVD
- Gradient Boosted Decision Trees/Machines
- Naive Bayes
- Artificial Neural Networks
For the tenth, I’d let you pick one of the following:
- Bayesian Networks
- Elastic Nets
- Any clustering algorithm other than k-means
- LDA
- Conditional Random Fields
- HDPs or another Bayesian non-parametric model
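
As a quick refresher on one item from the list, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. The function name `kmeans`, its parameters, and the choice to seed the centroids with the first k points are assumptions made for illustration; real implementations usually use a smarter initialization such as k-means++.

```python
def kmeans(points, k, iterations=100):
    """Cluster equal-length numeric tuples with Lloyd's algorithm.

    Returns (centroids, labels). Assumes len(points) >= k.
    """
    # Assumption for this sketch: seed centroids with the first k points
    # so the result is deterministic.
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        # Assignment step: label each point with its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        # Stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


if __name__ == "__main__":
    data = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0)]
    print(kmeans(data, k=2))
```

In an interview, being able to name the two alternating steps (assign, then update) and explain why the result depends on initialization matters more than reproducing the code.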