Top 10 DM / ML Algorithms

If I were interviewing you for a Data Mining position, what would be the top 10 algorithms I would expect you to know, in order of priority?

  1. Linear regression
  2. Logistic regression
  3. k-means
  4. SVMs
  5. Random Forests
  6. Matrix Factorization/SVD
  7. Gradient Boosted Decision Trees/Machines
  8. Naive Bayes
  9. Artificial Neural Networks

For the tenth slot, I’d let you pick one of the following:

  • Bayesian Networks
  • Elastic Nets
  • Any other clustering algo besides k-means
  • LDA
  • Conditional Random Fields
  • HDPs or another Bayesian non-parametric model