Question source: Shape Science, blog on data science and algorithms
- You can break this question into two components: model selection and variable (feature) selection. This question is typically asked to gauge high-level understanding, and the interviewer may follow up with questions that dive deeper.
- Model selection depends on which matters most: accuracy, interpretability, or computation time. If you need accuracy, try several algorithms and compare them on held-out data (for example with cross-validation) to see which works best on your data. If interpretability is important, use something simple such as linear regression or a decision tree, which are easy to interpret (unlike a black-box model such as a neural network). If speed is important, avoid models that are expensive to train on large datasets, for instance a kernel SVM, whose training time grows roughly quadratically or worse with the number of samples.
- Feature selection: See other answers on this topic.
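As a rough illustration of the "try several algorithms and compare" advice, the sketch below scores two candidate models with k-fold cross-validation and records how long each takes. Everything in it is an assumption made for illustration: the synthetic data, the two toy models (`MajorityClass`, `OneNearestNeighbor`), and the helper names `evaluate` and `compare_models` are hypothetical and not taken from the original answer. It uses only the standard library so it runs anywhere.

```python
import random
import time
from collections import Counter


class MajorityClass:
    """Hypothetical baseline: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class OneNearestNeighbor:
    """Hypothetical model: predicts the label of the closest training point."""

    def fit(self, X, y):
        self.X_, self.y_ = list(X), list(y)
        return self

    def predict(self, X):
        def closest_label(row):
            # Squared Euclidean distance from `row` to every training point.
            dists = (sum((a - b) ** 2 for a, b in zip(row, t)) for t in self.X_)
            return self.y_[min(enumerate(dists), key=lambda p: p[1])[0]]

        return [closest_label(row) for row in X]


def evaluate(make_model, X, y, k=5):
    """Return (mean accuracy, wall-clock seconds) over k folds.

    The timing covers the whole loop: splitting, fitting, and predicting.
    """
    n = len(X)
    accuracies = []
    start = time.perf_counter()
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # hold out every k-th row
        X_tr = [X[i] for i in range(n) if i not in test_idx]
        y_tr = [y[i] for i in range(n) if i not in test_idx]
        X_te = [X[i] for i in sorted(test_idx)]
        y_te = [y[i] for i in sorted(test_idx)]
        preds = make_model().fit(X_tr, y_tr).predict(X_te)
        accuracies.append(sum(p == t for p, t in zip(preds, y_te)) / len(y_te))
    return sum(accuracies) / k, time.perf_counter() - start


def make_toy_data(n=200, seed=0):
    """Synthetic two-class data: Gaussian blobs centred at (0, 0) and (3, 3)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append((rng.gauss(3 * label, 1), rng.gauss(3 * label, 1)))
        y.append(label)
    return X, y


def compare_models():
    """Score every candidate on the same folds of the same data."""
    X, y = make_toy_data()
    candidates = {"majority": MajorityClass, "1-nn": OneNearestNeighbor}
    return {name: evaluate(cls, X, y) for name, cls in candidates.items()}


if __name__ == "__main__":
    for name, (acc, secs) in compare_models().items():
        print(f"{name:>8}: accuracy={acc:.3f}  time={secs:.4f}s")
```

Reading the accuracy and the time side by side makes the trade-off concrete: the baseline is nearly free but uninformative, while the nearest-neighbour model is more accurate at a higher prediction cost. In practice you would likely swap the toy models for library estimators; scikit-learn's `cross_validate`, for example, reports fit time alongside test scores.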