What is random forest? Why is it good?

1 Answer
MockInterview Staff answered 7 years ago

Random forest? (Intuition):
– Underlying principle: several weak learners combined provide a strong learner
– Builds several decision trees on bootstrapped training samples of data
– On each tree, each time a split is considered, a random sample of m predictors is chosen as split candidates, out of all p predictors
– Rule of thumb: at each split, m = √p
– Predictions: by majority vote over the trees (averaging for regression)
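
The steps above can be sketched in plain Python. This is only an illustrative toy, not a reference implementation: the helper names (gini, best_split, grow_tree, fit_forest, predict_forest), the Gini impurity criterion, the depth limit, and the synthetic data are all assumptions made for the example.

```python
import math
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y, m, rng):
    """Search a random subset of m features for the lowest-impurity split."""
    p = len(X[0])
    best = None  # (weighted impurity, feature index, threshold)
    for j in rng.sample(range(p), m):
        # Skip the largest value so both sides of the split are non-empty.
        for t in sorted({row[j] for row in X})[:-1]:
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t)
    return best


def grow_tree(X, y, m, rng, depth=0, max_depth=5):
    """Grow a tree recursively; a leaf is a class label, a node is a tuple."""
    split = None
    if len(set(y)) > 1 and depth < max_depth:
        split = best_split(X, y, m, rng)  # fresh feature subset at every split
    if split is None:
        return Counter(y).most_common(1)[0][0]
    _, j, t = split
    left = [i for i, row in enumerate(X) if row[j] <= t]
    right = [i for i, row in enumerate(X) if row[j] > t]
    return (
        j,
        t,
        grow_tree([X[i] for i in left], [y[i] for i in left], m, rng, depth + 1, max_depth),
        grow_tree([X[i] for i in right], [y[i] for i in right], m, rng, depth + 1, max_depth),
    )


def predict_tree(node, row):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(node, tuple):
        j, t, left, right = node
        node = left if row[j] <= t else right
    return node


def fit_forest(X, y, n_trees=25, seed=0):
    """Fit each tree on its own bootstrap sample, with m = sqrt(p)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    m = max(1, round(math.sqrt(p)))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample n rows with replacement
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx], m, rng))
    return forest


def predict_forest(forest, row):
    """Majority vote over the trees."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]


# Synthetic data (an assumption): 4 predictors, only the first two matter.
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(4)] for _ in range(200)]
y = [int(row[0] + row[1] > 1.0) for row in X]

forest = fit_forest(X[:150], y[:150])
accuracy = sum(predict_forest(forest, r) == lab for r, lab in zip(X[150:], y[150:])) / 50
print(f"held-out accuracy: {accuracy:.2f}")
```

The class boundary in this toy data is diagonal, so a single axis-aligned split cannot capture it; the vote over many differently built trees approximates it.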
Why is it good?
– Very good performance: choosing a random subset of predictors at each split decorrelates the trees, so averaging them reduces variance
– Can model non-linear class boundaries
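– Generalization error for free: each tree can be evaluated on the out-of-bag observations left out of its bootstrap sample, which gives an approximately unbiased estimate of the generalization error as the trees are built, without separate cross-validation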
– Generates variable importance
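
As a rough illustration of the last two points, here is a minimal sketch using scikit-learn's RandomForestClassifier. It assumes scikit-learn is installed; the synthetic dataset and the parameter values are made up for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (an assumption): 8 predictors, 3 of them informative.
X, y = make_classification(
    n_samples=500,
    n_features=8,
    n_informative=3,
    n_redundant=0,
    random_state=0,
)

forest = RandomForestClassifier(
    n_estimators=200,
    max_features="sqrt",  # m = sqrt(p) candidate predictors per split
    oob_score=True,       # score each row using only trees that did not see it
    random_state=0,
)
forest.fit(X, y)

# Out-of-bag accuracy: an estimate of generalization accuracy without a hold-out set.
oob_accuracy = forest.oob_score_
print(f"OOB accuracy: {oob_accuracy:.3f}")

# Impurity-based variable importances, normalized to sum to 1.
importances = forest.feature_importances_
ranking = sorted(range(X.shape[1]), key=lambda j: importances[j], reverse=True)
for j in ranking:
    print(f"feature {j}: {importances[j]:.3f}")
```

The importances here are impurity-based, which can favor predictors with many distinct values, so they are best read as a rough ranking rather than an exact measure.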
