What is random forest? Why is it good?

1 Answer
MockInterview Staff answered 6 years ago

Random forest? (Intuition):
– Underlying principle: several weak learners combined form a strong learner (here, averaging many high-variance trees reduces the variance of the prediction)
– Builds several decision trees, each on a bootstrapped sample of the training data
– On each tree, each time a split is considered, a random sample of $m$ predictors is chosen as split candidates, out of all $p$ predictors
– Rule of thumb: at each split, $m = \sqrt{p}$
– Predictions: by majority vote across the trees (for regression, the average of the trees' predictions)
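
The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names (`fit_forest`, `predict_forest`, and so on), the Gini split criterion, the midpoint thresholds, the `max_depth=5` cap, and the toy data are all assumptions made for the example. Labels are assumed not to be tuples, since tuples mark internal nodes.

```python
import math
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    """Most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def best_split(X, y, m, rng):
    """Try m randomly chosen features; return (feature, threshold) or None."""
    best_score, best = float("inf"), None
    for j in rng.sample(range(len(X[0])), m):  # random subset of predictors
        vals = sorted(set(row[j] for row in X))
        for a, b in zip(vals, vals[1:]):
            t = (a + b) / 2  # midpoint between neighbouring values
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score:
                best_score, best = score, (j, t)
    return best

def build_tree(X, y, m, rng, depth=0, max_depth=5):
    """A leaf is a bare label; an internal node is (feature, threshold, left, right)."""
    split = None if depth == max_depth or len(set(y)) == 1 else best_split(X, y, m, rng)
    if split is None:
        return majority(y)
    j, t = split
    lo = [(row, lab) for row, lab in zip(X, y) if row[j] <= t]
    hi = [(row, lab) for row, lab in zip(X, y) if row[j] > t]
    return (j, t,
            build_tree([r for r, _ in lo], [l for _, l in lo], m, rng, depth + 1, max_depth),
            build_tree([r for r, _ in hi], [l for _, l in hi], m, rng, depth + 1, max_depth))

def predict_tree(node, row):
    """Walk from the root down to a leaf and return its label."""
    while isinstance(node, tuple):
        j, t, left, right = node
        node = left if row[j] <= t else right
    return node

def fit_forest(X, y, n_trees=25, seed=0):
    """Grow n_trees trees, each on its own bootstrap sample, with m = sqrt(p)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    m = max(1, round(math.sqrt(p)))  # rule of thumb from the list above
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample n rows with replacement
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], m, rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over the individual tree predictions."""
    return majority([predict_tree(tree, row) for tree in forest])

# Toy data (made up for illustration): two well-separated classes, p = 4 predictors.
X = [[c * 5 + i * 0.1 + j * 0.01 for j in range(4)] for c in (0, 1) for i in range(10)]
y = [c for c in (0, 1) for _ in range(10)]
forest = fit_forest(X, y)
print(predict_forest(forest, [0.2] * 4), predict_forest(forest, [5.5] * 4))
```

With p = 4 predictors the rule of thumb gives m = 2, so every split looks at only two randomly chosen features. That restriction, together with the bootstrap resampling, is what makes the trees differ from one another.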
Why is it good?
– Very good performance (the random choice of split candidates decorrelates the trees, so averaging them reduces variance)
– Can model non-linear class boundaries
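– Generalization error for free: each tree is fit on a bootstrap sample, so the observations it never saw (out-of-bag, OOB) act as a built-in validation set; the OOB error gives a nearly unbiased estimate of the generalization error as the trees are built, without a separate cross-validation run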
– Produces variable importance scores as a by-product
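
The out-of-bag estimate can be illustrated with a short, self-contained sketch. To keep it brief, the base learners here are depth-1 stumps on one randomly chosen feature rather than full trees; that simplification, the function names (`fit_stump`, `oob_error`), and the toy data are assumptions made for the example. The bookkeeping is the part that matters: each row is scored only by the learners whose bootstrap sample did not contain it. For large samples, roughly 37% of the rows are left out of any given bootstrap sample.

```python
import random
from collections import Counter

def fit_stump(X, y, j):
    """Best single-threshold split on feature j, chosen by misclassification count."""
    fallback = Counter(y).most_common(1)[0][0]
    best = (len(y) + 1, None, fallback, fallback)  # (errors, threshold, left label, right label)
    vals = sorted(set(row[j] for row in X))
    for a, b in zip(vals, vals[1:]):
        t = (a + b) / 2
        left = [lab for row, lab in zip(X, y) if row[j] <= t]
        right = [lab for row, lab in zip(X, y) if row[j] > t]
        if not left or not right:
            continue
        l_lab = Counter(left).most_common(1)[0][0]
        r_lab = Counter(right).most_common(1)[0][0]
        errors = sum(lab != l_lab for lab in left) + sum(lab != r_lab for lab in right)
        if errors < best[0]:
            best = (errors, t, l_lab, r_lab)
    return (j,) + best[1:]

def predict_stump(stump, row):
    j, t, l_lab, r_lab = stump
    if t is None:  # no usable split was found: predict the majority label
        return l_lab
    return l_lab if row[j] <= t else r_lab

def oob_error(X, y, n_trees=200, seed=0):
    """Error rate when each row is voted on only by learners that never saw it."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    votes = [Counter() for _ in range(n)]
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        in_bag = set(idx)
        stump = fit_stump([X[i] for i in idx], [y[i] for i in idx], rng.randrange(p))
        for i in range(n):
            if i not in in_bag:  # row i is out-of-bag for this learner
                votes[i][predict_stump(stump, X[i])] += 1
    scored = [i for i in range(n) if votes[i]]  # rows that were OOB at least once
    wrong = sum(votes[i].most_common(1)[0][0] != y[i] for i in scored)
    return wrong / len(scored)

# Toy data (made up for illustration): two well-separated classes, 4 predictors.
X = [[c * 5 + i * 0.1 + j * 0.01 for j in range(4)] for c in (0, 1) for i in range(10)]
y = [c for c in (0, 1) for _ in range(10)]
print(oob_error(X, y))
```

Because no row is ever scored by a learner trained on it, the resulting error rate behaves like a held-out estimate even though no data was set aside.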
