Why use feature selection? If two predictors are highly correlated, what is the effect on the coefficients in the logistic regression? What are the confidence intervals of the coefficients?

1 Answer
MockInterview Staff answered 6 years ago

Answer from Analytics Vidhya
While working on a data set, how do you select important variables? Explain your methods.
Answer: You can use the following variable selection methods:

  1. Remove correlated variables before selecting important variables
  2. Use linear regression and select variables based on their p-values
  3. Use Forward Selection, Backward Selection, or Stepwise Selection
  4. Use Random Forest or XGBoost and plot a variable importance chart
  5. Use Lasso Regression
  6. Measure information gain for the available set of features and select the top n features accordingly.
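
As a rough illustration of methods 1 and 6 above, here is a minimal sketch in plain Python. It is not taken from the original answer: the helper names (`drop_correlated`, `information_gain`, `top_n_by_gain`), the 0.9 correlation threshold, and the toy data are all assumptions made for the example. In practice you would more likely use a library such as pandas or scikit-learn.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_correlated(columns, threshold=0.9):
    """Method 1: keep a feature only if its |r| with every
    already-kept feature is at most `threshold` (assumed cutoff)."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Method 6: label entropy minus the weighted label entropy
    after splitting on a categorical feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def top_n_by_gain(columns, labels, n):
    """Rank categorical features by information gain and keep the top n."""
    gains = {name: information_gain(vals, labels) for name, vals in columns.items()}
    return sorted(gains, key=gains.get, reverse=True)[:n]


# Toy data (hypothetical): height_in is just height_cm in other units.
numeric = {
    "height_cm": [150, 160, 170, 180, 190],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [30, 25, 41, 22, 35],
}
labels = [1, 1, 0, 0, 1, 0]
categorical = {
    "region": ["n", "s", "n", "s", "n", "s"],
    "plan": ["a", "a", "b", "b", "a", "b"],
}

print(drop_correlated(numeric))
print(top_n_by_gain(categorical, labels, 1))
```

In this toy data the redundant `height_in` column is dropped because it is almost perfectly correlated with `height_cm`, and `plan` ranks above `region` because it separates the labels completely. The greedy filter keeps whichever of two correlated features appears first, so the column order matters.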
