What is better: good data or good models? And how do you define “good”? Is there a universal good model? Are there any models that are definitely not so good?

1 Answer
MockInterview Staff answered 5 years ago
  • Good data is more important than good models
  • If data quality did not matter, organizations would not spend so much time cleaning and preprocessing their data
  • Even for scientific purposes, good data (reflected in the design of experiments) is very important

How do you define good?
– good data: data relevant to the project/task at hand
– good model: a model relevant to the project/task
– good model: a model that generalizes to external data sets
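One way to make "generalizes to external data sets" concrete is to compare the error on the training data with the error on data the model has never seen. The sketch below is only an illustration under stated assumptions: the toy data (a line y = 2x + 1 with a fixed ±1 perturbation on the training labels) and the helper names fit_line, fit_memorizer and mse are made up for this example and use only the Python standard library.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a*x + b; returns a prediction function."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def fit_memorizer(xs, ys):
    """1-nearest-neighbour lookup: predicts the label of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared error of a prediction function on (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical toy data: the true relation is y = 2x + 1.
# Training labels carry a deterministic +1/-1 perturbation standing in for noise.
train_x = [float(i) for i in range(20)]
train_y = [2 * x + 1 + (1 if i % 2 == 0 else -1) for i, x in enumerate(train_x)]

# "External" data: new x values, labels from the true relation.
test_x = [i + 0.25 for i in range(19)]
test_y = [2 * x + 1 for x in test_x]

line = fit_line(train_x, train_y)
memorizer = fit_memorizer(train_x, train_y)

for name, model in [("line", line), ("memorizer", memorizer)]:
    print(name, "train MSE:", mse(model, train_x, train_y),
          "external MSE:", mse(model, test_x, test_y))
```

The memorizer reproduces the training labels exactly, so its training error is zero, yet it does worse than the simple line on the external points. By the definition above, the line is the "good" model here even though it fits the training data less closely.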
Is there a universal good model?
– No. If there were one, overfitting would not be a problem
– An algorithm can be universal, but a fitted model cannot
– A model built on a specific data set in a specific organization can be ineffective on another data set from the same organization
– Models have to be updated on a somewhat regular basis
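The point about updating models can be sketched as a simple monitoring rule: keep the error measured when the model was built, and flag the model once its error on recent data drifts well above that baseline. This is a minimal sketch, not a standard procedure; the function name needs_retraining and the tolerance factor of 1.5 are assumptions chosen for illustration.

```python
def needs_retraining(baseline_mse, recent_squared_errors, tolerance=1.5):
    """Return True when the recent mean squared error exceeds tolerance * baseline_mse.

    baseline_mse: error measured on held-out data when the model was built.
    recent_squared_errors: squared prediction errors on newly observed data.
    tolerance: hypothetical factor; how much degradation is accepted.
    """
    if not recent_squared_errors:
        return False  # nothing observed yet, so no evidence of degradation
    recent_mse = sum(recent_squared_errors) / len(recent_squared_errors)
    return recent_mse > tolerance * baseline_mse


# Errors close to the baseline: keep the model.
print(needs_retraining(1.0, [0.9, 1.1, 1.2]))   # False
# Errors well above the baseline: time to update the model.
print(needs_retraining(1.0, [2.0, 2.5, 3.0]))   # True
```

In practice the threshold and the window of recent data depend on the task; the rule only makes explicit that "somewhat regular" updates can be triggered by measured degradation rather than by the calendar alone.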
Are there any models that are definitely not so good?
– “All models are wrong, but some are useful” (George E. P. Box)
– It depends on what you want from the model: predictive power or explanatory power
– If a model is bad at both, it is a bad model
