Is it better to design robust or accurate algorithms?

MockInterview Staff asked 10 months ago
3 Answers
MockInterview Staff answered 10 months ago
  • The ultimate goal is to design systems with good generalization capacity, that is, systems that correctly identify patterns in data instances they have not seen before.
  • The generalization performance of a learning system depends strongly on the complexity of the assumed model.
  • If the model is too simple, the system captures the actual regularities in the data only roughly. In this case, the system generalizes poorly and is said to suffer from underfitting.
  • By contrast, when the model is too complex, the system can pick up accidental patterns in the training data that need not be present in the test set. These spurious patterns can result from random fluctuations or from measurement errors during data collection. In this case, the generalization capacity of the learning system is also poor, and the system is said to be affected by overfitting.
  • Spurious patterns, which are present in the data only by accident, tend to have complex forms. This is the idea behind using Occam’s razor to avoid overfitting: prefer the simpler model unless a more complex one significantly improves the description of the observations.
  • Quick response: Occam’s razor. It depends on the learning task. Choose the right balance between the two.
  • Ensemble learning can help balance bias and variance (several weak learners combined can form a strong learner).
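
The underfitting/overfitting trade-off described in the bullets above can be illustrated with a small experiment. This is only a sketch under assumptions that are not part of the answer: the data come from a made-up noisy sine curve, model complexity is the degree of a polynomial fitted with NumPy, and the names `make_data` and `fit_and_score` are hypothetical helpers.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Assumed ground truth for illustration: a sine curve plus Gaussian noise.
    x = rng.uniform(0.0, 1.0, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, n)
    return x, y

x_train, y_train = make_data(15)    # small training set
x_test, y_test = make_data(200)     # unseen data from the same process

def fit_and_score(degree):
    # Least-squares polynomial fit on the training set only.
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_mse, test_mse

# Degree 1 is a very simple model, degree 9 a very flexible one.
errors = {degree: fit_and_score(degree) for degree in (1, 3, 9)}
for degree, (train_mse, test_mse) in errors.items():
    print(f"degree={degree}: train MSE={train_mse:.3f}, test MSE={test_mse:.3f}")
```

Training error can only go down as the degree grows, so it says little about generalization on its own. The test error is the number to watch: a straight line is typically too rigid for this curve, while a high-degree polynomial fitted to 15 points tends to chase the noise.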


MockInterview Staff answered 10 months ago

Is it better to spend 5 days developing a 90% accurate solution, or 10 days for 100% accuracy? It depends on the context.

  • “Premature optimization is the root of all evil.”
  • At the beginning, a quick-and-dirty model is better.
  • Optimize later.

Other answer:
– Depends on the context
– Is error acceptable? Consider domains such as fraud detection and quality assurance, where mistakes are costly
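
One rough way to make "depends on the context" concrete for the 5-day versus 10-day question is to compare development cost against the expected cost of wrong predictions. The sketch below is hypothetical throughout: `total_cost`, `cheaper_option`, the daily cost, the number of cases, and the cost per error are all made-up values for illustration, not figures from the answers.

```python
def total_cost(dev_days, error_rate, n_cases, cost_per_error, cost_per_day):
    """Development cost plus the expected cost of wrong predictions."""
    return dev_days * cost_per_day + error_rate * n_cases * cost_per_error

def cheaper_option(cost_per_error, n_cases=10_000, cost_per_day=1_000):
    # "quick": 5 days of work, 90% accurate (10% of cases are wrong).
    quick = total_cost(5, 0.10, n_cases, cost_per_error, cost_per_day)
    # "thorough": 10 days of work, 100% accurate (no wrong cases).
    thorough = total_cost(10, 0.00, n_cases, cost_per_error, cost_per_day)
    return "quick" if quick <= thorough else "thorough"

# Cheap mistakes (e.g. exploratory grouping) versus expensive mistakes
# (e.g. a missed fraud case); both values are assumptions.
print(cheaper_option(cost_per_error=1))
print(cheaper_option(cost_per_error=50))
```

With these assumed numbers the quick model wins only while each error is cheap. Once a single mistake costs more than a few units, the extra five days pay for themselves.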

Kumar answered 2 months ago

It depends on the type of project. I would prefer to build a model first and then tune it according to the requirements. For example, if you are building a model for a financial client, such as one for credit card fraud detection, your model should be as accurate and robust as possible, so in such cases it's better to take more time and improve the model's performance. If instead you are building a model to classify data into groups, I prefer to build a model as quickly as possible to get an idea of the patterns hidden in the data.
