Data Analyst vs. Data Scientist vs. Data Engineer:


There is a lot of confusion in the industry about these three roles, for two reasons:

  1. Some of these roles are relatively new, so there isn’t a clear consensus in the industry about what distinguishes them (it doesn’t help that their responsibilities overlap).
  2. Small and mid-size startups and companies often don’t have the data maturity, or the need, to pay for all three roles, so there might be just one person doing it all! (That is perfectly fine, but as a candidate you need to know what you are getting into.)

Even with this confusion, as a candidate applying for these roles, you need to make sure that your skills align with the position the employer is looking to fill. Also, don’t just look at the advertised job title. Even if it says “Data Scientist”, sometimes the employer is actually looking for a Sr. Data Analyst, and you need to figure that out.

So, how do you figure out what the employer is looking for? In this blog, we’ll share a framework that could help you dig deeper. Also, don’t be afraid to ask your hiring manager these questions for further clarification.

Framework:

You need to figure out what questions the business is asking. They can be categorized into three buckets:

  1. What
  2. Why
  3. What’s next

Now let’s map each of these questions to a role.

Data Engineer:

  • What:

What are my sales numbers for this quarter?

What is the profit for this year to date?

What are my sales numbers over the past 6 months?

What did sales look like in the same quarter last year?

All of these questions report on facts. A lot of them can be answered with manual data pulls by a Jr. Analyst, but most organizations want to automate this and put a self-service platform in place. Data engineers are called upon to automate data pipelines and build a central location that houses all the data needed to answer “what” questions. This central location also becomes the go-to place for data analysts and data scientists to query the data they need.

Tools used: ETL tools, Hadoop, Spark, Python, SQL
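
To make the “what” bucket concrete, here is a minimal sketch of the kind of work described above: load raw records into one central table, then answer a reporting question with a query. It is only an illustration, not a production pipeline. The records, the table layout, and the function names (quarter_of, build_sales_table, sales_by_quarter) are all made up for this example, and an in-memory SQLite database stands in for a real data warehouse.

```python
import sqlite3

# Hypothetical raw records: (order_date, amount). Made-up illustration data.
RAW_SALES = [
    ("2023-01-15", 1200.0),
    ("2023-02-20", 800.0),
    ("2023-04-05", 1500.0),
    ("2023-05-17", 700.0),
    ("2023-06-30", 300.0),
]


def quarter_of(iso_date):
    """Return a label such as '2023-Q2' for a 'YYYY-MM-DD' date string."""
    year, month, _day = iso_date.split("-")
    return f"{year}-Q{(int(month) - 1) // 3 + 1}"


def build_sales_table(rows):
    """Load raw rows into one central table, adding a derived quarter column."""
    conn = sqlite3.connect(":memory:")  # stand-in for a real warehouse
    conn.execute(
        "CREATE TABLE sales (order_date TEXT, quarter TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [(date, quarter_of(date), amount) for date, amount in rows],
    )
    conn.commit()
    return conn


def sales_by_quarter(conn):
    """Answer a 'what' question: total sales for each quarter."""
    cursor = conn.execute(
        "SELECT quarter, SUM(amount) FROM sales GROUP BY quarter ORDER BY quarter"
    )
    return dict(cursor.fetchall())


if __name__ == "__main__":
    connection = build_sales_table(RAW_SALES)
    for quarter, total in sales_by_quarter(connection).items():
        print(quarter, total)
```

Once a table like this exists, anyone who knows a little SQL can answer “what” questions on their own, which is the self-service idea in a nutshell.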

Data Analyst:

  • Why:

Why are my sales numbers higher this quarter compared to last quarter?

Why are we seeing an increase in sales over the past 6 months?

Why are we seeing a decrease in profit over the past 6 months?

Why is the profit this quarter lower than in the same quarter last year?

All of these questions try to figure out why something happened. A data analyst typically takes a stab at this. He or she might use the existing platform built by data engineers to pull data and may also merge in other data sets. The analyst then applies data analysis techniques to the data to answer the “why” question and help a business user get to an actionable insight.

Tools used: SQL, Excel, Tableau, R/Python (Basic)
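
One common way to start on a “why” question is to break a change down by segment and see which segment moved the most. The sketch below does that for a quarter-over-quarter change in sales. Treat it as a rough illustration only: the regions, the amounts, and the function name change_by_segment are invented for this example, and a real analysis would usually look at more than one dimension.

```python
from collections import defaultdict

# Hypothetical records: (quarter, region, amount). Made-up illustration data.
SALES = [
    ("Q1", "North", 1000.0),
    ("Q1", "South", 1000.0),
    ("Q2", "North", 1100.0),
    ("Q2", "South", 1600.0),
]


def change_by_segment(rows, previous, current):
    """Return each segment's change in sales between two quarters.

    Segments are ordered by the size of their change, largest first,
    so the biggest drivers of the overall change appear at the top.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for quarter, segment, amount in rows:
        totals[segment][quarter] += amount

    changes = {
        segment: by_quarter[current] - by_quarter[previous]
        for segment, by_quarter in totals.items()
    }
    ranked = sorted(changes.items(), key=lambda item: abs(item[1]), reverse=True)
    return dict(ranked)


if __name__ == "__main__":
    for segment, delta in change_by_segment(SALES, "Q1", "Q2").items():
        print(f"{segment}: {delta:+.1f}")
```

In this made-up data, most of the increase comes from one region, which tells the analyst where to dig next.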

Data Scientist:

  • What’s next:

What will be my sales forecast for next year?

What will be our profit next year for Scenario A, B & C?

Which customers will cancel/churn next quarter?

Which new customers will convert to a high-value customer?

All of these questions try to “predict” what will happen next (based on historical data and patterns). Sometimes you don’t know the questions in the first place, so there’s a lot of proactive thinking going on, and usually a data scientist is the one doing it. Sometimes you start with a high-level business problem and create hypotheses to drive your analysis. All of this can be classified under “data science”.

Tools used: R/Python (advanced), plus the tools used by a Data Analyst
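
As a very small taste of a “what’s next” question, the sketch below fits a straight-line trend to past sales with ordinary least squares and extends it one period ahead. This is a deliberately simple stand-in: real forecasting work usually involves richer models, seasonality, and validation. The history values and the function names (fit_line, forecast) are hypothetical and only serve the example.

```python
def fit_line(values):
    """Fit y = intercept + slope * x by least squares, with x = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(values, periods_ahead=1):
    """Extend the fitted trend line past the last observed period."""
    slope, intercept = fit_line(values)
    next_x = len(values) - 1 + periods_ahead
    return intercept + slope * next_x


if __name__ == "__main__":
    # Hypothetical sales for the last four periods. Made-up illustration data.
    history = [100.0, 110.0, 120.0, 130.0]
    print(forecast(history))
```

The point is not the model itself but the shape of the work: start from historical data, fit something to it, and use that to say what is likely to come next.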

Conclusion:

We hope that this post helps you distinguish between these roles and figure out which role is the best fit for your skills. [Source: Paras Doshi’s Blog (Insight Extractor)]

Once you are ready to apply for Data Analyst and Data Scientist roles, sign up for MockInterview.co to start practicing!