Data Scientist or Data Analyst?
Any industry specialist involved in data science and machine learning knows that there is a wide disconnect between the true meaning of those terms and society’s general understanding of them. While the typical employee shouldn’t be expected to differentiate between data science and machine learning, one would expect companies that tout “AI”-enabled programs, or programs utilizing “ML,” to be able to tell these terms apart. Unfortunately, it seems that many of these companies hold the same misunderstanding as the general population. So how do we determine what qualifies as data science, machine learning or data analysis?
Defining Data Science
Industry specialists recognize the disconnect between terms, both in academia and business. MIT published an article titled “The Role of Academia in Data Science Education.” In this article, the authors propose that “data science is not a discipline but rather an umbrella term used to describe a complex process involving not one data scientist possessing all the necessary expertise, but a team of data scientists with nonoverlapping complementary skills.” Reading further, it becomes clear that the term “data science” has largely been pushed along by the tech industry.
One term the article discusses for the business use of “data science” is “data wrangling, the ability to ‘bring structure to large quantities of formless data and make analysis possible.’” If you are familiar with data science, this definition is a close match for the business use case of a “data scientist.” While database engineers prepare the data pipelines that stream data for analysis, the data scientist figures out what to make of the data and then presents the findings. In fact, the article argues that the term “data scientist” became common among recruiters as a way to attract the right graduates - since being a “statistician” did not mean you had handled large data sets or written production-level code, and being a “computer scientist” didn’t necessarily mean you had processed data or queried databases.
How do we define data science? According to the article, MIT first makes the “one big distinction between backend and frontend data science.” “Backend” refers to “the [roles] that deal with hardware, efficient computing, and data storage infrastructure, or what is often referred to as data engineering.” They then define the “frontend” as the “tasks [sic] performed by data analysts and business intelligence engineers.” Data analysts and BI engineers “wrangle, explore, quality-assess, fit models to data, perform statistical inference, and develop prototypes.” Machine learning engineers “build and assess prediction algorithms and make the solution scalable and robust for many users.”
It’s clear that the general term “data scientist” can cover many different roles, so it’s important to tailor each job description to exactly what you want your data scientist to be doing. Here is how AdaptiLab differentiates its tests for these various roles. First, our coding challenges assess five core competencies of data science and machine learning: Machine Learning Concepts, Data Processing, Data Analysis, SQL and General Programming.
While we test for all five competencies regardless of the specific domain, the ratio of questions pertaining to each one differs depending on the position you are interviewing for. Our Machine Learning Engineer assessments focus primarily on General Programming, Data Processing and Machine Learning Concepts. Our Data Science assessments emphasize Data Processing, Data Analysis and Machine Learning Concepts. Our Data Analyst challenges focus heavily on SQL and Data Analysis. Finally, our Software Engineer assessments focus primarily on General Programming and SQL.
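As a rough illustration, the role-to-competency emphasis described above can be written down as a small lookup table. This is only a sketch: the names `COMPETENCIES`, `ROLE_EMPHASIS` and `emphasis_for` are hypothetical and do not reflect how AdaptiLab actually configures its assessments. It also records only which competencies are emphasized, because the post does not state the exact question ratios.

```python
# The five core competencies named in the post.
COMPETENCIES = (
    "Machine Learning Concepts",
    "Data Processing",
    "Data Analysis",
    "SQL",
    "General Programming",
)

# Emphasized competencies per role, taken from the paragraph above.
# The structure and names are illustrative assumptions, not a real configuration.
ROLE_EMPHASIS = {
    "Machine Learning Engineer": {
        "General Programming", "Data Processing", "Machine Learning Concepts",
    },
    "Data Scientist": {
        "Data Processing", "Data Analysis", "Machine Learning Concepts",
    },
    "Data Analyst": {"SQL", "Data Analysis"},
    "Software Engineer": {"General Programming", "SQL"},
}


def emphasis_for(role):
    """Split the five competencies into emphasized and remaining ones for a role."""
    emphasized = ROLE_EMPHASIS[role]
    primary = [c for c in COMPETENCIES if c in emphasized]
    secondary = [c for c in COMPETENCIES if c not in emphasized]
    return primary, secondary


if __name__ == "__main__":
    for role in ROLE_EMPHASIS:
        primary, secondary = emphasis_for(role)
        print(f"{role}: focus on {', '.join(primary)}; also covers {', '.join(secondary)}")
```

A table like this could also serve as a checklist when drafting a job posting: pick the row closest to the work you need done, then name those competencies explicitly in the description.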
We hope this post starts the conversation in your organization about how you can organize your “data science” job postings. If you would like to dive deeper into your business needs or learn more about the AdaptiLab challenges, please reach out here, and our team will be happy to talk more with you.