Data Science interview preparation
Your Homework, Behavioral Interview questions, technical knowledge, the final round, and some stuff you should brush up on.
Table of Contents
Frequently Asked Questions (FAQ)
Introduction
Your Homework
Behavioral Interview questions
Technical knowledge
Final interview questions
Conclusion
This is part 3 of the Hiring series; you can see the other parts below:
Note: The data analytics interviews are very similar to this. Just take out some of the more difficult technical interview questions.
Frequently Asked Questions (FAQ)
1) What should I focus on when preparing for a data science job interview?
Data science interviews try to evaluate your expertise and capabilities in data science, so you should thoroughly study data science concepts and practice answering related questions.
Key areas to concentrate on are: data mining, machine learning, big data platforms like Hadoop and Spark, probability and statistics, SQL and NoSQL databases, cloud services such as AWS and Azure, and programming languages like Python and R. Additionally, honing your skills through coding practice is highly beneficial.
2) How should I present myself in a data science interview?
Begin with a brief introduction of yourself and a quick summary of your experience. Then proceed to discuss your resume, your knowledge about their company and industry, and the minor details in your resume.
Keep in mind that if an interviewer opens the conversation by asking you to introduce yourself, it generally indicates they have yet to review your resume.
3) What types of questions can I expect during data science interviews?
You’ll be questioned about your familiarity with data mining, predictive modeling, and statistical analysis, along with software such as SAS, SPSS, R, and MATLAB. Be ready to discuss the projects you've done and your approach to each.
Expect problem-solving and critical thinking questions as well, such as discussing a business problem and your solution strategy. You may be asked to explain a specific data analysis or machine learning algorithm.
Lastly, anticipate basic technical questions regarding Python, R, or SQL.
4) How do I get ready for the final round of the job interview?
The final round usually involves a more informal chat with the data science team's manager. The manager will assess your fit for the role and get a sense of you as a person. If you've reached this stage, you've effectively demonstrated your abilities, and the team leader is primarily interested in understanding who they'll be collaborating with in the years ahead. Treat this final round as a behavioral interview with a hint of technical skill evaluation.
The team leader aims to understand your thought process. Possible questions may include:
Tell me about yourself
Your strengths and weaknesses
Why do you want to work for our company?
How do you handle an overwhelming workload?
Why are you interested in data science?
What expectations do you have of us/me as a manager?
If you excel in this interview (emerging as the top candidate), the team leader will confirm your email address, and you can expect an official job offer letter within about 3 hours. If you were good but not the top choice, expect the job offer letter within about 24 hours.
5) How do I ready myself for the second round of the interview?
The technical round of the data scientist interview tests your ability to solve intricate data problems. You might be asked to detail your problem-solving approach, alongside live coding in a language like Python or R.
To gear up for this data science live coding interview, ensure you're comfortable with the basics of data analysis and machine learning. Familiarity with common algorithms and libraries used in data science is a must. Practicing coding solutions to complex data problems is beneficial. Look at platforms like Quora for practice interview questions. Also, you may find it helpful to use a substack to review any topics you want to brush up on, including data manipulation.
This post was created with the help of the BowtiedBrothers, check them out here
Introduction
So you finally got that email back for the data science job you applied to a couple weeks ago saying they’d like to set up an interview?
Congratulations!
That in and of itself is an accomplishment that you should be proud of, but now it’s time to get to work and get ready to show off your best self to the interviewers. After personally going through this process for the past 4 months, I wanted to share some advice on things you can expect during the data scientist interview process and how to start preparing for it.
Just as for most tech jobs, you’ll need to go through multiple rounds of behavioral and technical interviews. The most consistency you’ll find will be within the behavioral interviews, as there’s only a limited number of ways to ask “tell me about yourself”. Technical interviews, on the other hand, can vary widely in what they entail from company to company. That said, I’m going to give a broad overview that will help you take a step in the right direction in setting yourself up for success.
***DISCLAIMER: Because “Data Scientist” is still a catch-all term, there isn’t a standard set of job responsibilities. This is just a rough outline of what you COULD see in the interview process. During your phone screening with HR, feel free to ask “What does the interview process for this position look like?” to get a better idea of what to expect.***
You can click here to learn more about the many different data scientist roles in the data science industry.
Your homework
Before you go to your interview, you'll want to do some proper data science interview preparation. This preparation includes studying the company, the industry, and your resume to figure out why you were called in. It will also tell you which topics to brush up on.
Research the Company
Start off by spending some time looking at the job description. Did they mention any specific libraries there on purpose, or was it more of a generic job description that anyone could've made up? If they actually mentioned specific libraries, you know why that is....
Look at the above job posting: they make it extremely clear that the focus for this position will be on neural networks and deep learning, with a slight focus on some machine learning models. If you are spending a lot of your time learning the many different ways to perform linear regression, you are wasting precious time.
Glassdoor Interview questions
Glassdoor is a great resource because it shows you some of the most common interview questions that specific company will ask. This is great for acing the behavioral section, without spending too much time actually taking the interview seriously yet. Just hop on over to Glassdoor, and take a quick note of all the behavioral related questions, and just have a decent answer ready for them.
Here is an example of what you get for Facebook's (Meta) data scientist interview questions
Their advantage over the competitors
Remember this and remember it well: a business either has a competitive advantage over others in its industry, which makes its customers want to keep repurchasing from it, or it will get devoured by its competition. Figure out quickly what this company does well that its competitors don't... then sit back and wait until the final round interview, and if you make it there, drop that coup de grace on them and blow your entire competition away.
Research the Industry
Something that will help you out a lot in your interview prep stage is if you can figure out what questions you'll be asked before they ask you them.... Here's how you can find out what you'll be asked:
Go to Google, type in the industry the company works in followed by “data science”, and spend some time reading how others in that industry are using data science and what sort of problems they are trying to solve. What you are looking for are topics that show up very frequently. For example, if you are interviewing at a hedge fund, others in the finance industry use data science for time series analysis, so you will probably be grilled on your time series knowledge, your technical knowledge of several machine learning algorithms, and the biases associated with time series analysis.
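One bias that comes up with time series is look-ahead bias, where a model is accidentally trained on data from after the period it is tested on. Here is a minimal sketch of a chronological train/test split that avoids it. The function name, the 80/20 fraction, and the daily price rows are all hypothetical and only meant as illustration:

```python
from datetime import date, timedelta


def chronological_split(rows, train_fraction=0.8):
    """Split (timestamp, value) rows by time instead of shuffling them."""
    # Sort by timestamp so the test set is strictly later than the training set.
    ordered = sorted(rows, key=lambda row: row[0])
    cutoff = int(len(ordered) * train_fraction)
    return ordered[:cutoff], ordered[cutoff:]


# Hypothetical daily prices, made up for this example.
start = date(2024, 1, 1)
prices = [(start + timedelta(days=i), 100.0 + i) for i in range(10)]

train, test = chronological_split(prices)
print(len(train), len(test))
print(train[-1][0] < test[0][0])
```

A random shuffle before splitting would let future observations leak into training, which is the kind of mistake an interviewer may probe for.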
Here's another example: if you are interviewing in an industry that uses data science for computer vision, you will probably be grilled on your knowledge of deep learning and neural networks, and on whether you can read the schematic of a model and recreate it in TensorFlow or Keras. You'll also probably be grilled on some linear algebra and dimensionality reduction.
You get the idea, figure out how others in the same industry use data science, and what problems they are trying to solve.
What is the current hot topic?
Another thing to study is the current problem that everyone in the industry is trying to solve. If you are called in for a research or revenue-generating role and you can tell them what their competitors are doing, the results they got, and how you did something similar yourself... congrats, you immediately slot yourself among the top candidates.
What could they potentially be interested in using data science for?
If you have quite a lot of time to spare, spend some of it researching the company and going through its news to see what projects it has already done and what projects competitors in its industry have already done. A quick comparison of the two will give you a solid idea of what projects they'll most likely send your way.
Study Your Resume
Now that you know how they plan on using your data scientist skillset and have a solid idea of what sort of data scientist interview questions you'll be asked, spend some time studying your own resume as well. Which keywords did you put in there that match the skills they are looking for? Congrats, you just figured out what questions they'll ask you during the technical interview.
Data Science interview questions
Behavioral Interviews
As with all types of behavioral interviews regardless of the job: Tell your story, paint a picture for the interviewer as to why you’re the perfect fit for the role.
These interviews will consist more of a discussion of the company and the role specifically, followed by questions about your background and interest in the role, and then most likely basic questions along the lines of “tell me about a time you needed to work with a team to solve a difficult problem.”
Talking Points
To prep for these interviews you’ll need to get your personal story straight. By this I mean having a list in your head of concise talking points that relate to either the question asked or to data science in general.
The best talking points will come from:
Internship or Co-op experience
Personal Projects
University Coursework or Research
For any of these you will want to discuss the overall goal of the project, the scope and timeframe, challenges or anything you needed to alter along the way, any final results and/or any next steps that would have led to improvement, and (if possible) how the results were used (ex: sales increased 20% upon deployment of model).
Personal Interest
The interviewer will most likely ask why you want this role specifically. A good approach would be to discuss your overarching professional goals and how they tie into both the job and the company (ex: “over the next 5 years I would like to gain in-depth industry knowledge while also being exposed to leadership opportunities that will assist in my personal growth. After researching your company and this role, I decided this would be exactly what I’m looking for.”).
For a more specific answer, tie in your previous experiences and personal interests, which could include highlighting job responsibilities you really enjoyed or excelled at, or even something you didn’t like. For example, if you were previously a data analyst and wanted to pivot into data science, you could say that you felt limited by the work you were doing and wanted to do something more technically intensive.
In terms of a personal interest in data science, this could include:
Desire to use data/machine learning/statistics to solve challenging business problems
Interest in software/technology
Company Specific stuff
Another good point to emphasize can be the desire to work for a company at a particular size. If the company is smaller or a startup, you can emphasize how you want to be part of a rapidly growing organization where you'll be able to have a larger impact (not another cog in the machine). On the other hand if the company is large, you can point out their plethora of professional resources and how you want to be part of an industry leader.
There are a multitude of ways to frame why you would be the perfect fit for them, so get creative.
At the end of the interview they may ask if you have any questions for them. This is the perfect time to show off some of your expertise and that you’ve been listening to them by asking something unique to that company. It could be something about their industry, how their departments or teams are organized, any interesting clients that were brought up, etc.
Note: Be sure to tailor your questions to something the interviewer can answer (i.e. a manager may be better suited to answering a higher level question while a senior data scientist could answer a technical question)
Your questions to them
If you are having trouble thinking of anything specific to ask, here are some common simple questions:
Tell me about your story, where you started and why you chose to come here?
What are some benefits about working here? Any challenges?
Where do you see yourself in the next 3-5 years?
My final go-to question to ask is some variation of:
“Is there anything else you’d like to know about me or any concerns that you have that would prevent me from moving forward that I can clarify?”
This will give you one more chance to clear up anything that they may still be wondering about and address any concerns.
Once you nail getting your story down to the point where you can give clear and concise answers about your experiences and background, it’s time to move on to studying for the technical sections of the interview process.
Technical knowledge
Depending on your background experience, this can be the most daunting portion of the interview process and the part you will want to prepare most for. Here are the most common things you can expect:
Probability and Statistics Analysis
First and foremost, they’ll want to know how strong of a grasp you have on probability and statistics. To test this, you may need to answer a series of questions that can range from basic problems to more advanced statistical concepts. The best way I’ve found to study for this is to start by going through a list of as many probability/statistics vocabulary words as I can, defining them, and describing how/why the topic is used from memory. Here’s a good list to get you started.
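As one illustration of the kind of probability problem that can come up, here is a sketch of a classic Bayes' theorem question: given a positive test result, how likely is it that the condition is actually present? The 1% base rate, 95% sensitivity, and 5% false positive rate are made-up numbers, and the function name is hypothetical:

```python
from fractions import Fraction


def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem."""
    true_positive = sensitivity * prior
    false_positive = false_positive_rate * (1 - prior)
    return true_positive / (true_positive + false_positive)


# Made-up numbers: 1% base rate, 95% sensitivity, 5% false positive rate.
posterior = posterior_given_positive(
    prior=Fraction(1, 100),
    sensitivity=Fraction(95, 100),
    false_positive_rate=Fraction(5, 100),
)
print(posterior, float(posterior))
```

With these inputs the posterior is 19/118, or roughly 16%, far lower than most people guess. Being able to explain why (the low base rate dominates) matters more than the arithmetic.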
From there I then work through example problems that apply these concepts. Sample questions are very easy to find online, but here’s a good primer that also includes some ML related questions:
What is machine learning?
What are the differences between supervised and unsupervised learning?
What are the differences between inference and prediction?
What is the data analysis process?
How do you handle missing data?
What is a neural network?
What is a deep learning network?
How do you train a neural network?
What is backpropagation?
What is the difference between regression and classification?
What are the differences between a tree-based approach (random forest) and a distance-based approach (K-NN)?
How do you evaluate the performance of a machine learning model?
What are some of the biggest challenges facing machine learning today?
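For the model evaluation question in the list above, it helps to be able to compute the basic classification metrics by hand. Here is a minimal sketch in plain Python; the function names and the labels are made up for illustration, and in practice you would likely reach for a library:

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1), using 0.0 when a ratio is undefined."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up labels: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

A good follow-up to rehearse is when accuracy alone is misleading, for example on heavily imbalanced classes.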
Solve real world problems
Here is where you will want to showcase your data analysis skills, how well you read the job description, and your domain knowledge of this specific industry. Depending on the domain of the company, you'll be asked about linear regression, natural language processing, linear algebra, or (with heavy emphasis) machine learning algorithms. Sometimes they can be lazy and just chuck a data set at you, then judge the quality of the predictions you submit.
A lot of times the company will give you a case study to walk them through so they can get a feel for your thought process. This problem will almost certainly be related to the company you’re interviewing with, so to begin studying you will want to research what the company specifically does as well as the industry it operates in. For example, if the industry is insurance you’ll want to research what machine learning models are commonly used, any compliance issues that need to be taken into account, and the processes that you potentially could be involved with (such as fraud prevention or claims processing).
The case study itself usually starts with a presentation of background information about the case (be sure to take notes) and then a series of questions related to the problem. What they want to see is your thought process in the approach you take to solving them so be sure to tell them not only what steps you would take but why you are taking them. These questions can involve anything ranging from statistics, machine learning, business specific topics, to even technical deep dives.
Basic SQL Query / Python or R technical skills
Some companies will require you to take a Python or SQL test (sometimes both). Either type could be done in an online code editor while being monitored by the interviewer. For the SQL test specifically, they will most likely have you create queries looking for specific sets of data. You should already have experience using SQL but if you want a refresher, this is what I always go to for simple examples. You should also brush up on some data structures that are available in Python.
Here is a SQL test we have available on the substack, you can find the answers here
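To give a feel for the kind of query a SQL test might ask for, here is a small sketch that runs against an in-memory SQLite database from Python. The orders table, its columns, and the rows are hypothetical. The query asks for customers who have spent at least 100 in total, which exercises GROUP BY, HAVING, and ORDER BY:

```python
import sqlite3

# In-memory database with a hypothetical orders table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [
        ("alice", 120.0),
        ("bob", 35.5),
        ("alice", 80.0),
        ("carol", 15.0),
        ("bob", 64.5),
    ],
)

# Aggregate per customer, keep only those with total spend of at least 100.
query = """
SELECT customer, COUNT(*) AS n_orders, SUM(amount) AS total_spent
FROM orders
GROUP BY customer
HAVING SUM(amount) >= 100
ORDER BY total_spent DESC
"""
top_customers = conn.execute(query).fetchall()
conn.close()

print(top_customers)
```

Be ready to explain the difference between WHERE (filters rows before grouping) and HAVING (filters groups after aggregation), since that distinction is a common follow-up.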
For a Python test, there are a couple things they could have you do. First, have you perform some data wrangling to clean up an example dataset. Second, ask you to analyze data, which may include the use of packages like NumPy or Pandas, and give some high-level analysis of it. Third, they may have you create some form of data visualization for a data set and have you describe any potential trends. From there they could have more niche questions involving what sort of ML models could be used for this type of data and any pros or cons, but that will be dependent on the company and position. Once you are finished with this, you'll be moving onto the next data science interview, which will be the final round with the team lead.
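As a rough idea of the data wrangling step, here is a sketch that cleans a small messy dataset and produces a simple summary. It uses only the standard library so it stays self-contained; in a real test you would more likely use Pandas. The dataset, column names, and the choice of median imputation are assumptions made for this example:

```python
import csv
import io
import statistics
from collections import defaultdict

# Hypothetical messy input: stray whitespace, inconsistent casing, bad ages.
RAW_CSV = """name,age,city
 Alice ,34,Boston
Bob,,Chicago
carol,29,boston
Dave,not available,Chicago
Eve,41,chicago
"""


def clean_rows(text):
    """Normalize text fields and turn unparseable ages into None."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        age_text = row["age"].strip()
        cleaned.append({
            "name": row["name"].strip().title(),
            "age": int(age_text) if age_text.isdigit() else None,
            "city": row["city"].strip().title(),
        })
    return cleaned


rows = clean_rows(RAW_CSV)

# Impute missing ages with the median of the known ages (one of many options).
median_age = statistics.median(r["age"] for r in rows if r["age"] is not None)
for r in rows:
    if r["age"] is None:
        r["age"] = median_age

# High-level analysis: mean age per city.
ages_by_city = defaultdict(list)
for r in rows:
    ages_by_city[r["city"]].append(r["age"])
mean_age_by_city = {c: statistics.mean(a) for c, a in ages_by_city.items()}
print(mean_age_by_city)
```

Whatever imputation you choose, say why you chose it; the reasoning is usually what the interviewer is listening for.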
Bonus:
If the position is more computer science or software intensive, they may require a coding test to be done through a third party. A couple of popular companies that administer and proctor these exams are HackerRank and Mercer Mettl. Usually what you can do is look up example tests on Google and solution walkthroughs on YouTube. To work through example coding problems yourself, LeetCode is a great place to go.
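For a sense of the style of problem these platforms tend to use, here is a sketch of the well-known "two sum" exercise: find the indices of two numbers that add up to a target. This is only an illustrative example, not a question taken from any specific test:

```python
def two_sum(numbers, target):
    """Return indices of two entries that sum to target, or None."""
    seen = {}  # value -> index where it was first seen
    for index, value in enumerate(numbers):
        complement = target - value
        if complement in seen:
            return seen[complement], index
        seen[value] = index
    return None


print(two_sum([2, 7, 11, 15], 9))
```

The dictionary lookup makes this a single pass over the list, instead of the nested loop a brute-force answer would need. Explaining that trade-off out loud is usually part of the exercise.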
Final Interviews
Let’s assume you ace each set of interviews and get to the final round. This final round can be multiple interviews that include more behavioral and technical sections, as well as a 1-on-1 with a higher up decision maker. Depending on how large the company is, this could be with your future manager, a director, partner, VP, or even a founder. The questions they ask will generally be higher-level in nature so they most likely won’t want you to get into the technical weeds too much. This is also a good place to discuss the industry as a whole, how the company is handling certain market competitors, and what specific advantages the company has that make it unique. You most likely will already have researched these things in your initial research of the company but before this interview you will want to brush up so you can come prepared with detailed questions to impress them.
FYI: This is also the part where you will want to speak with the team lead on what makes a good data scientist as well.
Conclusion
What your interview process looks like will depend entirely on the company. Luckily there are some commonalities that can be found so you hopefully won’t go in completely blindsided. If in doubt, during your phone screening with HR you can directly ask them “What does the interview process look like?”. Each section of the interview process could be broken up into separate posts entirely because there’s just so much to talk about, but this overview is enough to help you get started. Be ready to study hard and good luck!
If you have any questions about the data science interview process or even anything data science related, feel free to reach out to me on Twitter.