Data Interview 1: Machine Learning Engineer
This is a real MLE interview test. My guess is that other companies are running something similar right now, so you may well encounter it.
The questions listed in this post are from an actual Machine Learning Engineer role. The requirements for the role were Python, SQL, and 3 years of experience. The company was a tech company that sells a software subscription. Since it’s an MLE role, you know Python and SQL are used.
I highly encourage you to do these questions seriously, without looking at the solutions. Some of you need to work on SQL, while others need to work on your Python skills. The only way to find out which is to see what topics you struggle with.
If you’ve read The Interview Prep post, then you know you have 3 rounds of interviews:
HR
Technical
Team Lead
Most people flop on the second round, so this series will focus on that one. The third interview, with the team lead, is you showing that you give a shit and are competent. It’s pretty straightforward.
The questions in this post are a real representation of what you'll encounter.
Table of Contents:
Ice Breaker
SQL Technical Question
Python Technical Question
Additional Practice
1 - Ice Breaker
Interviewers like to start off with some icebreaker questions. You can think of these as easy softball questions. These are asked to develop some rapport with the interviewer. The point of these questions is to understand if you know how to “speak data”.
Again, these are super basic. Get any of these wrong, and it's lights out.
If you write down that you have 5+ years of experience with one of the topics below and then struggle with these basic questions during the interview, your email and name are saved into our database with a black flag.
Black flag = never interview this guy. In other words, we caught you lying, and now you’ll never be allowed to interview at this company again. Here’s an example:
You say you have 5 years of experience with Python & Data Analysis
You struggle to answer a simple question like “What’s pandas?”
You now have the Black flag on your profile
1.1 Data Skills
What is Dimensionality Reduction & Why is it important?
Dimensionality reduction refers to reducing the number of features (columns) in your data. We do this because when we have too many columns, especially correlated ones, our ML model has to deal with a lot of noise and takes a lot longer to compute.
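To make the "correlated columns" point concrete, here is a minimal sketch of one simple form of dimensionality reduction: a correlation filter that drops any column that is nearly a copy of one already kept. This is an illustration, not the only approach (PCA is another common choice). The toy data, the column names, the `drop_correlated` helper, and the 0.95 threshold are all assumptions made up for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric lists.

    Assumes neither list is constant (a zero variance would divide by zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def drop_correlated(columns, threshold=0.95):
    """Keep a column only if it is not highly correlated with a kept one.

    `columns` maps a column name to its list of values. The threshold is
    an arbitrary illustrative cutoff, not a recommended default.
    """
    kept = {}
    for name, values in columns.items():
        if all(abs(pearson(values, other)) < threshold for other in kept.values()):
            kept[name] = values
    return kept


# Hypothetical toy data: height_in is just height_cm in different units,
# so it adds a column without adding information.
data = {
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [41.0, 23.0, 35.0, 52.0, 29.0],
}

reduced = drop_correlated(data)
print(list(reduced))  # ['height_cm', 'age']
```

Three columns go in and two come out, because the redundant `height_in` column is dropped. In an interview, being able to explain why that column carries no new information matters more than the exact code.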