Step 1: Gain relevant skills
Before applying anywhere, it’s important to gain the skills that get you hired. There are many relevant skills the interviewer is looking for in a machine learning engineer, so it’s good to gain as many of them as possible. Overall, a good machine learning engineer has a mix of computer science (raw programming skills), math and statistics (knowledge of what makes machine learning algorithms work), and specific subject knowledge. In this article, I will delve into the computer science, math, and statistics parts.
One last skill to master is communication -- the ability to explain to others what you are working on is critical. Weeks can pass in which none of your experiments yield better results, and in those moments you need to be able to explain the problems and how you plan to solve them.
What technologies should you master?
There are many programming languages, frameworks, and other technologies you can master to get a machine learning job. What you choose will determine what you will get hired for and thus what you will work on. You have to choose wisely. Here are some of my suggestions and the reason to master them:
- Python -- this programming language has become the standard language for training machine learning algorithms. Make sure you know what features are available in the language. In addition to Python, I suggest you learn another language. Pick C++ if you want to work with embedded platforms. Choose Java if you want to use machine learning models in an enterprise environment. Learn R if you want to do data analysis. And last but not least, learn Lisp if you like classical AI algorithms and natural language processing.
- TensorFlow/PyTorch -- deep learning is booming. There are two deep learning frameworks that dominate the market, but there are clear differences between who uses what. First of all, it’s important to note that traditionally TensorFlow was easier to use in a deployment environment, and PyTorch was easier to use for experimentation. Lately, TensorFlow is trying to make experimentation easier, and PyTorch is working towards easier deployability (even on embedded hardware). If you want more of a research role, I would recommend PyTorch, and if you want to work at a company that mainly wants to update its models in a production environment, I would learn TensorFlow. Overall, learning one framework thoroughly is better than learning two separate ones halfway, especially considering the speed at which they are changing.
- Scikit-learn -- most classical machine learning algorithms are included in scikit-learn. Mastering it allows you to solve many small-data problems fast. If you know how the algorithms in this library work, you will have a head start during your technical interview.
- NumPy and Pandas -- when working with data in Python it’s important to efficiently select exactly the data you are interested in. Most machine learning engineers use NumPy daily for basic functions. If you have more advanced selection criteria, Pandas starts to shine! You will impress your future co-workers by selecting specific data samples with only a few lines of code (and without those pesky slow for-loops).
- Apache Spark -- companies that will benefit from machine learning will have a lot of data. Working with big data is very important, and Spark will massively accelerate your development effort here. Note that when working with big data it might also be interesting to learn about Hadoop. Initially, I would choose one technology to really master.
- OpenCV -- if you want to work in computer vision, OpenCV is simply vital. It contains a lot of image processing functions you can use to either quickly put a prototype together, or pre-process images in a better way. It also contains many approaches to recognizing things and features you can use for detecting or localizing objects.
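To make the NumPy and Pandas point above concrete, here is a minimal sketch of selecting data samples without explicit for-loops. The table, its column names, and its values are made up purely for illustration; only the selection pattern matters.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: column names and values are invented for this example.
samples = pd.DataFrame({
    "label": ["cat", "dog", "cat", "bird", "dog"],
    "confidence": [0.91, 0.45, 0.30, 0.88, 0.79],
    "width": [640, 320, 640, 1280, 640],
})

# Boolean mask: combine conditions with & and wrap each one in parentheses.
mask = (samples["confidence"] > 0.5) & (samples["width"] >= 640)

# .loc selects the matching rows and only the columns we care about.
confident_wide = samples.loc[mask, ["label", "confidence"]]

# The same idea in plain NumPy: a boolean array indexes another array.
confidences = samples["confidence"].to_numpy()
high_confidence = confidences[confidences > 0.5]

# Aggregation per group, again without writing a loop.
mean_confidence_per_label = samples.groupby("label")["confidence"].mean()

print(confident_wide)
print(high_confidence)
print(mean_confidence_per_label)
```

The mask-based selection is vectorized, so it usually stays fast when the table grows far beyond a handful of rows.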
What should you read?
There are multiple books that will teach you about useful machine learning skills. Different people have different learning styles and therefore will need to read different books. Make sure you buy the ones with a teaching style that motivates you.
For the math and statistics books I would recommend the following:
- Pattern Recognition and Machine Learning, by Bishop. This is a highly mathematical book that starts at the foundation of machine learning. It’s one of the toughest books I wrestled my way through for a university course. I would only recommend it if you are someone who likes math and the bottom-up approach to learning.
- Deep Learning, by Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Still a mathematical book, this time aimed at neural networks. I liked reading it after I already did some neural network experiments to get a better sense of what’s going on behind the scenes. Reading a book like this will let you cross the bridge from "doing random stuff with your network to see what happens" to "having a good intuition on what parameters control what aspect of the networks."
- Python Machine Learning, by Sebastian Raschka. This book is in the domain of practical books. Sebastian managed to give quite an extensive overview of the tools available in scikit-learn, and in his newest edition, he includes TensorFlow code.
- Artificial Intelligence: A Modern Approach, by Stuart J. Russell and Peter Norvig. What I primarily took away from this book were game theory and search algorithms. There is also a lot of probability theory and even some robotics! This book is very inspiring, and some chapters can give you an edge for some interviews.
For the software engineering skills, you should brush up on programming languages, algorithms, and data structures. Most jobs will require knowledge of Python, so make sure you know this language and its newest features well. During your interviews, you might have to solve a whiteboard programming task, and here knowledge of algorithms and data structures is important. I would recommend these two books:
- Introduction to Algorithms by Cormen, Leiserson, Rivest, and Stein. This book has a very complete overview of different algorithms and data structures. It’s fun to implement the algorithms and data structures yourself in a language you are trying to master.
- Cracking the Coding Interview by Gayle Laakmann McDowell. Going through all algorithms will take a lot of time. This book will teach you the algorithms and data structures that will come up most often.
What courses should you try?
In terms of online courses, there are many options nowadays. I have not followed any courses lately, but I’ve heard good stories about the following:
- "Deep Learning" a Nanodegree from Udacity
- "Machine Learning" by Andrew Ng from Coursera
- "Become a Self-Driving Car Engineer" a Nanodegree from Udacity
What conferences should you go to?
I like visiting conferences or meetups because they give you an idea of current problems that can be solved with machine learning, what the state-of-the-art techniques are, and what is being used in the industry. Academic conferences tend to be more about state-of-the-art techniques, while commercial conferences tend to explore more applied techniques. Conferences I keep an eye on are:
- QCon to learn about the state-of-the-art in software
- NVIDIA’s GPU Technology Conference to learn about the state-of-the-art in machine learning (neural networks) on GPUs
- CVPR, a computer vision conference
- NeurIPS, a neural networks conference
- ICRA, a robotics conference
- IROS, a robotics conference
How should you practice for an interview?
One secret about interviews is that it’s easy to prepare for them. To assess whether a candidate is suitable for a job, companies give small challenges to the candidates. You can train yourself to be good at resolving these challenges! Here are some things you can do:
- Many companies will ask you to solve some HackerRank challenges before they even talk to you. Go on HackerRank yourself and practice by solving many of their challenges to prepare for the challenge the interviewers want you to solve.
- If you want to be better at competitive programming, you should check out some existing competitions. Some I like are Google Code Jam and the Advent of Code. Google Code Jam always has multiple programming challenges in various difficulty levels and they provide a writeup of the solution. This way you can quickly learn which algorithms are important! Each year in December, Advent of Code hosts 25 programming tasks that become gradually more difficult toward the end of the month. The best part of AoC is that there are often multiple ways to solve a problem, and exchanging your solutions with others will teach you a lot!
Think about one big machine learning project
Something I like to see in candidates is experience with every stage of one big machine learning project. Think of all the steps, from data collection and data cleaning to machine learning and deployment. Make sure you get all this experience in your current job or try to recreate it in your spare time. Having this experience will set you apart from a lot of your competition when applying.
One option to achieve this is contributing to an open-source project. When contributing to an existing project you will notice a harsh but useful review process. Once you manage to get your first pull request merged you will be a better developer who knows what other developers care about.
Step 2: Apply for a job
When applying for a job, make sure your application letter and CV demonstrate that you have the required capabilities for the job you are applying for and that your CV is readable and concise. It helps to list the frameworks you worked with, and projects you did in the past. Make sure you can answer questions about all frameworks and technologies you write down though! If it’s on your CV, it’s fair game to ask about. I would also recommend having a professional recruiter or sourcer review your CV. They are happy to help you have a better chance of being hired!
One tip is to tailor the structure of your CV to the job you want -- not to the things you did in the past. I adjust my CV for every job I apply to, so I can highlight the relevant projects I did in the past. Your CV should show which skills and projects you enjoy, and you should definitely highlight these. You should also look for jobs in your area of interest: companies love to hire people who are passionate about what they do!
Since machine learning is a growing field, many people are now following courses and trying to find a machine learning job. As an interviewer, it’s important to know if someone only "followed" some online courses, or also has "practical experience." If you used machine learning in your previous job, make it really clear on your CV. If you conducted a big project in your spare time, link to it on your CV!
Step 3: Interview
There are many aspects of machine learning that an interviewer might ask about. Make sure you know the basics of many algorithms (see the reading section). You should also be clear where your knowledge ends and what your strong points are. As an interviewer, I always gave a positive recommendation to anyone who taught me something new during the interview. Showing that you have deep knowledge of certain topics will make you more likely to be hired if the company is lacking that knowledge at the moment.
Be sure to ask clarifying questions during an interview. It’s easy for you and the interviewer to talk about different topics and not notice for a while. Ensure you are on the same page as the interviewer. As a practical example: I once thought I had to focus on writing an efficient sorting algorithm for a list, while the interviewer was only interested in the largest element. Whenever you get a coding question, make sure you start by writing a test case for the question. In my case, simply asserting that the outcome of the function I had to write was the largest element of a list would have been enough. This practice shows that you care about clarity and tests, which employers like. While writing a test in an interview you also have time to think and get a feeling for possible solutions and pitfalls before you program your solution.
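As a small illustration of this test-first habit, here is a sketch for the largest-element example. The function name `largest` and the test values are my own choices for illustration, not the actual interview question. In an interview you would write the assertions first and let them drive the conversation about edge cases such as an empty list.

```python
def largest(values):
    """Return the largest element of a non-empty list."""
    if not values:
        # Edge case worth clarifying with the interviewer: here we assume
        # that an empty list is an error rather than returning None.
        raise ValueError("largest() needs at least one element")
    result = values[0]
    for value in values[1:]:
        if value > result:
            result = value
    return result


# The test cases, written before the implementation in a real interview.
assert largest([3, 1, 4, 1, 5]) == 5
assert largest([-2, -7]) == -2
assert largest([42]) == 42
```

A single pass over the list is enough here; no sorting is needed, which is exactly the misunderstanding a test case would have surfaced early.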
If you are explaining an answer, use the STAR technique to structure your story. With the STAR technique, you explain what situation you were in, what task you had, what actions you took, and what the results were.
The first few times you go to an interview you will likely be nervous. This is normal, as you typically don’t know what to expect. Every interview is different, so even after many years of working in the industry, you will likely still be a bit nervous. Unfortunately, human minds are wired to favor a confident person, so it is super important to prepare! I offer the following tips to help you with your nerves:
- Often, interviewers start with some small talk at the beginning of the interview. Use this period to glean something about the interviewer. This always relaxes me and allows me to relate to his or her experience during the rest of the interview.
- Control your remote environment: make sure your phone is off, your pet is in a different room, and Alexa/Siri is turned off. If something happens (such as the postman delivering a parcel), just stay calm, excuse yourself, quickly handle the problem, and come back to the interview. I personally never mind an interruption, as they can happen to all of us.
- Practice interviews beforehand! Have your friends ask you questions -- some can be found in the book Cracking the Coding Interview or by searching for frequent questions online. You can also work through several of these lists so you know the answers to the most common questions.
- Prepare an "elevator pitch" for yourself. This should summarize in about a minute who you are, what you did, and what you want. Every interviewer will ask you to introduce yourself, so this will be your perfect, memorized answer. Nothing is as relaxing as a great start to an interview.
- Know how you can convey information easily. I find it hard to express my thoughts with only words, so I like to stand in front of a whiteboard to draw out my thoughts as I talk. You might find a piece of paper is handy to help you get your thoughts across. Whatever you choose to do, make sure you practice it beforehand to see if you are comfortable.
What makes interviews hard is that you never know what to expect. I mainly experienced the following types of questions/interviews:
- Knowledge questions -- These can be questions such as "How does an LSTM work?" "How many weights are in a 2D convolutional layer?" "What is the kernel trick in support vector machines?" If you don’t know anything about the topic, say so; otherwise, answer the question to the best of your knowledge. It is important that you are honest, as some candidates start making things up. You can also indicate how you would normally find the answer to such a question, or how you would approach solving this problem. To prepare for these questions, you have to go through textbooks to learn specific techniques and memorize specific facts and names.
If you are applying for a deep learning job, make sure you know the most used cost functions in deep learning (such as binary cross-entropy, categorical cross-entropy, mean squared error, cosine similarity, Huber loss, and Kullback-Leibler divergence).
Ensure you know common activation functions and their derivatives (such as ReLU, ELU, sigmoid, tanh, softmax, swish, and SELU). And know the most common layers that can be found in a neural network (dense layers, convolutional layers, separable convolutions, batch normalization, global and local pooling [both max pooling and average pooling], LSTM, GRU, dropout).
In case you need to know about more classical machine learning, I would start with the algorithms that are in scikit-learn. Think about least-squares linear regression, support vector machines, nearest neighbors, decision trees, and ensemble methods.
- Insight questions -- Interviewers might give you a small case during the interview, and ask you how to approach it. If you did a few machine learning projects in the past, you can often easily answer these questions. However, it’s good to re-learn common methods for common tasks.
Example questions would be:
- What methods do you know to preprocess data?
- What methods would you use to augment data?
- What regularization methods do you know?
- How would you acquire/collect/annotate data for this task?
- How would you evaluate this task, and what performance should your model have?
- A coding interview -- Some companies use HackerRank so you can also execute the code you write -- others use a whiteboard, or just ask you to write on a piece of paper. Since this technique is common it’s important to practice it. On a computer, I would suggest that you try 20 lines of code before choosing the best one. On a piece of paper, you only have one chance to get it right. There are great tutorials on how to nail these types of interviews! One of the best tips here is to express your thoughts. It’s often not about finding the right solution, but showing that you are a person others can work with when solving problems that are of similar difficulty.
- Social questions -- There are common social questions, such as "What do others say about you?" "What are your biggest challenges?" "Where do you see yourself in X years?" Prepare for the common ones, and make sure you have a short compelling answer for them.
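To make two of the knowledge questions above concrete, here is a minimal sketch of how you could check your own answers while preparing. It assumes a standard 2D convolution (no groups, not separable) with one optional bias per output channel; the function names are my own and do not come from any framework. The activation derivatives are verified against a numerical central difference, which is a handy way to catch mistakes in memorized formulas.

```python
import math


def conv2d_parameter_count(kernel_height, kernel_width, in_channels,
                           out_channels, use_bias=True):
    """Trainable parameters of a standard (non-grouped) 2D convolution."""
    weights = kernel_height * kernel_width * in_channels * out_channels
    biases = out_channels if use_bias else 0
    return weights + biases


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x))
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_derivative(x):
    # d/dx tanh(x) = 1 - tanh(x)^2
    return 1.0 - math.tanh(x) ** 2


def relu(x):
    return max(0.0, x)


def relu_derivative(x):
    # The derivative at exactly 0 is undefined; 0 is used here by convention.
    return 1.0 if x > 0 else 0.0


def numerical_derivative(f, x, eps=1e-6):
    """Central difference approximation, used only to check the formulas."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


# A 3x3 convolution from 3 input channels to 64 output channels:
# 3 * 3 * 3 * 64 weights plus 64 biases.
print(conv2d_parameter_count(3, 3, 3, 64))  # 1792

# Each analytical derivative should agree with the numerical one.
for x in (-1.5, 0.5, 2.0):
    assert abs(sigmoid_derivative(x) - numerical_derivative(sigmoid, x)) < 1e-6
    assert abs(tanh_derivative(x) - numerical_derivative(math.tanh, x)) < 1e-6
    assert abs(relu_derivative(x) - numerical_derivative(relu, x)) < 1e-6
```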
Step 4: The technical task
Many companies will give you a technical task to solve. I like these tasks, as you can show what you are capable of when given the time. When reviewing these tech tasks I saw many great solutions, but there were also people I had to reject. Here are some practical tips for solving a technical task:
- Describe your process. Always hand in a short report with the steps you took, what things you tried, and how you would improve it if you had more time. I like to use a Jupyter Notebook for tech tasks, as it allows you to describe and code at the same time! If you are not using a Jupyter Notebook, you can add a well-structured PDF document with your motivations.
- Keep it simple. Especially for machine learning tech tasks, there are often many solutions you can try. It’s better to solve a task simply and with a high degree of quality than to attempt a difficult solution and end up with low quality.
- Make clear what code you copy-pasted, and what code you wrote yourself. For machine learning tasks there are often existing algorithms available with the functionality you have to implement. For example, if you have to make a 2D bounding box detector, you can use an existing YOLO implementation. However, always make sure you add significant functionality yourself. People still have to assess you on your way of coding, so make sure to indicate what code you wrote and what code you copy-pasted.
- Choose quality over quantity. Make sure you adhere to conventions and coding guidelines, and add documentation to your functions. You can use a so-called linter, a tool that analyses your code for potential bugs and style violations. Always code as if you are going to commit what you hand in to the master branch of your future employer. It’s better to have a simple piece of code that has a high code quality than a very advanced solution that can’t be understood.
- Unit tests! Part of being a professional software engineer is writing tests for the code you wrote. Simple assert statements for functions go a long way, and showing that you know a specific unit test framework will impress the reviewer of your tech task.
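As a sketch of what the unit test tip above can look like in practice, here is a small example using Python's built-in unittest framework. The helper `intersection_over_union` is a hypothetical function of the kind you might write yourself around an existing bounding box detector; it is not part of any specific tech task, and the box format (x_min, y_min, x_max, y_max) is an assumption.

```python
import unittest


def intersection_over_union(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_width = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_height = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_width * inter_height

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    # Two degenerate (zero-area) boxes: define the IoU as 0 to avoid 0/0.
    if union == 0:
        return 0.0
    return intersection / union


class IntersectionOverUnionTest(unittest.TestCase):
    def test_identical_boxes_overlap_fully(self):
        self.assertEqual(intersection_over_union((0, 0, 2, 2), (0, 0, 2, 2)), 1.0)

    def test_disjoint_boxes_do_not_overlap(self):
        self.assertEqual(intersection_over_union((0, 0, 1, 1), (5, 5, 6, 6)), 0.0)

    def test_partial_overlap(self):
        # Overlap area is 1, union is 4 + 4 - 1 = 7.
        iou = intersection_over_union((0, 0, 2, 2), (1, 1, 3, 3))
        self.assertAlmostEqual(iou, 1 / 7)
```

Saved under a file name such as test_iou.py (a made-up name), the tests can be run with `python -m unittest test_iou.py`. Each test names the behavior it checks, so a reviewer can read the test class as a short specification of the function.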
Conclusion
Hopefully, this helps you find your ultimate dream job! Machine learning is a lot of fun and has the potential to solve a lot of problems. I based this article on my experience talking to people at conferences and meetups, but I realize that this doesn’t give me a complete view of the world of machine learning. If you have suggestions regarding what to learn, or what would be important to your field of work, please leave a comment to help others who want to become who you already are.
About the Author
Roland Meertens worked as a machine learning engineer at Autonomous Intelligent Driving (AID) on perception for autonomous vehicles. He writes for InfoQ and has a blog at pinchofintelligence.com where he lists his side-projects.