This post is a follow-up to BT_Bear’s original post. Click here to see the original.
Still A Good Career?
Key question: is it still a good career choice? As always, it depends. The “innovation and disruption hype” made a lot of budgets available for ML and AI projects roughly between 2015 and 2020. I have the impression that the spotlight is moving towards Environmental, Social, and Governance (ESG) topics but have not seen good analysis on it. We are looking at a much more mature landscape today, which has advantages and disadvantages. Expect less easy budget but more actual value delivered.
I generally believe that the overall trend is still very positive: there is tremendous potential for automation and improvement left in all industries. As with all evolving technologies, you must evolve with them if you want to claim a share of the pie.
I'll share a few perspectives that hopefully help you navigate the field.
General Trends in the field
Not all machine learning roles are equal. The field has matured a lot over the past years, and there are now distinct roles, as BT_Raptor described here:
In addition, I would argue that image- or language-based ML is almost a separate specialization. Today you need to pay some attention to what kind of ML role you are looking at.
I encourage you to try a few things in the field but do not jump around too erratically. Early-stage recruiting processes are incredibly dumb and you do not want to be filtered out because the HR person had no clue and did not find the right keywords in your CV. Click here to see how to fix your resume.
Don't fade the subject matter.
It got a bit better in recent years, but the sentiment that ML can make intelligent decisions by itself is still around. I have seen more than one highly decorated PhD make a complete fool of themselves with their data science insights, plenty of arrogance and zero understanding of the subject matter. You will rarely get a second chance after that.
Today I no longer believe that you can consistently solve real problems without some understanding of the subject matter. This does not mean that you need a degree in the field. A bit of curiosity, humility and willingness to learn from experts go a long way.
Teamwork
Which leads me nicely to teamwork. Yes, you will find examples of good ML-based products fully built and marketed by a single person. But usually anything at relevant scale is an effort by several people:
This is not just because of the amount of work but also because the variety of tasks can rarely be covered well by a single person. My minimum viable setup would be somebody in the functional role that will use the ML product, somebody from IT and a data scientist/ML expert. What does this mean for you? I typically recommend a “T-shaped skill profile”.
The main idea is to have some basic understanding of what your teammates are talking about to allow effective communication.
General Trends in Technology
Core machine learning is not where most projects fail today. As of 2022 there are plenty of good ML systems available on premise or from big cloud vendors. Pick any of them, and it will most likely enable you to develop the models you need. A successful ML application does, however, require a more complex architecture that at least includes data ingestion and integration with the point of decision making. This is where you had better be sure early on that the overall architecture makes sense. Getting this wrong is a quick route to the POC graveyard, where many good models are buried because nobody figured out how to make them work in the larger landscape.
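To make the architecture point a bit more tangible, here is a minimal Python sketch of the three pieces mentioned above: data ingestion, a model, and integration with the point of decision making. Everything in it is an assumption for illustration only: the `Reading` type, the `ingest`, `toy_model` and `decide` functions, the temperature rule and the threshold are all hypothetical, and the "model" is a simple stand-in rather than a trained one.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Reading:
    """Hypothetical sensor reading used only for this sketch."""
    machine_id: str
    temperature: float


def ingest(raw_rows: Iterable[dict]) -> list[Reading]:
    """Data ingestion: parse raw rows and drop incomplete ones."""
    readings = []
    for row in raw_rows:
        if "machine_id" in row and "temperature" in row:
            readings.append(Reading(row["machine_id"], float(row["temperature"])))
    return readings


def toy_model(reading: Reading) -> float:
    """Stand-in for a trained model: maps temperature to a risk score in [0, 1]."""
    return min(max((reading.temperature - 60.0) / 40.0, 0.0), 1.0)


def decide(
    readings: list[Reading],
    model: Callable[[Reading], float],
    threshold: float = 0.5,
) -> dict[str, str]:
    """Decision integration: turn model scores into actions somebody can act on."""
    return {
        r.machine_id: ("inspect" if model(r) >= threshold else "ok")
        for r in readings
    }


if __name__ == "__main__":
    raw = [
        {"machine_id": "A", "temperature": 95},
        {"machine_id": "B", "temperature": 65},
        {"temperature": 70},  # incomplete row, dropped during ingestion
    ]
    print(decide(ingest(raw), toy_model))
```

The model is the smallest part of the sketch. Swapping `toy_model` for a better one is easy; the ingestion and decision steps are where the real-world integration work tends to sit.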
Do not marry your tools: Python coding is not your core skill.
The history of IT is the history of better tools taking over. (Almost) nobody codes in assembler today because more abstract languages are simply the better trade-off. Data scientists often look down on AutoML tools, but they are improving and already offer good enough performance for many cases. Guess what will happen if you can choose between a 90% model today and a 95% model after 2 months of data scientist work? Is the extra 5% of quality worth the cost? I would encourage you to keep an open mind and adopt better tools early if they make sense.
Don't be blinded by pure model performance. The effort needs to make sense for the given use case.
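One way to reason about the 90% versus 95% question is a back-of-the-envelope calculation. The Python sketch below is only an illustration: the function name `net_gain_of_custom_model`, its parameters and all the numbers are made-up assumptions, and in practice the value of an accuracy point is the hard part to estimate.

```python
def net_gain_of_custom_model(
    value_per_point: int,
    points_gained: int,
    months_of_work: int,
    monthly_cost: int,
    delay_cost_per_month: int = 0,
) -> int:
    """Estimated benefit of the extra accuracy minus the cost of getting it.

    value_per_point:      assumed business value of one accuracy point
    points_gained:        accuracy points gained over the quick model
    months_of_work:       extra data scientist time needed
    monthly_cost:         assumed fully loaded cost per month of that work
    delay_cost_per_month: assumed value lost per month by shipping later
    """
    benefit = value_per_point * points_gained
    cost = months_of_work * (monthly_cost + delay_cost_per_month)
    return benefit - cost


if __name__ == "__main__":
    # Hypothetical numbers: 5 extra points, 2 months of work.
    gain = net_gain_of_custom_model(
        value_per_point=5_000,
        points_gained=5,
        months_of_work=2,
        monthly_cost=15_000,
    )
    print("worth it" if gain > 0 else "ship the 90% model", gain)
```

With these invented numbers the custom model loses money, but a use case where each point is worth far more would flip the answer. The point is to run the numbers for your use case instead of defaulting to the best possible model.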
Embedded ML
I am convinced that we will see a lot more embedded ML for specific solutions. Take Salesforce, the OG of SaaS: https://salesforce.com/products/einstein/ai-deep-dive/… At least their sane customers stick fairly close to the standard processes and data models in Salesforce, so the data looks very similar from one customer to the next. That means very good chances to build models that work for a lot of customers, don't you think? I would bet that such solutions will improve a lot in the near future so that custom models make less and less sense for many use cases.
So is all hope lost for the data scientist? I beg to differ.
Let me sketch out one way to think about it. Let us assume you work in manufacturing: you have a bunch of different machines and tons of know-how in your factory to build what you need to build.
Consider that the best models are usually built by whoever has the most data. Sure, sometimes somebody just combines public data in a way nobody thought of, but that is the exception.
So, what could this mean for you in this scenario?
I would argue that you are in a relatively bad position to develop predictive maintenance models for individual machines in your factory: the maker of the machines can easily have data for many more machines than you have. But data for your whole process? This is what only you have. This would be a promising angle to take. And probably also where you want to beat your competitors.
And this is how I would think about it: What is the core value creation of your company? Where can ML be used as leverage? I am sure you will find plenty.
BT_Raptor Note: This is why I constantly hammer in working on projects that will generate revenue for your firm. At the end of the day, a business will happily pay you $150k if you can generate more revenue than that per year.
Don't try to build what already works well
This ties into the last point, but there are many well-working solutions on the market today. Really want to compete with Google on translation?
Why?
To be clear: All vendors overpromise, and their solutions often only work in narrow circumstances, so you need to test carefully. BowTiedCelt nicely described the advantages and disadvantages of serverless offerings from various cloud providers.
In addition, all the big cloud providers offer various ML services. Many of them are pretty good and worth a try before you invest in custom builds.
Example: AWS
It is always a trade-off, but the general outline stands: just because you can build it yourself does not mean that it is a good use of your time.
If this scares you, you are thinking about it in the wrong way – it means that you individually can get more done and will create so much more value if you embrace the automation.