This was a post written by BT_Bear. You can find them here.
Artificial stupidity
Introduction
Artificial intelligence is not the newest kid on the hype block anymore, but it's still fairly hot – so it's fair game to pop the overblown-expectations bubble a bit 😊
(Futurama – via https://soothfairy.com/2016/04/24/artificial-stupidity-again/)
Nothing in the AI tech stack is intelligent by itself. The right application can lead to very intelligent solutions – and a few mistakes can lead to the complete opposite. So let's have some fun with the projects that did not go so well, and hopefully learn something.
Should I ever have to choose between funny and insightful, the learning will have to wait. This is the internet, after all.
Case 1: ball or bald?
Let's have a look at this short clip:
Not ideal, right?
We should probably have a little look under the hood, at least as far as this is possible:
https://www.pixellot.tv/blog/behind-the-scenes-of-automated-production-how-does-it-work/
Lots of fluff and too few technical details for my taste, so I must rely on some wild speculation based on what is in the text.
“Deep Learning takes the shape of a very efficient, high end, real time algorithm that is required to run in less than 10 milliseconds per frame for every frame (!!!) to keep the system stable. This means that is has to produce the game instantaneously, just as it is actually happening. This process is very complicated.
The minimal requirement for Deep Learning in sports is to identify the ball and the players. Identifying the ball is a complicated task. The ball can be in a great many different scenarios, on the ground, in the air, in a player’s hands, and out of bounds. Deep Learning also has to differentiate the players on the field from the referees, coaches, bench players and fans”
The Verge gives an interesting explanation here:
“Pixellot, the company that makes the camera technology used by Inverness Caledonian Thistle, confirmed to The Verge that the problem was caused by visual similarities between the linesman’s head and the soccer ball. They noted that the angle of the camera didn’t help, as it made it seem as if the linesman’s head was inside the boundaries of the pitch, and the game ball itself was yellow, which added to the confusion”
Thoughts of the Bear
My reading here is that they use deep-learning-based object detection and apply it frame by frame.
This is a pretty standard approach today and seems to work well enough to sell the solution – glitches like this will always occur, but they become less frequent as the training data grows.
The technology depends on exhaustive, well-labeled training data covering all possible viewpoints and situations; it is bad at generalizing to even slightly different views of "known" objects.
A well-traveled avenue for introducing some artificial stupidity into your solutions 😊
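To make the failure mode concrete, here is a toy sketch of what a purely frame-by-frame tracker might do: in each frame it simply points the camera at whichever detection the classifier scores highest as "ball", with no memory of where the ball just was. All detections, coordinates, and scores below are invented for illustration – this is not Pixellot's actual pipeline.

```python
# Toy sketch: a memoryless, frame-by-frame "ball" picker.
# Each frame is a list of candidate detections with classifier scores.

def pick_ball(detections):
    """Return the detection with the highest 'ball' score in this frame."""
    return max(detections, key=lambda d: d["score"])

frames = [
    # frame 1: the real ball wins comfortably
    [{"x": 40, "y": 30, "score": 0.91},   # the actual ball
     {"x": 5,  "y": 2,  "score": 0.55}],  # the linesman's head
    # frame 2: glare drops the real ball's score; the head now scores higher
    [{"x": 42, "y": 31, "score": 0.48},
     {"x": 5,  "y": 2,  "score": 0.62}],
]

for i, dets in enumerate(frames, 1):
    best = pick_ball(dets)
    print(f"frame {i}: camera points at ({best['x']}, {best['y']})")
```

Run this and the camera jumps from (40, 30) to (5, 2) between consecutive frames – exactly the kind of teleporting "ball" a head-shaped look-alike can cause when each frame is judged in isolation.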
I am not too deep into object detection, but the current state of the art involves a lot more sophistication in handling objects across frames.
https://viso.ai/deep-learning/object-detection/
(from the article above)
Thinking about this scenario: a head may look like the ball, but it should move very differently in any usual game of football.
Again, this is a bit of speculation based on the high-level description.
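One simple way to exploit that motion difference is gating: keep a prediction of where the ball should be next (here a crude constant-velocity guess) and reject candidates that jump implausibly far from it, even when the classifier likes them. The threshold, positions, and scores are all made up for illustration; real trackers use proper motion models such as Kalman filters.

```python
# Toy sketch of motion gating on top of per-frame detections.

MAX_JUMP = 15.0  # max plausible ball movement per frame, arbitrary units

def gated_pick(candidates, predicted):
    """Prefer the highest-scoring candidate within MAX_JUMP of the prediction."""
    px, py = predicted
    plausible = [c for c in candidates
                 if ((c["x"] - px) ** 2 + (c["y"] - py) ** 2) ** 0.5 <= MAX_JUMP]
    pool = plausible or candidates  # fall back if gating rejects everything
    return max(pool, key=lambda c: c["score"])

# The ball was last seen at (42, 31), moving right ~2 units per frame:
predicted = (44, 31)
candidates = [
    {"x": 45, "y": 31, "score": 0.48},  # real ball, weak score (glare)
    {"x": 5,  "y": 2,  "score": 0.62},  # bald head: strong score, ~49 units away
]

best = gated_pick(candidates, predicted)
print(f"camera stays on ({best['x']}, {best['y']})")  # → (45, 31)
```

Even though the head scores higher in this frame, it is nowhere near where the ball could plausibly be, so the gate keeps the camera on the real ball.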
But if I were tasked with fixing this product, I would ask A LOT of questions about this.
Additionally, I would consider a human-in-the-loop approach, at least for an initial period of field introduction. That would need some mechanism by which the human can give live feedback to the tracking algorithm. If that does not work, I would feed failure examples back with a special label and a hefty penalty for the next rounds of re-training.
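The penalty idea can be sketched as sample weighting: labeled failure cases (like head-vs-ball frames) get a large weight so the next training round pays extra for repeating the same mistake. The weight, the toy loss, and the probabilities below are invented for illustration – any real framework would express this via its own sample-weight mechanism.

```python
# Toy sketch of penalty-weighted re-training data.
import math

FAILURE_PENALTY = 10.0  # weight multiplier for labeled failure examples

def sample_weight(example):
    """Failure cases count FAILURE_PENALTY times as much as ordinary frames."""
    return FAILURE_PENALTY if example.get("failure_case") else 1.0

def weighted_log_loss(examples):
    """Toy weighted negative log-likelihood over predicted probabilities."""
    total = sum(-sample_weight(e) * math.log(e["p_correct"]) for e in examples)
    return total / sum(sample_weight(e) for e in examples)

batch = [
    {"p_correct": 0.9},                        # ordinary frame, handled well
    {"p_correct": 0.5, "failure_case": True},  # head-vs-ball frame, handled badly
]
print(round(weighted_log_loss(batch), 4))
```

With the penalty in place, the badly-handled failure frame dominates the batch loss, pushing the optimizer to fix exactly the cases that embarrassed the system in the field.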
So much for artificial stupidity for today.
But don't worry, there is plenty more where that came from.
And if you have a nice story of your own, please feel free to share.