My opinion is that the theory starts to make sense after you know how to use the models and have seen different models produce different results.
Very few people can read about the bias-variance trade-off and, in the course of using a model, take that concept and apply it directly to the problem they are solving. In retrospect they can look back and understand the outcomes. Also, most theory is useless in the application of ML, and only useful in active research into new machine learning methods and paradigms. Courses make the mistake of mixing in that useless information.
The same is true of the million different optimizers for neural networks. Why different ones work better in different cases is something you learn while trying to squeeze performance out of a neural network. Who here is going to read a pile of optimization theory (SGD, Adam, etc.), understand the implications, and then pick the right optimizer for each situation? No one.
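(To be fair, the mechanical part is trivial; here's a minimal PyTorch sketch with a made-up toy model, just to show that trying Adam instead of SGD is a one-line swap. The hard part, the part the theory covers, is knowing which one to reach for.)

    import torch
    import torch.nn as nn

    # Toy model and data, purely illustrative
    model = nn.Linear(10, 1)
    x, y = torch.randn(64, 10), torch.randn(64, 1)

    # Switching optimizers is a one-line change:
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    # opt = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

    loss_fn = nn.MSELoss()
    for _ in range(100):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()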
I'm much better off having a mediocre NN, googling "how to improve my VGG image model accuracy", and finding out that I should tweak learning rates. Then I google learning rate, read a bit, and try it on my model. Rinse and repeat.
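(The rinse-and-repeat step is basically a loop; a rough sketch reusing the same kind of toy setup as above, none of it specific to VGG:)

    import torch
    import torch.nn as nn

    # Crude learning-rate sweep: train briefly at each rate, keep the best.
    x, y = torch.randn(64, 10), torch.randn(64, 1)
    results = {}
    for lr in [1e-1, 1e-2, 1e-3, 1e-4]:
        model = nn.Linear(10, 1)  # fresh model for each run
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        loss_fn = nn.MSELoss()
        for _ in range(200):
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
        results[lr] = loss.item()

    # Lowest final loss first
    print(sorted(results.items(), key=lambda kv: kv[1]))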
What usually happens is that people get something working, think they now know ML, but generally don't even know enough to understand what they did wrong, and never get around to the theory.
The best approach is to learn both concurrently. Learn some theory, apply it, and understand the application, including its pitfalls; then learn a bit more and repeat. Incremental learning with a solid base. It's fun to hate on academia, but this is how experts with deep knowledge of a domain get to where they are.
Sadly, you are 100% correct. I see the same problems over and over in newly published AI research papers.
That said, playing for 1-2 weeks might be a good start towards getting motivated to learn the difficult, dry theory needed to excel in this field.
I personally started with Kaggle competitions and lots of googling (duckduckgoing, right?), but quite quickly hit the wall of not understanding; I felt like a mindless creature making decisions based on a couple of guides out there. Watching Andrew Ng's lectures and reading some books helped a lot, and I can't see a reason why one wouldn't want to start with theory. It's not all gold and glitter, and no one promised you that, unless you really want to delegate your work to AutoML.
I guess his point is to tackle it with a top-down approach. For me, that's how I am breaking ground in my ML study. I tried Andrew Ng's course, and I didn't understand a thing.
Then I tried Kaggle's mini-course. It kickstarted me into ML and motivated me to learn the theory as I go. For example, when I got to apply the Random Forest Regressor, I went to Wikipedia and tried to read up on it. Got some idea. And the progress is good.
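(For anyone curious, that first model was roughly this shape; a minimal scikit-learn sketch on synthetic stand-in data, since the actual Kaggle columns don't matter here:)

    from sklearn.datasets import make_regression
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import mean_absolute_error
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in for the Kaggle data
    X, y = make_regression(n_samples=1000, n_features=10, noise=10.0, random_state=0)
    X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X_train, y_train)
    print(mean_absolute_error(y_val, model.predict(X_val)))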
For some of us, at least, top-down is motivating and makes the learning process enjoyable.
Same here. I've tried Andrew Ng's course a few times since it launched a few years back, but I could only get through half of it. fast.ai makes more sense to me, and I've picked up enough concepts that I now feel confident enough to go back and tackle the theory.
The danger is throwing something into production without understanding bias and variance, overfitting, or other important concepts, with potentially disastrous results.
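(Even without deep theory, the cheapest sanity check is to compare train vs. held-out scores; a sketch on synthetic data, where a large gap between the two numbers is the classic overfitting signal:)

    from sklearn.datasets import make_regression
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split

    X, y = make_regression(n_samples=500, n_features=20, noise=20.0, random_state=0)
    X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

    model = RandomForestRegressor(random_state=0).fit(X_train, y_train)
    # A big gap between these two scores means the model memorized the training set.
    print("train R^2:", model.score(X_train, y_train))
    print("val   R^2:", model.score(X_val, y_val))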
One cannot do ML without some basic theoretical knowledge of statistics and probability. This gives you the what and the why behind everything. GIGO (garbage in, garbage out) is more true of ML than of other disciplines. The techniques used are so opaque that if you don't know what you are doing, you can never trust the results.
One thing that made the Uber fatality possible was their overconfidence in their AI, which they apparently did not fully understand. They considered the car's integrated emergency collision braking system unnecessary and disabled it ...