I was pretty gung ho about getting into ML two years ago and put a lot of time into online courses like Ng's, books, and ground-up implementations of a lot of the common algorithms. I enjoyed it, but after a while it became clear to me that a lot of this stuff is better described as applied statistics.
And this can be powerful, of course, but it doesn't really have much of the magic of AI.
"a lot of this stuff is better described as applied statistics"
This is a key insight. Bravo!
To generalize a bit more, most of ML is applied mathematics. Getting a good grounding in the underlying math is the most illuminating step in learning ML (speaking as someone who wasted a lot of time doing other things thanks to an irrational fear of learning mathematics, and who is still bad at it).
Deep math/stat understanding combined with the engineering bits (like programming, cleaning the data, running clusters) and the communication bits (like visualization) brings you to (what should be) 'data science' (imvvho, ymmv, etc.).
I am still not sure one person can pull it all off; it probably needs a solid team of specialists. But hey, 'data scientist' is a hot job description, so you can't blame people who know bits and pieces (sometimes very small bits and pieces ;) ) for calling themselves 'data scientists' or whatever. "Machine Learning for Hackers" and all that jazz. We've seen all this before with the "HTML coders" of the nineties.
Work as a search quality engineer at Google and you do pretty much all of that.
Except for running the clusters[1], I've done pretty much all of those steps myself. I started with a nice statistical idea, built some simple models, played with feature selection and learning algorithms, built model viewers, built classifiers, validated classifiers, built demos, validated demos, built a production implementation[2], optimized the production implementation to make it small/fast enough, and finally launched a big search quality improvement.
[1] I certainly write distributed code that runs on them, but maintaining the DCs definitely isn't part of my job description.
[2] Validation of the final quality in prod is actually someone else's job, not because I couldn't do it, but because you might not want me to tell you how good my stuff is; you know, I might be biased.
Right, it's less sexy than people think. That was my reaction when taking an Artificial Intelligence class and a Machine Learning class over 10 years ago as an undergrad. I was like, "these are unprincipled hacks". I liked graphics better. There were actual algorithms.
But in college you never get to apply them to real problems. I think if you apply them to real problems you'll have the revelation, especially when you try other approaches first. But actually applying them requires domain knowledge, data cleaning skills, and programming skills beyond what many people have (certainly beyond what I had as an undergraduate).
Not disagreeing with your second paragraph, but just wanted to point out that Machine Learning has matured way beyond the "unprincipled hacks" phase, and as was correctly pointed out above, can be seen as a direction in applied statistics. If you look at a modern course in multivariate statistics, there's a significant overlap with ML (http://goo.gl/GTDUC).
I think people's expectation that a new scientific or engineering discipline be "sexy" or "magic" is simply a sign of widespread ignorance of the field, so it's a good thing when something becomes less "magic" and more "real". I bet airplanes were "magic" until we learned how to fly them consistently and safely :)
Machine learning is full of hacks that connect theory to producing results under practical constraints. That's not too different from the hacks done for the sake of performance in graphics rendering.