Work as a search quality engineer at Google and you do pretty much all of that.
Except for running the clusters[1], I've done pretty much all of those steps myself. I started with a nice statistical idea, built some simple models, played with feature selection and learning algorithms, built model viewers, built classifiers, validated classifiers, built demos, validated demos, built a production implementation[2], optimized the production implementation to make it small/fast enough, and finally launched a big search quality improvement.
[1] I certainly write distributed code that runs on them, but maintaining the DCs definitely isn't part of my job description.
[2] Validation of the final quality in prod is actually someone else's job, not because I couldn't do it, but because you might not want me to tell you how good my stuff is; you know, I might be biased.