
This.

Learn top down, not bottom up.

Watch maybe one or two short videos on backpropagation. You don't need to get mired in the theory and the math; you can become productive right away.

Once you start playing with PyTorch and TensorFlow models (train them yourself or do transfer learning), you'll start to develop an intuition for how the network graphs fit together. You'll also pick up tools like TensorBoard.
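
To make that concrete, here's a minimal sketch of that kind of tinkering. The model, data, and hyperparameters are all made up; the SummaryWriter lines are what feed TensorBoard:

    # Toy PyTorch training loop with TensorBoard logging.
    import torch
    import torch.nn as nn
    from torch.utils.tensorboard import SummaryWriter

    model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    writer = SummaryWriter()  # logs to ./runs; view with `tensorboard --logdir runs`

    x, y = torch.randn(256, 10), torch.randn(256, 1)  # stand-in data
    for step in range(100):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()   # backprop, the part those videos cover
        optimizer.step()
        writer.add_scalar("loss/train", loss.item(), step)
    writer.close()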

Also, do transfer learning. It's so awesome: pretrain on a large, high-quality, publicly available data set for many epochs to get a good problem-domain fit, then swap in your own smaller data set and fine-tune. It's magical.
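
In PyTorch that pattern is only a few lines. A sketch, assuming an ImageNet-pretrained ResNet from torchvision and a placeholder class count; the backbone is frozen, so only the new head trains:

    # Transfer learning sketch: reuse pretrained features, retrain the head.
    import torch
    import torch.nn as nn
    from torchvision import models

    model = models.resnet18(pretrained=True)  # newer torchvision uses weights=...
    for param in model.parameters():
        param.requires_grad = False  # freeze the pretrained backbone
    model.fc = nn.Linear(model.fc.in_features, 5)  # new head; 5 classes is a placeholder

    # Only the new head's parameters receive gradient updates.
    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)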

I have a feeling that ML in the future will be like engineering today. You can learn by doing and don't need a degree or formal background to be productive and eventually design your own networks.

I have no formal training (save one undergrad course in way-outdated "general AI"), and I've designed my own TTS and voice conversion networks. I have real-time models that run on the CPU for both of these, and as far as I know they're more performant than anything else out there (on CPU).

Eventually you might start reading papers. (You'll be productive long before you need to do this.) Most ML papers are open access, but review (broad survey) articles might need pirating. Thankfully there are websites that can help you get these. The papers aren't hard to read if you've spent some time playing with the networks they pertain to. Read the summary, abstract, and figures before diving into the paper. It may take a few reads and some googling.

You do not need to be a data scientist. Anybody can do it. That said, a good GPU will help a lot. I'm using two 1080 Tis in SLI and they're pretty decent.



I feel somewhat similarly. If you want to learn ML from the “ground up”, that means learning math (at least a few subjects) to the senior-undergraduate level, some numerical methods, some probability and statistics, and sprinklings of other stuff before you even get to the models. And it’s not even clear that stuff is important for ML in practice.

I’m someone who took all those math courses and some grad ML coursework. What that means is that I’m qualified to try to hack together some specific research-level things that a practitioner would be confused by, and then try to write a paper about it. It doesn’t mean I’m qualified to do what the practitioner does. Frankly, I’ve never run my code on anything other than MNIST, and I don’t know the different architectures or applications well, since they’re not directly what I work on. They’re just different things, as I see it.


> I have no formal training (...) I have real-time models that run on the CPU (...) and as far as I know they're more performant than anything else out there

> You do not need to be a data scientist. Anybody can do it. That said, a good GPU will help a lot. I'm using two 1080 Tis in SLI and they're pretty decent

An alternative is that, by not knowing what you are doing, you may not see all the options that exist -- and when you hit a problem too hard, you just throw more hardware (GPUs) at it.

This is not to say it's never a valid approach, but I'd be wary of someone who, say, hasn't had any formal training in C and says their stuff is more performant than anything out there, because lack of training means not knowing about the stuff that already exists.


> An alternative is that, by not knowing what you are doing, you may not see all the options that exist -- and when you hit a problem too hard, you just throw more hardware (GPUs) at it.

Maybe some will. I just explained that I'm running my models on CPUs, so I'm actually developing sparse, efficient, resource-constrained models that evaluate quickly.
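
One common trick in that direction, as a rough illustration rather than a description of my exact setup: dynamic quantization, which stores Linear weights as int8 and typically shrinks and speeds up models on CPU. Layer sizes here are placeholders:

    # Dynamic quantization sketch for faster CPU inference.
    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 80))
    model.eval()
    quantized = torch.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8  # quantize only the Linear layers
    )
    with torch.no_grad():
        out = quantized(torch.randn(1, 256))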

I've been working with libtorch's JIT engine in Rust (tch.rs bindings).
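
If you want to try the same route, the Python side is just tracing to TorchScript; the saved file is what tch.rs (via tch::CModule::load) can run. Model and shapes below are placeholders:

    # Export a model to TorchScript so libtorch (and tch.rs) can load it.
    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(80, 128), nn.Tanh(), nn.Linear(128, 80)).eval()
    example = torch.randn(1, 80)               # example input for tracing
    traced = torch.jit.trace(model, example)
    traced.save("model.pt")                    # in Rust: tch::CModule::load("model.pt")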

I'm currently trying to adapt MelGAN to the voice conversion problem domain so I can get real-time, high-fidelity VC without using a classical vocoder. WORLD works well and is fast, but it's a poor substitute for the real thing, since it only models the fundamental frequency, spectral envelope, and aperiodicity. MelGAN is super high quality and faaast.
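
For the curious, here's roughly what that three-parameter decomposition looks like with the pyworld bindings; the input is a synthetic tone just to keep the sketch self-contained:

    # WORLD analysis/synthesis: everything passes through just three features.
    import numpy as np
    import pyworld

    fs = 16000
    x = np.sin(2 * np.pi * 220 * np.arange(fs) / fs).astype(np.float64)  # 1 s tone

    f0, t = pyworld.harvest(x, fs)         # fundamental frequency contour
    sp = pyworld.cheaptrick(x, f0, t, fs)  # smoothed spectral envelope
    ap = pyworld.d4c(x, f0, t, fs)         # band aperiodicity

    # A VC system modifies these three and resynthesizes; anything the
    # parameterization can't represent is lost, which is the quality ceiling
    # a neural vocoder like MelGAN avoids.
    y = pyworld.synthesize(f0, sp, ap, fs)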


Are you working on VC (input: speech of one speaker; output: the same spoken content, but sounding like another speaker) or speaker-adaptive speech synthesis (input: text; output: speech)?

Also check out ParallelWaveGAN, another high-quality and very fast (on CPU) neural vocoder.


> You do not need to be a data scientist. Anybody can do it. That said, a good GPU will help a lot. I'm using two 1080 Tis in SLI and they're pretty decent.

You can also use Google Colab for a free GPU/TPU.



