Hacker News

There is so much material on deep learning basics these days that I think we can finally skip reintroducing gradient descent in every tutorial, can't we?


The idea of "find the direction in which the function decreases most quickly and go that way" is really deep, and its implementation via this cutting-edge mathematical concept of a "gradient" deserves a whole section as well.


It's both really shallow and really deep.

On one hand, you can explain it to a 5-year-old: Go in the direction which improves things.

On the other hand, we have more than a half-century of research on sophisticated mathematical methods for doing it well.

The latter isn't really helpful for beginners, and the former is easy to explain. Beginners can't make use of the sophisticated algorithms anyway, so you can go with something as simple as: tweak the parameters in every direction, and move where the function improves most. That works fine for toy examples.
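The "tweak in all directions" idea can be sketched in a few lines: estimate each partial derivative by finite differences and step downhill. This is a toy illustration, not how real frameworks do it (the function and step size here are made up for the example):

```python
def grad_numeric(f, x, eps=1e-6):
    # Estimate each partial derivative with a central difference:
    # (f(x + eps*e_i) - f(x - eps*e_i)) / (2*eps)
    g = []
    for i in range(len(x)):
        up = list(x); up[i] += eps
        dn = list(x); dn[i] -= eps
        g.append((f(up) - f(dn)) / (2 * eps))
    return g

def descend(f, x, lr=0.1, steps=100):
    # Repeatedly step against the (estimated) gradient.
    for _ in range(steps):
        g = grad_numeric(f, x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x

# Toy bowl with its minimum at (1, -2).
f = lambda x: (x[0] - 1) ** 2 + (x[1] + 2) ** 2
print(descend(f, [5.0, 5.0]))  # converges near [1.0, -2.0]
```

It scales terribly (one function evaluation pair per parameter per step), which is exactly why autodiff exists, but for dummy examples it demonstrates the whole idea.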


Look no further: https://jax.readthedocs.io/en/latest/autodidax.html

"Autodidax: JAX core from scratch" walks you through it in detail.
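For a taste of the core trick before diving into Autodidax's tracing machinery, here is a toy forward-mode autodiff using dual numbers (a hand-rolled sketch, not Autodidax's actual approach, which builds a full tracer/interpreter stack):

```python
class Dual:
    """A number paired with its derivative (tangent)."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)
    __rmul__ = __mul__

def grad(f):
    # Derivative of a scalar function: seed the input's tangent with 1.
    return lambda x: f(Dual(x, 1.0)).dot

df = grad(lambda x: 3 * x * x + 2 * x)  # d/dx (3x^2 + 2x) = 6x + 2
print(df(4.0))  # 26.0
```

Derivatives fall out of ordinary Python arithmetic via operator overloading; JAX's real `grad` is reverse-mode and works on whole traced programs, which is what the tutorial walks through.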


Any favorites you can share?


I'd recommend this short book: https://www.amazon.com/gp/aw/d/B01EER4Z4G/ (Make your own neural network by Tariq Rashid)

This one doesn't use any frameworks. The next book by the author (on GANs) uses PyTorch. The math is relatively easy to follow I think.

Andrew Ng's courses on Coursera can be viewed for free and have slightly more rigorous math, but still okay.

You don't have to understand every mathematical detail, just as you don't need every mathematical detail for 3D graphics. But knowing the basics should be good, I think!



