There is so much material on deep learning basics these days that I think we can finally skip reintroducing gradient descent in every tutorial, can't we?
The idea of "find the direction in which the function decreases most quickly and go that way" is really deep, and its implementation via that cutting-edge mathematical concept, the "gradient", deserves a whole section of its own.
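To make the idea concrete, here is a minimal sketch of gradient descent on a toy function; the function, learning rate, and step count are all illustrative choices, not anything from a particular tutorial.

```python
# Gradient descent on f(x, y) = x^2 + y^2, whose gradient is (2x, 2y).
# We repeatedly step against the gradient, the direction of steepest decrease.

def grad_f(x, y):
    return 2 * x, 2 * y  # analytic gradient of x^2 + y^2

def gradient_descent(x, y, lr=0.1, steps=100):
    for _ in range(steps):
        gx, gy = grad_f(x, y)
        x -= lr * gx  # move downhill in x
        y -= lr * gy  # move downhill in y
    return x, y

x, y = gradient_descent(3.0, 4.0)
print(x, y)  # both coordinates shrink toward the minimum at (0, 0)
```

Each step multiplies both coordinates by (1 - 2*lr), so the iterate converges geometrically to the minimum, which is the whole trick in two lines of update rule.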
On one hand, you can explain it to a 5-year-old: Go in the direction which improves things.
On the other hand, we have more than a half-century of research on sophisticated mathematical methods for doing it well.
The latter isn't really helpful for beginners, and the former is easy to explain. Either way, beginners can't use the sophisticated algorithms, so you can go with something as dumb as: tweak the parameters in all directions, and go where things improve most. That works fine for toy examples.
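That "tweak in all directions" approach can be sketched in a few lines. This is just a greedy coordinate search on a made-up function; the step size and stopping rule are my own illustrative assumptions.

```python
# Greedy "tweak in all directions" descent: probe each coordinate with a
# small step in both directions and keep whichever single tweak helps most.

def f(p):
    x, y = p
    return (x - 1) ** 2 + (y + 2) ** 2  # toy function, minimum at (1, -2)

def tweak_descent(p, step=0.1, iters=1000):
    p = list(p)
    for _ in range(iters):
        best_val, best_p = f(p), p
        for i in range(len(p)):       # try nudging each coordinate...
            for d in (-step, step):   # ...in both directions
                q = list(p)
                q[i] += d
                if f(q) < best_val:
                    best_val, best_p = f(q), q
        if best_p == p:               # no tweak improves: stop
            break
        p = best_p
    return p

print(tweak_descent([0.0, 0.0]))  # walks to roughly (1, -2)
```

No gradients, no calculus, just "try everything and keep the best", which is exactly why it's fine for dummy examples and hopeless once you have millions of parameters.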
This one doesn't use any frameworks. The next book by the author (on GANs) uses PyTorch. The math is relatively easy to follow I think.
Andrew Ng's courses on Coursera can be viewed for free and have slightly more rigorous math, but it's still okay.
You don't have to understand every mathematical detail, just as you don't need every mathematical detail for 3D graphics. But knowing the basics should be good, I think!