These 8/16-bit cores are rarely used for the main function of the device. They often exist as a general-purpose controller to glue together random bits of logic. For example, they might implement the logic that turns the LED blue when the device is on, flashes it blue when the power button has been held for 3 seconds, and then triggers a power-off when it has been held for 10 seconds (or whatever).
Sometimes they even sit alongside a full 32-bit ARM micro.
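To give a feel for how small this kind of glue logic is, here's a rough C sketch of the LED/power-button example above. Everything in it is made up for illustration: the `power_ctl_t` struct, the `power_ctl_tick` function, the 250 ms blink rate, and the assumption that something else (a timer interrupt, say) calls it periodically with a debounced button state. Real firmware on an 8051-class part would look different in the details.

```c
#include <stdbool.h>
#include <stdint.h>

/* Thresholds from the example above; the blink rate is an assumption. */
#define FLASH_AFTER_MS        3000u
#define POWER_OFF_AFTER_MS   10000u
#define FLASH_HALF_PERIOD_MS   250u

/* Hypothetical controller state. */
typedef struct {
    uint32_t held_ms;   /* how long the button has been held continuously */
    bool     led_on;    /* current state of the blue LED */
    bool     power_off; /* latched once the 10 s threshold is reached */
} power_ctl_t;

/* Call periodically with the debounced button state and the time elapsed
 * since the previous call. */
void power_ctl_tick(power_ctl_t *s, bool button_down, uint32_t dt_ms)
{
    if (s->power_off) {            /* once powered off, stay off */
        s->led_on = false;
        return;
    }
    if (!button_down) {            /* released: solid blue, restart timing */
        s->held_ms = 0;
        s->led_on = true;
        return;
    }

    s->held_ms += dt_ms;
    if (s->held_ms >= POWER_OFF_AFTER_MS) {
        s->power_off = true;       /* held 10 s: trigger power-off */
        s->led_on = false;
    } else if (s->held_ms >= FLASH_AFTER_MS) {
        /* held 3 s: flash, toggling every FLASH_HALF_PERIOD_MS */
        uint32_t phase = (s->held_ms - FLASH_AFTER_MS) / FLASH_HALF_PERIOD_MS;
        s->led_on = (phase % 2u) == 1u;
    } else {
        s->led_on = true;          /* short press so far: solid blue */
    }
}
```

Start from a zeroed state with `led_on` set to true and call `power_ctl_tick` every few milliseconds. It needs no heap, no floating point and only a few bytes of state, which is why a tiny core is plenty for this sort of job.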
Yes, if only because using an ARM core often means paying a licensing fee, whereas the 8051, 6502, etc. tend to come with significantly better royalty terms (in my understanding). Also, like the other post says, these aren't used for heavy computation; they're for really, really basic stuff.
You'd be amazed at the number of chips that have an 8051 in them...
They're well-known cores that have proven reliable over several decades, they're very well documented, and seemingly any bugs are already known. Also cheap.