Apple would need to stick an M4 in the next iPhone to even hope to run something like this, and I'd bet GPT-4o would run slowly, poorly, or not at all even on a top-spec M4.
Of course, GPT-4, or even GPT-3, is impossible to run on any consumer product. As far as I know, GPT-4 is an ensemble of several models, each huge by itself, with enormous hardware requirements.
But there are plenty of smaller LLMs, and my point is that those models can already run on mobile phones.
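To make that concrete, here's a minimal sketch of running a small quantized model locally with the llama-cpp-python bindings (the underlying llama.cpp runtime has also been ported to iOS and Android); the model path is a hypothetical placeholder, and the assumption is a GGUF file small enough for phone-class RAM:

    # Minimal sketch: run a small quantized LLM via llama-cpp-python.
    # The model path below is a placeholder; you'd supply a real GGUF file.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./models/llama-7b-q4.gguf",  # ~4 GB at 4-bit quantization
        n_ctx=512,    # small context window to keep memory use down
        n_threads=4,  # mobile-class CPU core count
    )
    out = llm("Q: What is the capital of France? A:", max_tokens=16, stop=["Q:"])
    print(out["choices"][0]["text"])

Something like this already runs at usable speeds on Apple Silicon, which is the same hardware family as recent iPhone chips.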
Where do you draw the line? GPT-2 was introduced as an LLM, and you can easily run it on devices far more limited than a recent iPhone. Did it stop being an LLM when bigger models were released? Is Llama 7B an LLM or an "SLM"?
Relatively speaking. It's like how a machine that counted as a supercomputer 30 years ago is matched by a cheap Android phone in your pocket today.
You can certainly run a transformer model, or any other neural-network-based model, on an iPhone; Siri is probably some kind of neural network. But obviously a model running on-device is nowhere near comparable to the current state-of-the-art LLMs. You can't fit a $40k GPU in your pocket (yet).
A transformer running on an iPhone would be roughly two orders of magnitude smaller than the state-of-the-art LLM (GPT-4, rumored to have on the order of a trillion parameters).
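For rough scale, here's the back-of-envelope memory math, assuming the rumored (unconfirmed) ~1T parameter count for GPT-4 against a 7B phone-sized model:

    # Back-of-envelope: weight memory = parameter count * bytes per parameter.
    def weight_gb(params: float, bytes_per_param: float) -> float:
        return params * bytes_per_param / 1e9

    models = {"~1T (rumored GPT-4)": 1e12, "7B (e.g. Llama 7B)": 7e9}
    for name, params in models.items():
        for fmt, b in [("fp16", 2.0), ("int4", 0.5)]:
            print(f"{name} @ {fmt}: {weight_gb(params, b):,.0f} GB")
    # ~1T: 2,000 GB fp16 /  500 GB int4 -- far beyond any phone
    #  7B:    14 GB fp16 /    4 GB int4 -- within reach of ~8 GB phone RAM

Even aggressively quantized, a trillion-parameter model needs hundreds of gigabytes just for weights, while a 4-bit 7B model squeezes into phone RAM.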