Google seems to be on a hot streak with their models, and, since they're playing from behind, I'd expect favorable pricing and terms. But, I don't know anyone who is using or talking about Gemini. All the chatter seems to be Anthropic vs. OpenAI.
Because Gemini, despite what the stats say, still produces garbage once the problem gets harder. It nails lab conditions, but on messy reality, creativity, or even code quality it is a far cry from opus or the latest gpt5.4. And always has been. It's pretty good inside the GSuite because of integrations, but standalone it's near worthless compared to even grok-code-fast, which doesn't think much at all (but damn it is fast). At this point Google keeps throwing noodlepots with AI against every wall in reach to see what sticks, which is more a kind of desperation that still works to increase Wall Street highscores, but not exactly a streak or breakthrough. Just rapid-fire shotgun launches to see if anything sticks. No one serious talks about Gemini because it's still not even worth considering for real things outside shiny presentations and artificial benchmarks.
Gemini schools the other two when doing code reviews.
I used to think tokens were a commodity, but it's becoming clear that the jagged frontier is different enough, even for the easiest use case of SWE, that there's room for two if not three providers of different foundation models. It isn't winner-takes-all; they're all winning together. Cursor isn't properly taking advantage of the situation yet.
My experience exactly. The more "real" the problems become, the more other models become unsuitable when compared to Claude, with the sole exceptions being DeepSeek/Kimi, which, while not better strictly w.r.t. metrics and basic tasks, are more interesting and handle odd and totally out-of-domain stuff better than the US models. An example: code I wrote for a hypercomplex sedenion-based artificial neural network broke Claude so badly it started saying it is ChatGPT and can't evaluate/run code. Similar experience for all US models, which are characterized by being extremely brittle at the fringes, though Claude least among them. Meanwhile Chinese models are less capable for cookie-cutter stuff but keep swinging when things get really weird and unusual. It's like US models optimize for the lowest minima achievable, and god help you if the distribution changes. Chinese models, on the other hand, seem to optimize for the flattest minima, giving poorer quality across the board but far more robust behaviour.