What is the point of comparing performance of these tools to humans? Machines have been able to accomplish specific tasks better than humans since the industrial revolution. Yet we don't ascribe intelligence to a calculator.
None of these benchmarks prove these tools are intelligent, let alone generally intelligent. The hubris and grift are exhausting.
It can be reasonable to suspect that advances on benchmarks are only weakly or even negatively correlated with advances on real-world tasks. I.e., a huge jump on benchmarks might not be perceptible to 99% of users doing 99% of tasks, or some users might even notice degradation on specific tasks. This is especially the case when there is some reason to believe most benchmarks are being gamed.
Real-world use is what matters, in the end. I'd be surprised if a change this large doesn't translate to something noticeable in general, but the skepticism is not unreasonable here.
The GP comment is not skeptical of the jump in benchmark scores reported by one particular LLM. It's skeptical of machine intelligence in general, claims that there's no value in comparing their performances with those of human beings, and accuses those who disagree with this take of "hubris and grift". This has nothing to do with any form of reasonable skepticism.
I would suggest it is a phenomenon that is well studied, and has many forms. I guess it is mostly identity preservation. If you dislike AI from the start, it is generally a very strongly emotional view. I don't mean there is no good reason behind it, I mean, it is deeply rooted in your psyche, very emotional.
People are incredibly unlikely to change those sorts of views, regardless of evidence. So you find this interesting outcome where they both viscerally hate AI, but also deny that it is in any way as good as people claim.
That won't change with evidence until it is literally impossible not to change.
> What evidence of intelligence would satisfy you?
That is a loaded question. It presumes that we can agree on what intelligence is, and that we can measure it in a reliable way. It is akin to asking an atheist the same about God. The burden of proof is on the claimer.
The reality is that we can argue about that until we're blue in the face, and get nowhere.
In this case it would be more productive to talk about the practical tasks a pattern matching and generation machine can do, rather than how good it is at some obscure puzzle. The fact that it's better than humans at solving some problems is not particularly surprising, since computers have been better than humans at many tasks for decades. This new technology gives them broader capabilities, but ascribing human qualities to it and calling it intelligence is nothing but a marketing tactic that's making some people very rich.
(Shrug) Unless and until you provide us with your own definition of intelligence, I'd say the marketing people are as entitled to their opinion as you are.
I would say that marketing people have a motivation to make exaggerated claims, while the rest of us are trying to just come up with a definition that makes sense and helps us understand the world.
I'll give you some examples. "Unlimited" now has limits on it. "Lifetime" means only for so many years. "Fully autonomous" now means with the help of humans on occasion. These are all definitions that have been distorted by marketers, which IMO is deceptive and immoral.
> Machines have been able to accomplish specific tasks...
Indeed, and the specific task machines are accomplishing now is intelligence. Not yet "better than human" (and certainly not better than every human) but getting closer.
> Indeed, and the specific task machines are accomplishing now is intelligence.
How so? This sentence, like most of this field, is making baseless claims that are more aspirational than true.
Maybe it would help if we could first agree on a definition of "intelligence", yet we don't have a reliable way of measuring that in living beings either.
If the people building and hyping this technology had any sense of modesty, they would present it as what it actually is: a large pattern matching and generation machine. This doesn't mean that this can't be very useful, perhaps generally so, but it's a huge stretch and an insult to living beings to call this intelligence.
But there's a great deal of money to be made on this idea we've been chasing for decades now, so here we are.
> Maybe it would help if we could first agree on a definition of "intelligence", yet we don't have a reliable way of measuring that in living beings either.
How about this specific definition of intelligence?
Solve any task provided as text or images.
AGI would be to achieve that faster than an average human.
I still can't understand why they should be faster. Humans have general intelligence, afaik. It doesn't matter if it's fast or slow. A machine able to do what the average human can do (intelligence-wise) but 100 times slower still has general intelligence. Since it's artificial, it's AGI.