> I don’t think it’s hyperbolic to say that we may be only a single digit number of years away from the singularity.
We're back to singularity hype, but let's be real: benchmark gains are meaningless in the real world when the primary focus has shifted to gaming the metrics.
I use agentic tools daily and SOTA models have certainly improved a lot in the last year. But still in a linear, "they don't light my repo on fire as often when they get a confusing compiler error" kind of way, not a "I would now trust Opus 4.6 to respond to every work email and hands-off manage my banking and investment portfolio" kind of way.
They're still afflicted by the same fundamental problems that hold LLMs back from being a truly autonomous "drop-in human replacement" that would enable an entire new world of use cases.
And that would finally live up to the hype/dreams many of us couldn't help feeling were right around the corner circa 2022/3, when things really started taking off.
Yet even Anthropic has shown the downsides to using them. I don't think it is a given that improvements in model scores and capabilities + being able to churn out code as fast as we can will lead us to a singularity. We'll need more than that.
There’s about as much sense doing this as there is in putting datacenters in orbit, i.e. it isn’t impossible, but literally any other option is better.