They are using the current models to help develop even smarter models. Each gene...

lm28469 · 2026-02-12T19:54:25 1770926065

I must be holding these things wrong because I'm not seeing any of these godlike superpowers everyone seems to enjoy.

brokencode · 2026-02-12T20:32:05 1770928325

Who said they’re godlike today?

And yes, you are probably using them wrong if you don’t find them useful or don’t see the rapid improvement.

lm28469 · 2026-02-12T20:43:02 1770928982

Let's come back in 12 months and discuss your singularity then. Meanwhile I spent like $30 on a few models as a test yesterday, and none of them could tell me why my goroutine system was failing, even though it was painfully obvious (I purposefully added one too many wg.Done). Gemini, Codex, MiniMax 2.5: they all shat the bed on a very obvious problem, but I am to believe they're 98% conscious and better at logic and math than 99% of the population.

With every new model release, neckbeards come out of their basements to tell us the singularity will be here in two more weeks.

BeetleB · 2026-02-12T21:25:52 1770931552

On the flip side, twice I put about 800K tokens of code into Gemini and asked it to find why my code was misbehaving, and it found it.

The logic related to the bug wasn't all contained in one file, but across several files.

This was Gemini 2.5 Pro. A whole generation old.

Izikiel43 · 2026-02-12T21:22:32 1770931352

Out of curiosity, did you give a test for them to validate the code?

I had a test failing because I introduced a silly comparison bug (> instead of <), and claude 4.6 opus figured out that the problem wasn't the test but the code, and fixed the bug (which I had missed).

lm28469 · 2026-02-12T21:34:31 1770932071

There was a test and a very useful golang error that literally explained what was wrong. The models tried implementing a solution, failed, and when I pointed out the error most of them just rolled back the "solution".
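For readers who haven't hit this failure mode, here is a minimal sketch of what one extra wg.Done tends to look like. The original file was never posted, so `runWorkers` and its parameters are hypothetical, not the code under discussion. The sketch recovers the panic only so that the runtime's message can be printed.

```go
package main

import (
	"fmt"
	"sync"
)

// runWorkers is a hypothetical stand-in for the unposted code.
// It starts n goroutines and waits for them. When extraDone is true it
// calls wg.Done one time too many, which makes the WaitGroup counter
// negative and triggers a runtime panic. The panic is recovered and
// returned as a string so the message can be inspected.
func runWorkers(n int, extraDone bool) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()

	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done() // one Done per Add: correct
		}()
	}
	wg.Wait() // counter is back to zero here

	if extraDone {
		wg.Done() // assumed bug: one Done too many
	}
	return "ok"
}

func main() {
	fmt.Println(runWorkers(3, false))
	fmt.Println(runWorkers(3, true))
}
```

In the buggy case the Go runtime reports `sync: negative WaitGroup counter`, which is presumably the kind of self-explanatory error mentioned above.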

frde_me · 2026-02-13T01:53:14 1770947594

What exact models were you using? And with what settings? 4.6 / 5.3 codex both with thinking / high modes?

lm28469 · 2026-02-13T09:28:00 1770974880

minimax 2.5, kimi k2.5, codex 5.2, gemini 3 flash and pro, glm 4.7, devstral2 123b, etc.

Izikiel43 · 2026-02-12T22:10:58 1770934258

Ok, thanks for the info

brokencode · 2026-02-12T20:52:19 1770929539

You are fighting straw men here. Any further discussion would be pointless.

lm28469 · 2026-02-12T21:39:02 1770932342

Of course, n-1 wasn't good enough but n+1 will be the singularity, just two more weeks my dudes, two more weeks... rinse and repeat ad infinitum

brokencode · 2026-02-12T21:52:01 1770933121

Like I said, pointless strawmanning.

You’ve once again made up a claim of “two more weeks” to argue against even though it’s not something anybody here has claimed.

If you feel the need to make an argument against claims that exist only in your head, maybe you can also keep the argument only in your head too?

tom_ · 2026-02-13T01:52:27 1770947547

It's presumably a reference to this saying: https://www.urbandictionary.com/define.php?term=2%20more%20w...

virgildotcodes · 2026-02-13T01:34:27 1770946467

Mind sharing the file?

Also, did you use Codex 5.3 Xhigh through the Codex CLI or Codex App?

goodmythical · 2026-02-13T18:22:39 1771006959

I think you're being awfully generous to the average human.

Consider that a nonzero percent of otherwise competent adults can't write in their native language.

Consider that some tens of percent of people wouldn't have the foggiest idea of how to calculate a square root, let alone a cube root.

Consider that well less than half of the population has ever seen code let alone produced functioning code.

The average adult is strikingly incapable of things that the average commenter here would consider basic skills.

woah · 2026-02-12T21:12:22 1770930742

Post the file here

antonvs · 2026-02-13T01:38:48 1770946728

> I purposefully added one too many wg.Done

What do you believe this shows? Sometimes I have difficulty finding bugs in other people's code when they do things in ways I would never use. I can rewrite their code so it works, but I can't necessarily quickly identify the specific bug.

Expecting a model to be perfect on every problem isn't reasonable. No known entity is able to do that. AIs aren't supposed to be gods.

(Well not yet anyway - there is as yet insufficient data for a meaningful answer.)

laurentiurad · 2026-02-13T13:15:22 1770988522

When companies claim that AI writes 90% of their code, you would expect such a system to find obvious issues. Expectations are really high when you see statements such as the ones coming from the CEOs of the AI labs. When reality falls short of those expectations, such reactions are to be expected. It's the same proportionality on both sides.

SpicyLemonZest · 2026-02-13T01:36:09 1770946569

It's hard to evaluate "logic" and "math", since they're made up of many largely disparate things. But I think modern AI models are clearly better at coding, for example, than 99% of the population. If you asked 100 people at your local grocery store why your goroutine system was failing, do you think multiple of them would know the answer?

logicprog · 2026-02-12T20:50:18 1770929418

Meanwhile I've been using Kimi K2T and K2.5 to work in Go with a fair amount of concurrency, and it's been able to write concurrent Go code and debug issues with goroutines equal to, and much more complex than, your issue, involving race conditions and more, just fine.

Projects:

https://github.com/alexispurslane/oxen

https://github.com/alexispurslane/org-lsp

(Note that org-lsp has a much improved version of the same indexer as oxen; the first was purely my design, the second I decided to listen to K2.5 more and it found a bunch of potential race conditions and fixed them)

shrug

viking123 · 2026-02-13T09:47:14 1770976034

It's basically a bunch of people who see themselves as too smart to believe in God; instead they have just replaced it with AI and the Singularity and attribute similar stuff to it, e.g. eternal life, which is just heaven in religion. Amodei was hawking doubling of human lifespan to a bunch of boomers not too long ago. Ponce de León also went to search for the fountain of youth. It's a very common theme across human history. AI is just the new iteration where they mirror all their wishes and hopes.

brokencode · 2026-02-13T16:45:43 1771001143

You realize that science and technology do in fact produce medical breakthroughs that cure disease, right?

On the other hand, prayer doesn’t heal anybody and there’s no proof of supernatural beings.

viking123 · 2026-02-13T20:18:28 1771013908

The boomers he was talking to will be long underground before we will have any major cures for the diseases they will die from lmao. Maybe in 200 years?

Btw, so will you and I most likely.

mrandish · 2026-02-13T00:32:41 1770942761

> using the current models to help develop even smarter models.

That statement is plausible. However, extrapolating that to assert all the very different things which must be true to enable any form of 'singularity' would be a profound category error. There are many ways in which your first two sentences can be entirely true, while your third sentence requires a bunch of fundamental and extraordinary things to be true for which there is currently zero evidence.

Things like LLMs improving themselves in meaningful and novel ways and then iterating that self-improvement over multiple unattended generations in exponential runaway positive feedback loops resulting in tangible, real-world utility. All the impressive and rapid achievements in LLMs to date can still be true while major elements required for Foom-ish exponential take-off are still missing.

sekai · 2026-02-12T21:29:47 1770931787

> I don’t think it’s hyperbolic to say that we may be only a single digit number of years away from the singularity.

We're back to singularity hype, but let's be real: benchmark gains are meaningless in the real world when the primary focus has shifted to gaming the metrics

brokencode · 2026-02-12T21:44:17 1770932657

Ok, here I am living in the real world finding these models have advanced incredibly over the past year for coding.

Benchmaxxing exists, but that’s not the only data point. It’s pretty clear that models are improving quickly in many domains in real world usage.

toraway · 2026-02-13T04:24:44 1770956684

I use agentic tools daily and SOTA models have certainly improved a lot in the last year. But still in a linear, "they don't light my repo on fire as often when they get a confusing compiler error" kind of way, not a "I would now trust Opus 4.6 to respond to every work email and hands-off manage my banking and investment portfolio" kind of way.

They're still afflicted by the same fundamental problems that hold LLMs back from being a truly autonomous "drop-in human replacement" that would enable an entire new world of use cases.

And finally live up to the hype/dreams many of us couldn't help but feel were right around the corner circa 2022/3 when things really started taking off.

mrbungie · 2026-02-12T23:22:59 1770938579

Yet even Anthropic has shown the downsides to using them. I don't think it is a given that improvements in model scores and capabilities + being able to churn out code as fast as we can will lead us to a singularity; we'll need more than that.

Freedom2 · 2026-02-13T04:27:30 1770956850

I agree completely. I think we're in alignment with Elon Musk who says that AI will bypass coding entirely and create the binary directly.

It's going to be an exciting year.

baq · 2026-02-13T06:25:57 1770963957

There’s about as much sense in doing this as there is in putting datacenters in orbit, i.e. it isn’t impossible, but literally any other option is better.