
which bits of this do you think llm based agents can't do?


Not get stuck on an incorrect train of thought; not ignore core instructions in favour of training data (e.g. breaking naming conventions across sessions or long contexts); not confidently state "I completely understand the problem and this will definitely work this time" for the fifth time without actually checking. I could go on.


LLMs by their nature are not goal-oriented (this is a fundamental difference between reinforcement learning and plain neural networks, for example). So a human will have, let's say, the ultimate goal of creating value with a web application they create ("save me time!"). The LLM has no concept of that. It's trying to complete a spec as best it can, with no knowledge of the goal. Even if you tell it the goal, it has no concept of the process needed to achieve the goal or to confirm it was attained; you have to tell it that too.


The main thing they cannot do is be held accountable for any decisions, which makes them not trustworthy.


This is not correct. They can say "sorry", which makes them as accountable as an ordinary developer.


That's not what accountability is


Accountability: "Something that SWEs run screaming from".

Example: "We should have professional accountability in software"

SWE: "This would bring about the end of the world!!!1!"


The economics of software development have lowered the bar for software engineers: there simply aren't enough people who are good at it (or even want to be), and the salaries are very high, so plenty of people who shouldn't be SWEs are.

I am a software engineer, and I would absolutely love to see more professional accountability in this field. Unfortunately, it would make the cost of software go up significantly (because many, many people writing software would be ejected from the industry).


I've found recent versions of Claude and Codex to be reluctant in this regard. They will recognise the problem they created a few minutes ago, but often behave as if someone else did it. In many ways that's true, though, I suppose.


Does it do this for really cut-and-dry problems? I've noticed that ChatGPT will put a lot of effort into (retroactively) "discovering" a basically-valid alternative interpretation of something it said previously, if you object on good grounds. Like it's trying to evade admitting that it made a mistake, but also find some way to satisfy your objection. Fair enough, if slightly annoying.

But I have also caught it on straightforward matters of fact and it’ll apologize. Sometimes in an over the top fashion…


Ordinary developers get fired for poor performance *all the time*.


LLM-based solutions don't need to stay dry and warm at night, with a full belly, possibly with their sexual partner with whom they have a drive to procreate.


any of them.



