> The objective is that at some point, there will be enough docs and improved models that the need for human review decreases, while code quality reaches a steady state that is more consistent than any human team of varying skill levels could produce
There will never be a point when human reviews are less needed. If you ever remove them, you're doomed to ship something horribly insecure at some point; please don't.
> it can help you massively improve your code cleanliness. All of the little nice-to-have features, the cleanups, the high unit test coverage, nagging bug fixes, etc., they’re all trivial to do now.
It can probably help if you write poor code without it.
High unit test coverage only means something if you carefully check those tests, and if everything was tested.
> The only way Claude can help improve your code cleanliness is if you write poor code?
No? You assert that it writes better code than the average software developer?
> Code coverage means nothing if you didn't carefully check every test? "and if everything was tested" do you know what code coverage is?
Do you know?
Code coverage only tells you what fraction of the code gets *touched* by the tests.
To achieve coverage it's enough to CALL the code; it tells you nothing about the correctness of the tests. They could all end with a `return true`, and a coverage tool would be perfectly happy.
So, yes, if you don't carefully check the test suite that the agent writes, it might well be worthless (or, more realistically, simply much less useful than you assume it to be).
With "if everything was tested" I meant that you also need to check if the agent wrote all the tests that are needed, besides verifying that the ones it wrote are correct.
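To make the coverage point concrete, here's a minimal Python sketch (the `discount` function and its bug are hypothetical, just for illustration): a test that merely calls the code achieves 100% line coverage while catching nothing.

```python
def discount(price, rate):
    # Deliberately buggy: should be price * (1 - rate)
    return price * rate

def test_discount():
    # This "test" executes every line of discount(), so a coverage
    # tool reports 100% line coverage for it, yet it asserts nothing
    # about the result, and the bug above goes completely unnoticed.
    discount(100, 0.2)
    assert True  # the "return true" equivalent: always passes

test_discount()
print("test passed, full coverage, bug undetected")
```

Any coverage tool (e.g. coverage.py) run over this would happily report the function as fully covered, which is exactly why the suite itself has to be reviewed.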
> You assert that it writes better code than the average software developer?
Absolutely. It contains a lot, if not the majority, of all the code available to us right now, and it can reason about it (whatever it means for LLMs to reason, anyway). It absolutely demolishes the average software developer, and it's not even close.
> To achieve coverage it's enough to CALL the code; it tells you nothing about the correctness of the tests. They could all end with a `return true`, and a coverage tool would be perfectly happy.
> So, yes, if you don't carefully check the test suite that the agent writes, it might well be worthless (or, more realistically, simply much less useful than you assume it to be).
That’s like saying that if you don’t check every line your coworker writes, their code becomes worthless.
A huge portion of the pro-AI crowd is motivated by irrational hype and delusion.
LLMs are not a tool like an editor or an IDE, they make up code in an unpredictable way; I can't see how anyone who enjoyed software development could like that.
Pretty much anyone who's not you will write code in an unpredictable way. I review other people's code, and I go "really, you decided to do it that way?" quite often, especially with coders who have fewer years of experience than me.
That's kind of how this is starting to feel to me when it comes to AI: like I'm turning into a project manager who has to review code from a team of juniors. Although those juniors are now starting to show that they're more capable than even I am, especially when it comes to speed.