> they're almost always people who already had some pull toward software
I think this is probably true, and basically how I got into software myself.
I always dabbled in writing software and things for the web, but for some reason I never thought studying computer science would be any fun, and a career as a software developer sounded boring. But then I got an actual full-time office job, and oh boy, did my perspective change fast.
That first job had nothing to do with writing software at all. But I saw people struggle with things that seemed trivial to automate, such as annotating paper bank statements and entering them into the system line by line. The bookkeeping system did support electronic bank statements, but it lacked the features to match certain descriptions to certain cost centers, so in the end the paper route really was faster... It took me a couple of hours to write something that saved hours every week, and that basically kick-started my software career.
Would AI have made much of a difference here? Yes, in terms of getting to the correct solution faster, but probably not in terms of who would have done that. People would still come to the person who came up with the solution to ask for maintenance and new features.
I use voxtype on my Linux machine with parakeet. Super fast and regularly even gets the tech lingo correct. You can configure prompts and keywords to help with that as well.
Radeon R9700 with 32 GB VRAM is relatively affordable for the amount of RAM and with llama.cpp it runs fast enough for most things. These are workstation cards with blower fans and they are LOUD. Otherwise if you have the money to burn get a 5090 for speeeed and relatively low noise, especially if you limit power usage.
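If you go the 5090 route, lowering the power cap is a one-liner with `nvidia-smi`. The 400 W figure below is illustrative, not a recommendation; query your card's supported range first.

```shell
# Show the current, default, and min/max enforceable power limits
nvidia-smi -q -d POWER

# Cap the card at 400 W (hypothetical value; must fall inside
# the min/max range reported above). Requires root.
sudo nvidia-smi -pl 400
```

The setting resets on reboot unless you reapply it (e.g. via a systemd unit), and a modest cap usually costs only a few percent of throughput while cutting noise noticeably.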
I have a pair of Radeon AI PRO R9700 with 32 GB, and so far they have been a pleasure to use. Drivers work out-of-the-box, and they are completely quiet when unused. They are capped at 300 W power, so even at 100% utilization they are not too loud.
I was thinking about adding after-market liquid cooling for them, but they're fine without it.
Wouldn't be surprised if they slowly start quantizing their models over time. Makes it easier to scale and reduce operational cost. Also makes a new release have more impact as it will be more notably "better" than what you've been using the past couple of days/weeks.
I don't think so. There are other knobs they can tweak to reduce load that affect quality less than quantizing. Like trimming the conversation length without telling you, reducing reasoning effort, etc.
You said "like that", OK, but there may be some truth to reduced model intelligence. Also, the Anthropic models AWS deploys for Amazon's Kiro feel much dumber than those served entirely by Anthropic. Can't be just me.
Anthropic does not exactly act like they're constrained by infra costs in other areas, and noticeably degrading a product when you're in tight competition with 1 or 2 other players with similar products seems like a bad place to start.
I think people just notice the flaws in these models more the longer they use them. Aka the "honeymoon-hangover effect," a real pattern that has been shown in a variety of real world situations.
Open weights models such as GPT-OSS, Kimi K2.x are trained with 4 bit layers. So it wouldn't come as a surprise if the closed models do similar things. If I compare Kimi K2.5 and Opus 4.5 on openrouter, output tokens are about 8x more expensive for Opus, which might indicate Opus is much larger and doesn't quantize, but the claude subscription plans muddy the waters on price comparison a lot.
Oooff yes I think that is exactly the kind of shenanigans they might pull.
Ultimately I can understand it: if a new model ships without as much optimization, that adds pressure to cut costs on the older models that achieve the same result.
Nice plausible deniability for a convenient double effect.
I haven't noticed much difference in Claude, but I swear gemini 3 pro preview was better in the first week or two and later started feeling like they quantized it down to hell.
You're understating past performance as much as you're overstating current performance.
One year ago I already ran qwen2.5-coder 7B locally for pretty decent autocomplete. And I still use it today as I haven't found anything better, having tried plenty of alternatives.
Today I let LLM agents write probably 60-80% of the code, but I frequently have to steer and correct them, and that final 20% still takes 80% of the time.
These LLM benchmarks are like interviews for software engineers. They get drilled on advanced algorithms for distributed computing and they ace the questions. But then it turns out the job is to add a button to the user interface, and they use new Tailwind classes instead of reusing the existing ones, so it's just not quite right.
I use llama.vim with llama.cpp and the qwen2.5-coder 7B model. It easily fits on a 16 GB GPU and is fast even on a tiny RTX 2000 card with 70 watts of power. The quality of completions is good enough for me; if I want something more sophisticated I use something like Codex.
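For anyone wanting to try this setup: llama.vim talks to a local `llama-server` over its FIM endpoint. A minimal launch sketch, assuming a downloaded GGUF of the base (non-instruct) coder model; the filename is illustrative, and llama.vim expects port 8012 by default, so check its README for the currently recommended flags:

```shell
# Serve qwen2.5-coder 7B for llama.vim's fill-in-the-middle completions.
# -ngl 99 offloads all layers to the GPU; --cache-reuse enables
# context shifting so repeated completions in the same buffer stay fast.
llama-server \
  -m qwen2.5-coder-7b-q8_0.gguf \
  --port 8012 \
  -ngl 99 \
  --ctx-size 8192 \
  --cache-reuse 256
```

With the server running, the Vim plugin just needs to point at `http://127.0.0.1:8012` (its default), and completions appear as ghost text.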