
And we're still in the expansion phase, so LLM life is actually good... for now.



It's not going to get worse than it is now, though. Open models like GLM 5 are very good. Even if companies decide to crank up the costs, the current open models will still be available. They will likely get cheaper to run over time as well (better hardware).

That's good to hear. I'm not really up-to-date on the open models, but they will become essential, I'm sure.

>Open models like GLM 5 are very good. Even if companies decide to crank up the costs, the current open models will still be available.

https://apxml.com/models/glm-5

To run GLM-5 you need access to many, many consumer-grade GPUs, or multiple data-center-class GPUs.

>They will likely get cheaper to run over time as well (better hardware).

Unless they magically solve the problem of chip scarcity, I don't see this happening. VRAM is king, and to get more of it you have to pay a lot more. Take the RTX 3090 as an example. The card is ~6 years old now, yet it still runs you around $1.3k. If you wanted to run GLM-5 at I4 quantization (the lowest listed in the link above) with a 32k context window, you would need *32 RTX 3090s*. That's $42k you'd be spending on obsolete silicon. If you wanted to run this on newer hardware, you could reasonably expect to double that figure.
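For anyone who wants to sanity-check numbers like this, here's a rough back-of-envelope in Python. The parameter count, KV-cache size, and overhead factor are illustrative assumptions on my part, not published GLM-5 specs:

    # Back-of-envelope VRAM estimate for serving a big model locally.
    # Every constant below is an assumption for illustration, not a
    # published GLM-5 spec.
    PARAMS_B = 700        # assumed total parameters, in billions
    BITS_PER_WEIGHT = 4   # I4 quantization ~= 4 bits per weight
    KV_CACHE_GB = 40      # assumed KV cache for a 32k context window
    OVERHEAD = 1.1        # ~10% for activations, buffers, fragmentation
    GPU_VRAM_GB = 24      # RTX 3090

    weights_gb = PARAMS_B * BITS_PER_WEIGHT / 8
    total_gb = (weights_gb + KV_CACHE_GB) * OVERHEAD
    gpus = -(-total_gb // GPU_VRAM_GB)   # ceiling division

    print(f"{weights_gb:.0f} GB weights, {total_gb:.0f} GB total, "
          f"{gpus:.0f}x RTX 3090")

This naive estimate (350 GB weights, ~429 GB total, 18 cards) comes out below the 32-GPU figure in the link; real multi-GPU serving loses VRAM to parallelism buffers and uneven sharding, so a higher figure is plausible. Either way, the order of magnitude is the same: hundreds of GB of VRAM.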


I mean, it would make sense to see this as a hardware investment in a virtual employee that you actually control (or rent from someone who makes that possible for you), not as a private assistant. Ballparking your numbers, I think we'd need at least an order-of-magnitude price-performance improvement for that.

Also, how much bang for the buck do those 3090s actually give you compared to enterprise-grade products?
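Quite a lot on raw specs, actually. Quick comparison below; the VRAM and bandwidth figures are the published specs, but the prices are rough assumptions of mine and fluctuate a lot:

    # Crude price-performance comparison. Specs are published numbers;
    # prices are rough assumptions and move around a lot.
    cards = {
        #             VRAM GB  bandwidth GB/s  assumed price $
        "RTX 3090":   (24,     936,            1_300),
        "A100 80GB":  (80,     2_000,          17_000),
        "H100 SXM":   (80,     3_350,          28_000),
    }
    for name, (vram, bw, price) in cards.items():
        print(f"{name:10}  {vram * 1000 / price:5.1f} GB VRAM per $1k  "
              f"{bw * 1000 / price:5.0f} GB/s per $1k")

Since token generation is mostly memory-bandwidth bound, the used 3090 actually wins per dollar on both VRAM (~18.5 vs ~3-5 GB per $1k) and bandwidth (~720 vs ~120 GB/s per $1k). Where the enterprise cards earn their price is density, NVLink interconnect, and power draw, which start to matter a lot at 32-GPU scale.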



