Then you don't understand Machine Learning in any real way. Literally the 3rd or 4th thing you learn about ML is that for any given problem, there is an ideal model size. Just making the model bigger doesn't work because of something called the curse of dimensionality. This is something we have discovered about every single problem and type of learning algorithm used in ML. For LLMs, we probably moved past the ideal model size about 18 months ago. From the POV of someone who actually learned ML in school (from the person who coined the term), I see no real reason to think that AGI will happen based on the current techniques. Maybe someday. Probably not anytime soon.
PS The first thing you learn about ML is to compare your models to random to make sure the model didn't degenerate during training.
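That sanity check is easy to sketch. Here is a minimal, illustrative version assuming a classifier exposed as a `predict` callable and a labeled test set; `baseline_check` and its parameters are my own names, not a standard API:

```python
def baseline_check(predict, X_test, y_test, n_classes, margin=0.05):
    """Compare a trained model's accuracy against the random-guess baseline.

    A model whose accuracy sits at or below 1/n_classes (plus a small
    margin) has likely degenerated during training.
    """
    correct = sum(predict(x) == y for x, y in zip(X_test, y_test))
    accuracy = correct / len(y_test)
    random_baseline = 1.0 / n_classes
    return accuracy, accuracy > random_baseline + margin

# Illustrative usage: a degenerate model that always predicts class 0,
# evaluated on a balanced 4-class test set.
X = list(range(100))
y = [i % 4 for i in X]
acc, ok = baseline_check(lambda x: 0, X, y, n_classes=4)
print(acc, ok)  # 0.25 False -- no better than random guessing
```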
Doesn't sound like you paid all that much attention when learning ML. The curse of dimensionality doesn't say that every problem has an ideal model size; it says that the amount of data needed to train scales with the size of the feature space.
So if you take an LLM, you can make the network much larger, but if you don't increase the size of the input token vocabulary you aren't even subject to the curse of dimensionality.
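A quick back-of-the-envelope illustration of that scaling, under a toy assumption that you want a fixed grid resolution per input axis (the numbers are purely illustrative):

```python
def cells_to_cover(dim, bins_per_axis=10):
    """Grid cells needed to cover [0,1]^dim at a fixed per-axis resolution.

    To keep the same density of training examples per cell, the amount of
    data must grow like bins_per_axis ** dim -- exponential in the number
    of input features, not in the number of model parameters.
    """
    return bins_per_axis ** dim

for d in (1, 2, 3, 10):
    print(d, cells_to_cover(d))
# dim 1 needs 10 cells, dim 3 needs 1000, dim 10 already needs 10 billion
```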
Beyond that, there's a principle in ML theory that larger models are almost always better: the number of params in the model is the dimensionality of the space in which you're running gradient descent, and with every added dimension, local optima become rarer.
> Literally the 3rd or 4th thing you learn about ML is that for any given problem, there is an ideal model size.
From my understanding, this is now outdated. The deep double descent research showed that although performance drops past a certain point as you increase model size, if you keep increasing it there is another threshold where it paradoxically starts improving again. From that point onwards, increasing the parameter count only further improves performance.
That isn't what that research says at all. What that research says is that running the same training data through the model multiple times improves training. There is still an ideal model size, though; it is just impacted by the total volume of training data.
https://arxiv.org/pdf/1912.02292
"We show that a variety of modern deep learning tasks exhibit a "double-descent" phenomenon where, as we increase model size, performance first gets worse and then gets better."
That is the first sentence of the abstract. The first graph shown in the paper backs it up.
Looking into it further, it seems that typical LLMs are in the first-descent regime anyway, so my original point isn't too relevant for them. It also looks like the second descent doesn't always reach a lower loss than the first; it appears to depend on other factors as well.
Um, what? Are you interpreting scaling to mean adding parameters and nothing else?
I'm not entirely sure where you get your confidence that we've passed the ideal model size, but at least that's a clear prediction, so you should be able to tell if and when you are proven wrong.
Just for the record, do you care to put an actual number on something we won't go past?
[edit]
Vibe check on user comes out as
Contrarian 45%
Pedantic 35%
Skeptical 15%
Direct 5%
>When I last checked, of over 10k posts, it only uses a few dozen to calculate that score, so it is about as reliable as dowsing.
A few samples are sufficient when the signal is strong enough. The time-spent pie chart definitely reflects more of what the user has been doing recently.
Overall, not everybody comes out the same. Pedantry is strong, which I'm not really surprised about for a forum like this, but some users have personality traits of sufficient magnitude that you can guess what the result will be.
I ran the same check on the last 10 users who posted comments on HN.
Obviously this won't be a representative sample of HN because it will vary by time of day and topics under discussion. It's sufficient to show that the community is not entirely homogeneous.