I don't know if this is the one, but something like this is clearly the future IMO. We need more levels of hierarchy to generalize efficiently to longer sequences with high-level structure. Back when Byte Latent Transformers came out, I thought extending the idea to more levels of hierarchy was the way to go, and this seems to be basically that?

Another article about H-Nets: https://main-horse.github.io/posts/hnet-inf/



Yes... This seems like a generalization of "large concept models" in a certain way.



