Hacker News

I believe this reduces the compute required, but it still uses 8 bits per value, so it does not reduce the memory needed to run inference and doesn't particularly make the models more accessible for inference. Is this storage method suitable for training? That could be an interesting application.
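A back-of-the-envelope sketch of the memory point (the model size and numbers here are illustrative, not from the paper): weight storage depends only on bits per value, so an 8-bit encoding halves memory versus fp16 but saves nothing over existing int8/fp8 quantization.

```python
def weight_memory_gb(n_params: int, bits_per_value: int) -> float:
    """Memory needed to hold the weights alone, in decimal GB."""
    return n_params * bits_per_value / 8 / 1e9

n = 7_000_000_000  # a hypothetical 7B-parameter model

print(weight_memory_gb(n, 16))  # fp16 baseline: 14.0 GB
print(weight_memory_gb(n, 8))   # this format:    7.0 GB
print(weight_memory_gb(n, 8))   # plain int8:     7.0 GB, identical footprint
```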


It is actually about 0.5 bits less efficient per weight in terms of precision/range, something the paper never highlights.



