
Does that imply they retrained the foundation model from scratch? I thought changing the tokenization was something you couldn't really retrofit to an existing model. I mean, sure, they might have initialized the weights from the prior GPT-4 model, but it'd still require a lot of retraining.
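
To make the concern concrete, here is a rough toy sketch of why a tokenizer swap is hard to retrofit. Everything in it is made up for illustration: the two vocabularies, the 4-dimensional embeddings, and the `encode` and `warm_start` helpers are hypothetical and are not how GPT-4 or any real tokenizer works. The point is only that embedding rows are indexed by token ID, so a new vocabulary remaps the IDs, and a warm start can reuse only the rows whose token strings survive.

```python
import random

# Hypothetical toy vocabularies; NOT the real GPT-4 tokenizers.
old_vocab = {"hel": 0, "lo": 1, " wor": 2, "ld": 3}
new_vocab = {"hello": 0, " world": 1, "lo": 2, "ld": 3}

DIM = 4
rng = random.Random(0)

# One embedding row per old token ID, standing in for trained weights.
old_emb = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in old_vocab]


def encode(text, vocab):
    """Greedy longest-match tokenization over a toy vocabulary.

    Raises ValueError if no token matches at some position.
    """
    ids, i = [], 0
    while i < len(text):
        match = max((t for t in vocab if text.startswith(t, i)), key=len)
        ids.append(vocab[match])
        i += len(match)
    return ids


def warm_start(old_vocab, old_emb, new_vocab, dim, rng):
    """Copy rows for tokens in both vocabularies; randomly init the rest."""
    new_emb = [None] * len(new_vocab)
    reused = 0
    for token, new_id in new_vocab.items():
        if token in old_vocab:
            new_emb[new_id] = list(old_emb[old_vocab[token]])
            reused += 1
        else:
            # No trained row exists for this token; it must be learned.
            new_emb[new_id] = [rng.gauss(0, 1) for _ in range(dim)]
    return new_emb, reused


old_ids = encode("hello world", old_vocab)
new_ids = encode("hello world", new_vocab)
new_emb, reused = warm_start(old_vocab, old_emb, new_vocab, DIM, rng)

print("old ids:", old_ids)
print("new ids:", new_ids)
print("rows reused:", reused, "of", len(new_vocab))
```

In this toy, the same text maps to different ID sequences under the two vocabularies, and only the rows for tokens shared by both can be carried over. The rest of the network was also trained against the old token boundaries, which is why I'd expect a lot of retraining even with a warm start.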


Yeah, and they say as much in the blog.



