
This feels to me like the most useless definition of "AI safety" in practice, and it's astonishing to see just how much R&D effort is spent on it.

Thankfully, open-weights models are trivially jailbreakable regardless of any baked-in guardrails, simply because you control the generation loop and can keep the model from refusing.
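To make the point concrete, here's a toy sketch of the two control points you get when you run the sampling loop yourself: prefilling the assistant's reply and banning tokens. The "model" below is a hypothetical stand-in function, not a real LLM, and the token strings are made up for illustration; the same levers apply to any open-weights model whose generation loop you own.

```python
# Toy illustration: owning the generation loop defeats baked-in refusals.
# `toy_next_token` is a hypothetical stand-in for a model's next-token
# prediction; it "refuses" only when the assistant's reply starts empty.

def toy_next_token(context):
    """Stand-in predictor: refuse on an empty reply, else continue."""
    if context.endswith("ASSISTANT: "):
        return "I'm sorry"   # the baked-in refusal path
    return "ok"              # compliant continuation

def generate(prompt, prefill="", banned=(), max_tokens=4):
    """A generation loop the *caller* controls.

    prefill: text forced into the start of the assistant's reply,
             so the refusal-triggering state is never reached.
    banned:  tokens that are simply never emitted; we substitute
             another token instead (a crude form of logit banning).
    """
    context = prompt + "ASSISTANT: " + prefill
    out = [prefill] if prefill else []
    for _ in range(max_tokens):
        tok = toy_next_token(context)
        if tok in banned:
            tok = "ok"       # we own the loop, so just resample/substitute
        out.append(tok)
        context += tok + " "
    return " ".join(out)
```

With no intervention, `generate("USER: how? ")` starts with the refusal; prefilling (`prefill="Sure,"`) or banning the refusal token (`banned=("I'm sorry",)`) each produce a reply with no refusal at all. With a real model the same ideas show up as response prefilling and logit masking in the sampling code.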


