Hacker News
andrepd
14 days ago
on:
We tasked Opus 4.6 using agent teams to build a C ...
This chatbot has several C compilers in its training data. How can this possibly be a useful benchmark for anything? LLMs routinely output code verbatim, or with only trivial changes, as their own (which is also very useful for license laundering).