
https://arcprize.org/leaderboard

$13.62 per task, so we need another 5-10 years before the price of running this becomes reasonable?

But the real question is whether they just fit the model to the benchmark.




Why 5-10 years?

At current rates, price per equivalent output is dropping by 99.9% over 5 years.

That's basically $0.01 in 5 years.

Does it really need to be that cheap to be worth it?

Keep in mind, $0.01 in 5 years is worth less than $0.01 today.
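
For anyone who wants to check the arithmetic above, here is a rough sketch. It takes the 99.9%-over-5-years figure at face value and assumes the decline compounds at a steady annual rate; neither assumption comes from the leaderboard, and the function name is just for illustration.

```python
# Back-of-the-envelope projection. Assumptions (not from the leaderboard):
# the 99.9% drop over 5 years holds, and the decline compounds steadily.

def projected_price(price_today, total_drop, years_total, years_ahead):
    """Price after `years_ahead` years, given a total fractional drop
    of `total_drop` spread evenly (geometrically) over `years_total` years."""
    annual_factor = (1.0 - total_drop) ** (1.0 / years_total)
    return price_today * annual_factor ** years_ahead

price_today = 13.62  # dollars per task, as quoted in the thread

price_in_5 = projected_price(price_today, 0.999, 5, 5)
annual_decline = 1.0 - (1.0 - 0.999) ** (1.0 / 5)

print(f"Price in 5 years: ${price_in_5:.4f}")              # roughly $0.0136
print(f"Implied annual decline: {annual_decline:.1%}")     # roughly 75% per year
```

So "basically $0.01" checks out if the trend holds, and it implies prices falling by about three quarters every year.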


Wow that's incredible! Could you show your work?


What’s reasonable? It’s less than minimum hourly wage in some countries.

Burned in seconds.

Getting the work done faster for the same money doesn't make the work more expensive.

You could slow down the inference to make the task take longer, if $/sec matters.


You're right, but I don't think we're getting an hour's worth of work out of single prompts yet. Usually it's an hour's worth of work out of 10 prompts for iteration. Now that's a day's wage for an hour of work. I'm certain the crossover will come soon, but it doesn't feel there yet.

> but I don't think we're getting an hour's worth of work out of single prompts yet

But I don't think every developer is getting paid minimum wage either.

> Now that's a day's wage for an hour of work

For many developers in the US that can still be an hour's wage.


5-10 years? The human panel cost/task is $17 with 100% score. Deep Think is $13.62 with 84.6%. 20% discount for 15% lower score. Sorry, what am I missing?
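
A quick sketch of the comparison above, using only the numbers quoted in this thread. "Cost per solved task" is a metric I'm introducing for illustration, not something the leaderboard reports, and it assumes a failed attempt costs the same as a successful one.

```python
# Comparison using the figures quoted in the thread.
# "Cost per solved task" is an illustrative metric, not a leaderboard column.

def cost_per_solved_task(cost_per_task, score):
    """Average spend per correctly solved task, assuming failed
    attempts cost the same as successful ones."""
    return cost_per_task / score

human_cost, human_score = 17.00, 1.000   # human panel
model_cost, model_score = 13.62, 0.846   # Deep Think

discount = 1 - model_cost / human_cost
score_gap_points = (human_score - model_score) * 100

human = cost_per_solved_task(human_cost, human_score)
model = cost_per_solved_task(model_cost, model_score)

print(f"Price discount: {discount:.1%}")                  # roughly 20%
print(f"Score gap: {score_gap_points:.1f} points")        # 15.4 points
print(f"Per solved task: human ${human:.2f}, model ${model:.2f}")
```

On that basis the model comes out at roughly $16.10 per solved task against $17.00 for the human panel, so the gap is closer to 5% than 20% once the lower score is factored in.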

A grad student hour is probably more expensive…

In my experience, a grad student hour is treated as free :(

You've never applied for a grant, have you?

Grad students are incredibly cheap? In the UK for instance their stipend is £20,780 a year...

As it should be. They're a human!

That's not a long time in the grand scheme of things.

Speak for yourself. Five years is a long time to wait for my plans of world domination.

This concerns me actually. With enough people (n>=2) wanting to achieve world domination, we have a problem.

It’s not that I want to achieve world domination (imagine how much work that would be!), it’s just that it’s the inevitable path for AI and I’d rather it be me than the next schmuck with a Claude Max subscription.

Don't build your castle in someone else's kingdom.

I mean everyone with prompt access to the model says these things, but people like Sam and Elon say these things and mean it.

n = 2 is Pinky and the Brain.

I'm convinced that a substantial fraction of current tech CEOs were unwittingly programmed as children by that show.

Yes, you better hurry.


