
"We ran our captcha-breaking system against 2,235 captchas, and obtained a 70.78% accuracy"

That's more impressive than it sounds. I'm pretty sure 70.78% is more accurate than I am with reCAPTCHA manually. A lot of the captchas presented are very fuzzy, have ambiguous questions, etc.
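For a sense of scale, the quoted figures can be turned into counts. This is only a rough sketch: it assumes the reported accuracy is simply solved divided by attempted, which the quote itself doesn't spell out, and the variable names are mine.

```python
# Figures from the quoted sentence.
TOTAL_CAPTCHAS = 2235
ACCURACY_PERCENT = 70.78

# Assumption: accuracy = solved / attempted, so the solved count is the
# nearest whole number to total * accuracy.
solved = round(TOTAL_CAPTCHAS * ACCURACY_PERCENT / 100)
failed = TOTAL_CAPTCHAS - solved

# Sanity check: the rounded count should reproduce the quoted percentage.
recomputed_percent = round(solved / TOTAL_CAPTCHAS * 100, 2)

print(f"solved: {solved}, failed: {failed}, accuracy: {recomputed_percent}%")
```

Under that assumption, roughly 1,582 of the 2,235 captchas were solved and about 653 were not.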



>I'm pretty sure 70.78% is more accurate than I am with reCAPTCHA manually.

Exactly. Many reCAPTCHAs are beyond simple recognition and leave me guessing. I expect we'll see a new type of reCAPTCHA: you're a human if you make a mistake and a robot if you type the correct answer :) Similar to those 1x1 images that are invisible to humans yet visible to robots.


Indeed. It's good that these guys are white-hats because they could have made a killing selling to spammers (for as long as they could go undetected, which could've been a while).


The going rate for humans breaking captchas is about $1 per 1,000 solved. A fully automated service could make some money, sure, but not exactly a killing.


From the paper: "Assuming a selling price of $2 per 1,000 solved captchas, our token harvesting attack could accrue $104 - $110 daily, per host (i.e., IP address). By leveraging proxy services and running multiple attacks in parallel, this amount could be significantly higher for a single machine."
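Working backwards from those numbers gives the solve volume they imply. A back-of-the-envelope sketch, assuming revenue is just solves times price (i.e. every harvested token gets sold), which is my simplification and not something the quote states; the function name is made up for illustration.

```python
# Figures quoted from the paper.
PRICE_PER_1000_USD = 2.0
DAILY_REVENUE_USD = (104.0, 110.0)
SECONDS_PER_DAY = 24 * 60 * 60


def implied_daily_solves(revenue_usd, price_per_1000_usd=PRICE_PER_1000_USD):
    """Solves per day needed to earn revenue_usd at the given price.

    Assumption: revenue = solves * price, with every solve sold.
    """
    return revenue_usd / price_per_1000_usd * 1000


for revenue in DAILY_REVENUE_USD:
    solves = implied_daily_solves(revenue)
    per_second = solves / SECONDS_PER_DAY
    print(f"${revenue:.0f}/day -> {solves:,.0f} solves/day, {per_second:.2f} per second")
```

So $104 - $110 a day corresponds to something like 52,000 - 55,000 solves per day per host, or a bit more than one every two seconds.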


Yeah, that's the problem... You can, with tons of effort, make a fairly decent system to tell robots and humans apart, but it's much, much more difficult to tell humans doing the thing directly from those solving the challenge remotely. It's an economic arms race: the challenge has to be difficult enough to slow down humans in sweatshops to the point where the whole enterprise isn't worthwhile for the abusers, while not pissing off your actual users. It also has to resist automated malicious use. Quite a tall order.

The best I've seen are the "which of these photos show mountains"-type challenges. I'd imagine that solving 5 rounds of those would take too long to make it worthwhile for spammers, but I'd also imagine lots of legitimate users getting irked at going through that to fill out a form.


"This work was supported by the NSF under grant CNS-13-18415"

I think that means you might be able to get source code via FOIA or similar :)



