
I think the point is that creating a collision when data + fileSize is hashed is much harder than just creating a collision when hashing data alone. Raymond explains it well: http://blogs.msdn.com/b/oldnewthing/archive/2004/05/19/13493...


...and if you took the same number of bits you were using to store the filesize, and instead stored that many bits of some independent secure hash, it'd be harder still.

Of course every extra bit that has to be matched makes collisions 'harder' but length bits are much weaker than other options, except insofar as they may already be available for other reasons.
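To make the comparison concrete, here is a minimal Python sketch of the two fingerprints being discussed: a digest plus the file size, and a digest plus the same number of bits taken from an independent hash. The function names, the choice of SHA-256 as the second hash, and the 8-byte width are assumptions for illustration, not anything from the linked post.

```python
import hashlib

def fingerprint_with_length(data: bytes) -> tuple[bytes, int]:
    """MD5 digest plus the input length (the 'data + fileSize' idea)."""
    return hashlib.md5(data).digest(), len(data)

def fingerprint_with_second_hash(data: bytes, extra_bytes: int = 8) -> tuple[bytes, bytes]:
    """MD5 digest plus `extra_bytes` bytes of an independent hash (SHA-256 is assumed here)."""
    return hashlib.md5(data).digest(), hashlib.sha256(data).digest()[:extra_bytes]

a = b"A" * 64
b = b"B" * 64  # same length as `a`, different content

# The length field cannot tell two same-length inputs apart...
same_length_field = fingerprint_with_length(a)[1] == fingerprint_with_length(b)[1]
# ...while the truncated second hash is compared across its full width.
same_extra_field = fingerprint_with_second_hash(a)[1] == fingerprint_with_second_hash(b)[1]
```

An attacker who keeps the two inputs the same length gets the length field for free, whereas matching the extra hash bits takes real additional work. The length is only attractive when it is already stored for other reasons.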


And that was written before the MD5 collisions were discovered. And no collision has yet been discovered for MD5 for files of the same length; they are all extension attacks...


Absolutely false. Some MD5 collision generators specifically find pairs of equal-length inputs with the same hash. See, for example, the second search result for [MD5 collisions]:

http://www.mscs.dal.ca/~selinger/md5collision/

'Extension attacks' are something else, which let you turn one collision into more, or create valid hashes for combinations of unknown text plus a chosen extension – not find an initial collision. See:

http://en.wikipedia.org/wiki/Merkle%E2%80%93Damg%C3%A5rd_con...

The 'length extension' property can be helpful, once you find a collision based on 'random' nonsense, in extending that into two documents that are each meaningful-but-different and still colliding, as was done in this 2005 MD5 collision demonstration:

http://replay.waybackmachine.org/20050612011328/http://www.c...
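The "turn one collision into more" property can be sketched with a toy Merkle–Damgård hash in Python. This is not an MD5 attack: the 16-bit chaining state, the 4-byte block size, and the names `compress`, `toy_hash`, and `find_block_collision` are all assumptions chosen so that a collision can be brute-forced in a moment. It only illustrates why two equal-length, block-aligned colliding inputs keep colliding after the same suffix is appended to both.

```python
import hashlib
from itertools import count

BLOCK = 4   # toy block size in bytes
STATE = 2   # toy chaining state in bytes (16 bits, so collisions are cheap)
IV = b"\x00" * STATE

def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function: truncated MD5 of state || block."""
    return hashlib.md5(state + block).digest()[:STATE]

def toy_hash(msg: bytes) -> bytes:
    """Merkle–Damgård iteration with length padding, in the style of MD5 but tiny."""
    padded = msg + b"\x80"
    padded += b"\x00" * (-(len(padded) + 4) % BLOCK)
    padded += len(msg).to_bytes(4, "big")
    state = IV
    for i in range(0, len(padded), BLOCK):
        state = compress(state, padded[i:i + BLOCK])
    return state

def find_block_collision() -> tuple[bytes, bytes]:
    """Brute-force two distinct one-block messages with the same chaining state."""
    seen = {}
    for n in count():
        block = n.to_bytes(BLOCK, "big")
        s = compress(IV, block)
        if s in seen:
            return seen[s], block
        seen[s] = block

m1, m2 = find_block_collision()          # equal length, different bytes
suffix = b" plus any shared suffix"

collide_plain = toy_hash(m1) == toy_hash(m2)
collide_extended = toy_hash(m1 + suffix) == toy_hash(m2 + suffix)
```

Because both messages leave the hash in the same internal state after their first block and have the same total length, every later block (including the length padding) is processed identically. Note that the two colliding inputs are the same length, so a stored file size would not distinguish them.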



