-
Notifications
You must be signed in to change notification settings - Fork 2
MD5 Toolkit by J. Black, M. Cochran and T. Highland
sirdarckcat/md5toolkit
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
MD5 Toolkit =========== This package contains code to produce MD5 collisions. The core of the toolkit are the two programs, 'block1' and 'block2'. block1 produces a pair M_1 and M'_1 of 512-bit blocks that satisfy the Wang-differential. It also produces the MD5 chaining value for M_1. block2 needs only the M_1 chaining value (the M'_1 chaining value can be determined from it), and it then outputs a pair M_2 and M'_2. At the end, we have that M_1 || M_2 collides with M'_1 || M'_2 under MD5. Auxiliary Programs ================== Both block1 and block2 output their results in ASCII. The program 'makeblocks' converts the output from these programs into two binary files so that MD5 can be run on them ('md5sum' is the typical program used for Linux installations). A shell script 'makebins.sh' is also provided in the makeblocks/ directory, which does all the steps needed to produce a sample collision. makebins.sh first runs block1, saving its output in a file 'block1.out'. Then it runs 'block2' and saves its output in 'block2.out'. Appending these outputs into a single file, it hands them to 'makeblocks' along with output filenames b1.bin and b2.bin. b1.bin and b2.bin are distinct files, each 1024-bits long, which collide under MD5. md5sum is run on each of these files and the user may verify the digests are the same. Building the Code ================= sh build.sh builds all the code. Efficiency ========== The speed of the code, as usual, depends on several factors: the hardware used, the quality of the compiler, and how well the code has been tuned. (This under the understanding that the basic algorithm remains the same.) We have tested the code on a Pentium4 (Xeon) with g++ 3.4.4. (Other architectures can be specified in the build.sh file.) The code was also tested with the Intel icc compiler, and it runs significantly faster, but this compiler is not as widely available, so we've stuck to g++ for our Makefiles. The code has not yet been tuned anywhere near its potential. Currently, in the environment cited above, the block1 code runs typically from 8 to 50 minutes before producing a collision. The block2 code runs about 3 seconds to 5 minutes, typically. Running 100 trials of each program, we've measured 16 mins as the average time for block1 and 2 mins for the average time for block2 (with large variations, however). Therefore we expect to see collisions in under 20 mins, though sometimes much slower or much faster. Tuning the code makes things run much faster. Using the faster code from http://www.stachliu.com/collisions.html for the block1 algorithm allows us to produce collisions in about 11 minutes. Efforts are underway to tune the code for both blocks; preliminary results show we can expect collisions in about 5 mins. Wang's original code ran in about an hour on an IBM P690 supercomputer. The Stach-Liu code cited above runs in about 45 minutes on a Pentium4, however their code sometimes produces block1 pairs that have no block2 solutions (our code never does this). Applications ============ We designed the toolkit so that it would be easy to generates various flavors of collisions. For example, many collisions where the first blocks are the same between a given pair, but the second blocks vary. (We can get lots of these fast because our block2 code is quite fast even before we have tuned it much.) Also, we can generate collisions for arbitrary IVs. This is useful for attacks like the Daum-Lucks attack (http://www.cits.rub.de/MD5Collisions). As an application of their idea we produced two binaries that have the same MD5 digest but where binary1 prints "hello world" and binary2 prints "I am erasing your hard disk" (even though are program really doesn't erase your hard disk). This is done by having the program check to see whether msg1 is in a given array, or msg2 is (where msg1 and msg2 are 1024-bit values that collide under MD5 with the chaining value resulting from hashing the binary to the point where the msg1 or msg2 value occurs).
About
MD5 Toolkit by J. Black, M. Cochran and T. Highland
Resources
Code of conduct
Security policy
Stars
Watchers
Forks
Packages 0
No packages published