Can the same MD5 hash occur for different passwords, causing a collision?

7

A co-worker built a Java system that ran into collisions across a series of MD5 hashes. He did not stop to analyze the results, though; he only asserted that the collisions exist. I would like to settle this with proof: does anyone know of two different passwords that have the same MD5 hash?

Wikipedia has more on collisions and vulnerabilities.

To clarify the problem, which seems simple but apparently was not clear: what I want is two distinct values that produce the same MD5 hash, e.g. typing in Workbench:

set @valor1 := 'senha_distinta1????';
set @valor2 := 'senha_distinta2????';
SELECT MD5(@valor1) AS hashA, MD5(@valor2) AS hashB;

where: hashA = hashB

Note: I tried the following input values to test the suggested examples, and the two returned different hash values:

set @a:= "d131dd02c5e6eec4693d9a0698aff95c 2fcab58712467eab4004583eb8fb7f89
55ad340609f4b30283e488832571415a 085125e8f7cdc99fd91dbdf280373c5b
d8823e3156348f5bae6dacd436c919c6 dd53e2b487da03fd02396306d248cda0
e99f33420f577ee8ce54b67080a80d1e c69821bcb6a8839396f9652b6ff72a70";

set @b:= "d131dd02c5e6eec4693d9a0698aff95c 2fcab50712467eab4004583eb8fb7f89
55ad340609f4b30283e4888325f1415a 085125e8f7cdc99fd91dbd7280373c5b
d8823e3156348f5bae6dacd436c919c6 dd53e23487da03fd02396306d248cda0
e99f33420f577ee8ce54b67080280d1e c69821bcb6a8839396f965ab6ff72a70";

set @a:=Replace(@a," ","");
set @a:=Replace(@a,"
","");
set @b:=Replace(@b,"
","");
set @b:=Replace(@b," ","");

SELECT  MD5(@a) as hashA, MD5(@b) as hashB, @a as valorA, @b as valorB;
    
asked by anonymous 19.11.2015 / 14:01

1 answer

13

Not only are there examples, there is also a site that generates collisions for you:

  

the two blocks

d131dd02c5e6eec4693d9a0698aff95c 2fcab58712467eab4004583eb8fb7f89
55ad340609f4b30283e488832571415a 085125e8f7cdc99fd91dbdf280373c5b
d8823e3156348f5bae6dacd436c919c6 dd53e2b487da03fd02396306d248cda0
e99f33420f577ee8ce54b67080a80d1e c69821bcb6a8839396f9652b6ff72a70
     

and

d131dd02c5e6eec4693d9a0698aff95c 2fcab50712467eab4004583eb8fb7f89
55ad340609f4b30283e4888325f1415a 085125e8f7cdc99fd91dbd7280373c5b
d8823e3156348f5bae6dacd436c919c6 dd53e23487da03fd02396306d248cda0
e99f33420f577ee8ce54b67080280d1e c69821bcb6a8839396f965ab6ff72a70
     

produce an MD5 collision.

     

Each of these blocks has the MD5 hash 79054025255fb1a26e4bc422aef54eb4.

Source

Example on ideone
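The quoted collision can also be checked locally. The following is a minimal Python sketch, assuming the hexadecimal text is decoded into raw bytes before hashing; the variable names are illustrative and do not come from the linked example.

```python
import hashlib

# The two 128-byte blocks quoted above, as hexadecimal text.
# bytes.fromhex() skips the spaces between the groups.
HEX_A = (
    "d131dd02c5e6eec4693d9a0698aff95c 2fcab58712467eab4004583eb8fb7f89 "
    "55ad340609f4b30283e488832571415a 085125e8f7cdc99fd91dbdf280373c5b "
    "d8823e3156348f5bae6dacd436c919c6 dd53e2b487da03fd02396306d248cda0 "
    "e99f33420f577ee8ce54b67080a80d1e c69821bcb6a8839396f9652b6ff72a70"
)
HEX_B = (
    "d131dd02c5e6eec4693d9a0698aff95c 2fcab50712467eab4004583eb8fb7f89 "
    "55ad340609f4b30283e4888325f1415a 085125e8f7cdc99fd91dbd7280373c5b "
    "d8823e3156348f5bae6dacd436c919c6 dd53e23487da03fd02396306d248cda0 "
    "e99f33420f577ee8ce54b67080280d1e c69821bcb6a8839396f965ab6ff72a70"
)

# Decode the hexadecimal text into the actual bytes that MD5 will hash.
block_a = bytes.fromhex(HEX_A)
block_b = bytes.fromhex(HEX_B)

digest_a = hashlib.md5(block_a).hexdigest()
digest_b = hashlib.md5(block_b).hexdigest()

print("inputs differ: ", block_a != block_b)
print("digests match: ", digest_a == digest_b)
print("digest:        ", digest_a)
```

Hashing the hexadecimal text itself (the characters "d131dd02...") instead of the decoded bytes would give two unrelated digests, which is what the attempt in the question ran into.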

If you mean two short strings with the same MD5, I think it quite unlikely that such a pair exists. The MD5 output is 128 bits. This means that if you generate 2^128 + 1 distinct strings, you are guaranteed to have a collision (pigeonhole principle). But the Unicode strings in the BMP of length up to 7, taken together, number fewer than 2^128 - 1, so it is possible that there is not a single collision in that set.
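The counting argument can be checked with exact integer arithmetic. This is a rough sketch that assumes every one of the 2^16 BMP code units counts as a character (surrogates are not excluded, so it slightly overcounts); the names are illustrative.

```python
# Number of distinct values a 128-bit MD5 digest can take.
md5_outputs = 2 ** 128

# Assumption: each BMP character is one of 2**16 possible code units.
bmp_symbols = 2 ** 16

# Count every string of length 0 through 7 over that alphabet.
short_strings = sum(bmp_symbols ** length for length in range(8))

print("fewer strings than digests:", short_strings < md5_outputs)
print("bits needed to count them: ", short_strings.bit_length())
```

Since there are fewer such strings than possible digests, the pigeonhole principle does not force a collision among them. It does not rule one out either.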

In any case, the problem with using MD5 to protect passwords is not its vulnerability to collisions (for passwords, what matters is second pre-image resistance, and in that respect MD5 has not been broken), but rather the fact that it is very fast.
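To illustrate the speed concern, here is a hedged sketch of a deliberately slow, salted password hash using PBKDF2 from Python's standard library. PBKDF2 is not mentioned in the answer; it is swapped in here only as one example of a slow construction, and the function name `hash_password` and the iteration count are assumptions for illustration.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """Derive a salted, iterated hash. The iteration count is illustrative only."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


# A fresh random salt per password keeps equal passwords from sharing a hash.
salt = os.urandom(16)
stored = hash_password("senha_distinta1", salt)

# Verification recomputes with the same salt and compares in constant time.
ok = hmac.compare_digest(stored, hash_password("senha_distinta1", salt))
bad = hmac.compare_digest(stored, hash_password("senha_distinta2", salt))
print(ok, bad)
```

Each guess costs an attacker many hash iterations instead of one fast MD5 call, which is the property a plain MD5 lacks.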

Update: One point that seems to be causing confusion is that the input MD5 expects is a binary string (a sequence of bytes), not a text string. So even if your input is text, it first has to be converted to a binary string (via an encoding) before being passed to the MD5 algorithm. Many libraries and functions do this automatically for you, so it looks as if MD5 accepts text strings when in fact it does not.
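A small Python sketch of this point, assuming Python 3, where `hashlib` refuses text and the caller must pick an encoding. The sample password is taken from the question; the other names are illustrative.

```python
import hashlib

text = "senha_distinta1"

# hashlib hashes bytes; passing a str raises TypeError in Python 3.
try:
    hashlib.md5(text)
    accepted_str = True
except TypeError:
    accepted_str = False

# The same text under two encodings is two different byte sequences,
# so the two MD5 digests differ as well.
utf8_digest = hashlib.md5(text.encode("utf-8")).hexdigest()
utf16_digest = hashlib.md5(text.encode("utf-16-le")).hexdigest()

print("str accepted directly:", accepted_str)
print("same digest for both encodings:", utf8_digest == utf16_digest)
```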

The examples cited are two binary sequences, expressed in hexadecimal. It is not simply a matter of using the text above as input to the hash; you have to interpret it correctly:

0xd1 == 209 == 1101 0001

This is different from the string "d1":

"d1" == 0x64 0x31 == 0110 0100 0011 0001

Notice how the resulting bytes are different. Hashed that way, the hashes will also be different.
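The difference between the single byte 0xd1 and the two-character text "d1" can be seen directly in Python. This is a minimal sketch with illustrative variable names.

```python
import hashlib

raw = bytes.fromhex("d1")      # one byte with value 0xd1 (209)
text = "d1".encode("ascii")    # two bytes: 0x64 ('d') and 0x31 ('1')

# Show each byte in binary to compare with the bit patterns above.
print([format(byte, "08b") for byte in raw])
print([format(byte, "08b") for byte in text])

# Different input bytes, so the MD5 digests are not the same.
same_digest = hashlib.md5(raw).hexdigest() == hashlib.md5(text).hexdigest()
print("same digest:", same_digest)
```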

If you want or need to put the example above in a string, the way is to use a hexadecimal escape for each pair of characters. You did not say which platform you are using, but an example in Python would be (Edit: my mistake, I did not know what this Workbench was; see TobyMosque's answer for an example):

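import hashlib

# Byte literals (b"..."): hashlib hashes bytes, not text.
# The "..." inside each literal marks bytes elided here for brevity.
a = b"\xd1\x31\xdd\x02\xc5...\x70\x80\xa8\x0d...\x70"
print(hashlib.md5(a).hexdigest())

b = b"\xd1\x31\xdd\x02\xc5...\x70\x80\x28\x0d...\x70"
print(hashlib.md5(b).hexdigest())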

Or, easier, convert the hexadecimal data above to binary and test directly on the binary (as I did in the ideone example mentioned above).

    
19.11.2015 / 15:36