I'm writing a password generator for a PHP script, and I want it to be compatible with a class I wrote in Java so that the two can share resources.

PHP code:

public function PasswordGen($password, $rounds) { 
    for($i = 0; $i < $rounds; $i++) { 
        $password = substr(base64_encode(md5($password)), 0, 16); 
        echo $i . " " . $password . PHP_EOL; // debugging //
    }
    return $password;
}

Java code:

public static String PasswordGen(String password, int rounds) {
    try { 
        for(int i = 0; i < rounds; i++) { 
            byte[] md5 = MessageDigest.getInstance("MD5").digest(password.getBytes("UTF-8"));
            String md5h = (new BigInteger(1, md5)).toString(16);
            password = Base64.getEncoder().encodeToString(md5h.getBytes()).substring(0, 16);
            System.out.println(Integer.toString(i) + " " + password); // debugging //
        }
    } catch(Exception ex) {
        ex.printStackTrace();
        return null;
    }
    return password;
}

PHP debug output:

0 MWExZGM5MWM5MDcz  
1 NDVkZmMxNWVjNWZi  
2 ODY5YzVkODBhNTRh  
3 ZGE2OTNiOWMxOWM1  
4 OTcxMTY3MzgxMmRk  
5 NWNjNDI2N2IzMDlj  
6 NGVkYzY0YjVkMWUy  
7 MjdhMGU4NjhhNmU3  
8 OWY5OGE3ZGZiODZl  
9 Y2I1ZjBkNjRmMjkx  
10 YTk5NDA1MGI1OWY1  
11 YzRmYWE5ZTk0ZDdl  
12 NDBiZWZkNmQ5Yjhj  
13 MzQyNzcwNGRjMTYw  
14 N2U4ZmUxOGMyNWYx  
15 MjBjOTZhNGE4ZDQ1  
16 MjdmMzkwMzI0NDdj  
17 YjM4NDI0YWU0YzUw  
18 NDRiNjA1MWUwOGZi  
19 MGI1YmIyMDViMGYz  

Java debug output:

0 MWExZGM5MWM5MDcz  
1 NDVkZmMxNWVjNWZi  
2 ODY5YzVkODBhNTRh  
3 ZGE2OTNiOWMxOWM1  
4 OTcxMTY3MzgxMmRk  
5 NWNjNDI2N2IzMDlj  
6 NGVkYzY0YjVkMWUy  
7 MjdhMGU4NjhhNmU3  
8 OWY5OGE3ZGZiODZl  
9 Y2I1ZjBkNjRmMjkx  
10 YTk5NDA1MGI1OWY1  
11 YzRmYWE5ZTk0ZDdl  
12 NDBiZWZkNmQ5Yjhj  
13 MzQyNzcwNGRjMTYw  
14 N2U4ZmUxOGMyNWYx  
15 MjBjOTZhNGE4ZDQ1  
16 MjdmMzkwMzI0NDdj  
17 YjM4NDI0YWU0YzUw  
18 NDRiNjA1MWUwOGZi  
19 YjViYjIwNWIwZjMy  

It works as expected up to iteration 18, but at iteration 19 (the 20th pass through the loop) the two outputs differ. Why does it produce different output at that point?

t.m.adam
  • You may want to look at [How do you create "good" random md5 hashes in php?](http://stackoverflow.com/a/24037589/1255289) – miken32 Apr 04 '17 at 03:36
  • I don't know your key so I can't test it, but your issue could possibly be caused by this: `md5h.getBytes()`. In the fourth line of the Java code that you posted, you're using an encoding of UTF-8, but not in line 6. – Jacob G. Apr 04 '17 at 03:37
  • Give the key, so that we can try to reproduce. – Erwin Bolwidt Apr 04 '17 at 03:43
  • @JacobG. I don't think that could be the issue, because at that point, `key` consists only of ASCII characters. – Dawood ibn Kareem Apr 04 '17 at 03:44
  • @ErwinBolwidt You don't need the original key if you want to reproduce this. You can start at the 18th iteration (`NDRiNjA1MWUwOGZi`) and iterate just once. – Dawood ibn Kareem Apr 04 '17 at 03:45
  • Don't you risk collisions by cutting the hash in half? –  Apr 04 '17 at 03:47
  • The Javadoc for Base64#getEncoder#encodeToString states that it creates a String with the ISO-8859-1 charset, but I don't see why that would mess up your output. – Jacob G. Apr 04 '17 at 03:59
  • @DavidWallace Good point. Being able to reproduce it made it much easier to fix :) – Erwin Bolwidt Apr 04 '17 at 04:06
  • @nogad It's cutting off more than half, because it takes 16 characters of the base64-encoded output. Each base64 character encodes 6 bits, so 16 base64 characters encode (6/8*16) = 12 bytes of the original 32-byte hex-encoded hash. – Erwin Bolwidt Apr 04 '17 at 04:14
  • Ooh, yeah, that's not good. As `rounds` gets bigger, these keys get very much weaker. – Dawood ibn Kareem Apr 04 '17 at 04:17
  • What would you suggest for higher entropy ? – t.m.adam Apr 04 '17 at 04:53
  • When you printed `md5h`, what did you find? Proper debugging goes a long way. – Jason C Apr 04 '17 at 13:20
  • @t.m.adam "entropy" is not the right word for what you're looking for, but my suggestion would be to use a standardized key derivation function. PBKDF2, Scrypt, and Argon2 are good examples. (Those three are often brought up in the context of password hashing, but they all support acting as a KDF as well. Bcrypt, another well-known password hashing function, does not) –  Apr 04 '17 at 13:49
  • @JasonC Your duplicate is in no way a duplicate of this question, just because the *final conclusion* of the answer happens to be the same. There is no similarity between the questions or the bulk of the answer. In other words, you could not possibly have selected that duplicate from just reading the question - you only selected it because you read my answer. Which proves that it is not a duplicate. – Erwin Bolwidt Apr 04 '17 at 14:57
  • @ErwinBolwidt In fact: This question is not a bad one. It can still receive votes. It received a good answer and requires no more that can't be posted on the link instead. The answer here can also still receive votes. And now this question is a sign post for anybody else with the same issue (and the issue was, actually, precisely the same, and in the same situation no less -- incorrect hash due to leading zeros trimmed by `BigInteger` string conversions in Java -- the presence of PHP was irrelevant) ... – Jason C Apr 04 '17 at 15:03
  • ... Having this closed as a dupe therefore causes only benefit and zero harm. It's an improvement to the organization of information on this site. I think in its current state, everybody is winning here. (Also consider that had this been closed as a duplicate *immediately*, it would have been satisfactory then, as well, albeit potentially requiring a trivially tiny moment of "ah, I see what happened" on the OP's part.) And again to be clear: This is a fine, well-written question that completely deserves the positive attention it has received. Closure does not equate to "bad". – Jason C Apr 04 '17 at 15:12
  • @JasonC Ok, thanks for the explanation – Erwin Bolwidt Apr 04 '17 at 15:22
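
The truncation arithmetic raised in the comments above can be checked with a short sketch. It decodes one 16-character password from the debug output back into the hex digits it was built from and counts how many bits of the 128-bit MD5 digest survive. The class and method names (`TruncationDemo`, `decodedHexPrefix`, `retainedBits`) are hypothetical and not part of the original code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class, for illustration only.
class TruncationDemo {
    // Decode a 16-character base64 password back into the hex digits it encodes.
    // 16 base64 characters * 6 bits = 96 bits = 12 bytes, i.e. 12 hex characters.
    static String decodedHexPrefix(String password) {
        byte[] raw = Base64.getDecoder().decode(password);
        return new String(raw, StandardCharsets.US_ASCII);
    }

    // Each hex character carries 4 bits of the underlying MD5 digest.
    static int retainedBits(String password) {
        return decodedHexPrefix(password).length() * 4;
    }

    public static void main(String[] args) {
        String password = "MGI1YmIyMDViMGYz"; // PHP output of iteration 19
        System.out.println(decodedHexPrefix(password)); // hex prefix of the digest
        System.out.println(retainedBits(password) + " of 128 bits retained");
    }
}
```

Under these assumptions each round keeps only the first 12 hex digits of the digest, which is the weakness the comments point out.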

1 Answer

In Java, converting a BigInteger to a hexadecimal String with the toString(int radix) method doesn't output leading zeros.

You can discover this by printing the output of the intermediate step (converting the MD5 hash to a hexadecimal string): in Java that gives b5bb205b0f32a7bf2a80fc870cbd2b7 (31 characters), while in PHP it gives 0b5bb205b0f32a7bf2a80fc870cbd2b7 (32 characters). It's only a difference of one leading zero, but it shifts every following character by one position, so after applying the base64 encoding the two strings look very different.
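
A minimal sketch of that debugging step, assuming the iteration-18 password `NDRiNjA1MWUwOGZi` as input (as suggested in the comments). The class and method names (`LeadingZeroDemo`, `unpaddedHex`, `paddedHex`) are hypothetical. It prints the hex string both ways so the missing leading zero is visible.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, for illustration only.
class LeadingZeroDemo {
    // Hex via BigInteger.toString(16): leading zero digits are not emitted.
    static String unpaddedHex(byte[] digest) {
        return new BigInteger(1, digest).toString(16);
    }

    // Hex via String.format: left-padded with zeros to 32 characters.
    static String paddedHex(byte[] digest) {
        return String.format("%032x", new BigInteger(1, digest));
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("MD5")
                .digest("NDRiNjA1MWUwOGZi".getBytes(StandardCharsets.UTF_8));
        System.out.println(unpaddedHex(digest)); // what the question's Java code feeds to base64
        System.out.println(paddedHex(digest));   // matches the fixed-width output of PHP's md5()
    }
}
```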

An easier way to get leading zeros is to use the String.format method.

Replace this line:

String md5h = ( new BigInteger(1, md5) ).toString(16);

with this line:

String md5h = String.format("%032x", new BigInteger(1, md5));

and you'll get the same output as with your PHP code.
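
For reference, here is a sketch of the whole method with that one-line change applied. The class name `PasswordGenFixed` and the lower-case method name are hypothetical. Two further changes are my own assumptions rather than part of the original answer: the debug printing is dropped, and the hex string is converted with an explicit `US_ASCII` charset instead of the platform default.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical wrapper class, for illustration only.
class PasswordGenFixed {
    static String passwordGen(String password, int rounds) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            for (int i = 0; i < rounds; i++) {
                // digest() also resets the MessageDigest, so it can be reused each round.
                byte[] md5 = md.digest(password.getBytes(StandardCharsets.UTF_8));
                // Zero-pad to 32 hex digits so the string matches PHP's md5().
                String md5h = String.format("%032x", new BigInteger(1, md5));
                // Hex digits are plain ASCII; the explicit charset is an added assumption.
                password = Base64.getEncoder()
                        .encodeToString(md5h.getBytes(StandardCharsets.US_ASCII))
                        .substring(0, 16);
            }
            return password;
        } catch (NoSuchAlgorithmException ex) {
            // MD5 is a required algorithm on standard Java platforms.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // One round starting from the iteration-18 value in the debug output.
        System.out.println(passwordGen("NDRiNjA1MWUwOGZi", 1));
    }
}
```

Starting from the iteration-18 value and running a single round is a quick way to compare the result against iteration 19 of the PHP debug output.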

Erwin Bolwidt