
**Edit: Problem solved.** My input was altered on its way from the EditText to the Java code where I run the function. It was really obvious, too. I feel really dumb, and I'm sorry for wasting everyone's time...

I'm trying to get Android to hash a string with MD5, but I've found that the result does not match the output of the MD5 functions used in MySQL and PHP.

For example, for the string `password`:

    PHP/MySQL: 5f4dcc3b5aa765d61d8327deb882cf99
    Android:   bc4a7f3b32b2a85688a53c49df19cd95

I've searched and looked at StackOverflow posts from people having the same problem, but I still haven't found an answer. I've tried numerous methods and different character encodings, but the results still never match.

Here's the function I currently have saved in my project (this doesn't work):

    // Requires java.math.BigInteger, java.security.MessageDigest
    // and java.security.NoSuchAlgorithmException imports.
    public static String md5(String input) {
        String result = input;
        if (input != null) {
            try {
                MessageDigest md = MessageDigest.getInstance("MD5");
                md.update(input.getBytes());
                BigInteger hash = new BigInteger(1, md.digest());
                result = hash.toString(16);
                if ((result.length() % 2) != 0) {
                    result = "0" + result;
                }
            } catch (NoSuchAlgorithmException e) {
                e.printStackTrace();
                return null;
            }
        }
        return result;
    }

Any help would be greatly appreciated.

Gumbo
lancex
    Hmm... when I run your code in a standard Sun JVM (1.6) I get the correct output. I don't think it has anything to do with `input.getBytes()` since the default charset on Android should be UTF-8. It may have something to do with the `MessageDigest`. Nonetheless, I would try it using `getBytes("UTF-8")` to be explicit. – Peter Feb 10 '12 at 19:46
  • Did you try using the same string encoding when you convert the string to a byte array? – Selvin Feb 10 '12 at 19:47
  • Snippet looks okay, except for relying on the platform default encoding (although Android uses UTF-8 already). Are you sure you don't have some whitespace in the input? For another snippet, see also this: http://stackoverflow.com/questions/5494447/what-will-be-the-android-java-equivalent-of-md5-function-in-php – BalusC Feb 10 '12 at 19:48
  • It seems my input did change on its way from grabbing it from the EditText into Java... I feel really dumb. – lancex Feb 10 '12 at 20:01
  • Is this used for password hashing? Plain md5/sha1/sha2 are a bad idea for password hashing. – CodesInChaos Feb 13 '12 at 16:38
  • It is, but it's only for a project that will never ever be used publicly, so I'm okay with it. – lancex Feb 14 '12 at 20:01
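On CodesInChaos's point: if real password hashing is ever needed, PBKDF2 ships with the JDK. A minimal sketch, assuming nothing from the question's code (the `PasswordHashing` class name and the iteration count are illustrative placeholders, not a vetted configuration):

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

public class PasswordHashing {

    // Derive a 160-bit hash with PBKDF2 (available in the JDK since Java 6).
    // The salt must be random per user and stored next to the hash;
    // the iteration count here is only a placeholder.
    public static byte[] pbkdf2(char[] password, byte[] salt, int iterations)
            throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, 160);
        SecretKeyFactory f = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1");
        return f.generateSecret(spec).getEncoded();
    }

    public static void main(String[] args) throws GeneralSecurityException {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        byte[] hash = pbkdf2("password".toCharArray(), salt, 10000);
        System.out.println(hash.length); // 20 bytes = 160 bits
    }
}
```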

2 Answers


Two things. Firstly, I'd avoid using BigInteger for this. You want to convert a byte array to a hex representation, so use code designed to do exactly that, such as Apache Commons Codec. It'll stop you chasing your tail over conversion issues when you want to focus on what the MD5 output is. EDIT: Okay, it sounds like the code wasn't the problem; getting the input was. But I would still make the change here: it should end up with more readable code which expresses what you're trying to achieve more clearly.
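A minimal sketch of that approach with a hand-rolled loop so it runs without any library (the `Md5Hex` class and `toHex` helper are made up for illustration; Commons Codec's hex encoder would do the same job):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Md5Hex {

    // Hex-encode a byte array directly; unlike BigInteger.toString(16),
    // this never drops leading zero bytes.
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static String md5(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            return toHex(md.digest(input.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be present on every Java platform.
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) {
        // Matches PHP's md5("password") and MySQL's MD5('password').
        System.out.println(md5("password")); // 5f4dcc3b5aa765d61d8327deb882cf99
    }
}
```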

Secondly, this code:

md.update(input.getBytes());

... is using the platform default encoding. That's almost never a good idea. Specify the encoding explicitly, even if you know what the platform default encoding is. In this case it's harmless, but you should fix the code anyway, in case you ever have to deal with non-ASCII text.

Jon Skeet
  • The example input contains ASCII-compatible characters only, so not specifying the character encoding isn't the cause of the OP's concrete problem. Furthermore, you're not really explaining why `BigInteger` is a bad idea; I'd be curious about that. – BalusC Feb 10 '12 at 19:55
  • @BalusC: While in this particular case it's only got ASCII characters, leaving it using the default encoding is a bug waiting to happen, so it would be remiss of me not to mention it. I *suspect* the problem is with `BigInteger` here - I don't know why it would fail (it works for me on desktop Java) but fundamentally it just doesn't seem like a sensible way of doing something which isn't about big integers - it's about a hex representation of bytes. – Jon Skeet Feb 10 '12 at 19:57
  • Well, the answer implied that it's one of the two possible causes of the OP's concrete problem (which is thus untrue in this particular case). Okay, so you were plainly guessing that `BigInteger` was the probable cause. I still wonder about the technical reason why it would fail, as it should work just fine even though it's abuse of the `BigInteger` class. – BalusC Feb 10 '12 at 20:02
  • @BalusC: Having seen the comment, it looks like it was a different problem entirely - but I'd still make both changes :) Have edited to reflect that. – Jon Skeet Feb 10 '12 at 20:05
  • My original function did convert a byte array into hex but after it wouldn't work I started trying anything I could find. Turns out my input was altered. I feel really bad about posting this now... – lancex Feb 10 '12 at 20:08
  • What about endianness (big-endian and little-endian)? – Selvin Feb 10 '12 at 20:24
  • @Selvin: I'm not sure it should have any effect here, but the fact that we'd have to *ask* is another reason in favour of using a simple byte[] to hex conversion library. – Jon Skeet Feb 10 '12 at 20:26

The problem is most likely due to the character encoding. I know you've said you tried different character encodings, but the code you posted doesn't specify one.

Check the answers to this question: How can I generate an MD5 hash?

The MessageDigest class can provide you with an instance of the MD5 digest.

When working with strings and the crypto classes, be sure to always specify the encoding you want the byte representation in. If you just use string.getBytes(), it will use the platform default. (Not all platforms use the same default.)

byte[] bytesOfMessage = yourString.getBytes("UTF-8");
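A small sketch of why this matters once the input contains non-ASCII characters (the `EncodingDemo` class name and sample string are made up for illustration): the same string yields different bytes, and therefore a different MD5, under different encodings.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingDemo {

    // Encode the same string under an explicitly chosen charset.
    public static byte[] encode(String s, Charset cs) {
        return s.getBytes(cs);
    }

    public static void main(String[] args) {
        String s = "pässword"; // one non-ASCII character
        byte[] utf8 = encode(s, StandardCharsets.UTF_8);
        byte[] latin1 = encode(s, StandardCharsets.ISO_8859_1);
        System.out.println(utf8.length);                 // 9: 'ä' takes two bytes in UTF-8
        System.out.println(latin1.length);               // 8: 'ä' takes one byte in Latin-1
        System.out.println(Arrays.equals(utf8, latin1)); // false: their MD5 hashes would differ
    }
}
```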

CLo
  • While you're completely right, it shouldn't matter for a word containing ASCII characters only. It yields the same bytes in practically every character encoding (except for ancient ones that competed with ASCII, like EBCDIC). Only characters beyond the ASCII range may give different bytes depending on the character encoding used. – BalusC Feb 10 '12 at 19:53