I have C# code that hashes the provided strings using either the MD5 or the SHA1 algorithm. The code itself is not the problem, but here it is for reference:

public static string GetMD5(Encoding encoding, params string[] components)
{
    return GetHashedString(HashingMethod.MD5, encoding, components);
}

public static string GetSHA1(Encoding encoding, params string[] components)
{
    return GetHashedString(HashingMethod.SHA1, encoding, components);
}

private static string GetHashedString(HashingMethod method, Encoding encoding, params string[] components)
{
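    HashAlgorithm algorithm;

    switch (method)
    {
        case HashingMethod.MD5:
            algorithm = new MD5CryptoServiceProvider();
            break;
        case HashingMethod.SHA1:
            algorithm = new SHA1CryptoServiceProvider();
            break;
        default:
            // Fail fast instead of hitting a NullReferenceException below.
            throw new ArgumentOutOfRangeException("method");
    }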

    StringBuilder data = new StringBuilder();
    foreach (string param in components)
        data.Append(param);

    byte[] bytes = encoding.GetBytes(data.ToString());
    bytes = algorithm.ComputeHash(bytes);

    StringBuilder result = new StringBuilder();
    foreach (byte b in bytes)
        result.AppendFormat("{0:x2}", b);

    return result.ToString();
}

private enum HashingMethod { MD5, SHA1 }

Now, the real problem: which encoding should I pass to the GetMD5 and GetSHA1 methods so that they return the same results as PHP's md5() and sha1()? I can't alter the PHP code and I don't even have access to it. I'm just given a hash signature and I know it was created with PHP. What I can alter is my C# code, if necessary.

I've looked around the internet and the answers I found vary. According to them, I should use ASCII encoding for SHA1; for MD5 I have no idea (my own tests seem to point to UTF8, though).

I must admit I know almost nothing about PHP. Are the encodings used in md5() and sha1() always the same (and if so, which ones)? Or is it possible to alter them somehow, not with a wrapper method that transforms the string beforehand, but by changing the encoding used inside md5() and sha1()? In other words, can I expect a specific encoding for each method, or can it vary?

EDIT

Let's cut the number of possibilities a bit since my question might have been too general and let's say that the PHP hashing code looks like this:

$hash = sha1($str);

where $str is a normal string, i.e. no Base64 applied, no additional hash algorithms used etc. What encoding would I have to pass to my GetSHA1 method to have the same output as the above PHP line produces? Is it even possible to determine? Same conditions and questions for PHP's md5() and my GetMD5.

S_F
The functions `MD5()` and `SHA1()` in PHP provide an optional parameter: `raw output`. If it is set to true, the value is returned in raw binary format, which developers may then encode in Base64 etc. As for the encoding, take a look at [this question](http://stackoverflow.com/q/9351694). It all depends on the configuration. Your best bet is to request access to that code/environment, since there are a lot of parameters that could change the outcome drastically. Otherwise we would just be guessing :) – HamZa Jul 17 '13 at 09:27

1 Answer

PHP itself does not work with encodings: a PHP string is just a sequence of bytes, the equivalent of the bytes variable in your GetHashedString method. Given this, the encoding depends on where the value of that variable comes from. If it comes from a UTF-8 encoded file, the bytes will be UTF-8, and they will also include the BOM if there is one.
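To make this concrete, here is a minimal C# sketch showing that the digest depends only on the bytes that get hashed. The class name EncodingHashDemo and the helpers Md5Hex and Sha1Hex are made up for this illustration; ISO-8859-1 is just one example of a legacy encoding the PHP side might be using, not something known about your case.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper class, named for this illustration only.
public static class EncodingHashDemo
{
    // Lowercase hex, the same output format as PHP's md5() without raw output.
    public static string Md5Hex(byte[] bytes)
    {
        using (MD5 md5 = MD5.Create())
        {
            return ToHex(md5.ComputeHash(bytes));
        }
    }

    // Lowercase hex, the same output format as PHP's sha1() without raw output.
    public static string Sha1Hex(byte[] bytes)
    {
        using (SHA1 sha1 = SHA1.Create())
        {
            return ToHex(sha1.ComputeHash(bytes));
        }
    }

    private static string ToHex(byte[] digest)
    {
        return string.Concat(digest.Select(b => b.ToString("x2")));
    }

    public static void Main()
    {
        Encoding utf8 = new UTF8Encoding(false);               // UTF-8, no BOM
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");  // assumed legacy encoding

        // Pure ASCII text: both encodings produce identical bytes, so the hashes match.
        Console.WriteLine(Sha1Hex(utf8.GetBytes("hello")) == Sha1Hex(latin1.GetBytes("hello"))); // True

        // Non-ASCII text: U+00E9 is C3 A9 in UTF-8 but E9 in ISO-8859-1, so the hashes differ.
        Console.WriteLine(Sha1Hex(utf8.GetBytes("\u00e9")) == Sha1Hex(latin1.GetBytes("\u00e9"))); // False

        // A leading UTF-8 BOM (EF BB BF) is three extra bytes and changes the hash as well.
        byte[] bom = new UTF8Encoding(true).GetPreamble();
        byte[] withBom = bom.Concat(utf8.GetBytes("hello")).ToArray();
        Console.WriteLine(Sha1Hex(withBom) == Sha1Hex(utf8.GetBytes("hello"))); // False
    }
}
```

As long as the hashed strings contain only ASCII characters, any ASCII-compatible encoding gives the same result. For anything else, the C# side has to reproduce the exact bytes PHP received. Note that Encoding.ASCII in .NET replaces characters outside ASCII with '?' by default, so it changes the hashed bytes silently instead of failing.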

Marek