I have a string that I need to hash in order to access an API. The API creator has provided a code snippet in Python, which hashes the string like this:

hashed_string = hashlib.sha1(string_to_hash).hexdigest()

When I use this hashed string to access the API, everything is fine. I have tried to get the same hashed string in C#, but without success. I have tried many approaches and nothing has worked so far. I am aware of the hexdigest part as well, and I have kept it in mind when trying to mimic the behaviour.

Does anyone know how to get the same result in C#?

EDIT: This is one of the many ways I have tried to reproduce the same result in C#:

public string Hash(string input)
{
    using (SHA1Managed sha1 = new SHA1Managed())
    {
        var hash = sha1.ComputeHash(Encoding.UTF8.GetBytes(input));
        var sb = new StringBuilder(hash.Length * 2);

        foreach (byte b in hash)
        {
            sb.Append(b.ToString("X2"));
        }

        return sb.ToString().ToLower();
    }
}

This code is taken from: Hashing with SHA1 Algorithm in C#

Another way:

public string ToHexString(string myString)
{
    HMACSHA1 hmSha1 = new HMACSHA1();
    Byte[] hashMe = new ASCIIEncoding().GetBytes(myString);
    Byte[] hmBytes = hmSha1.ComputeHash(hashMe);
    StringBuilder hex = new StringBuilder(hmBytes.Length * 2);
    foreach (byte b in hmBytes)
    {
        hex.AppendFormat("{0:x2}", b);
    }
    return hex.ToString();
}

This code is taken from: Python hmac and C# hmac

EDIT 2

Some input/output:

C# (using second method provided in above description)

input: callerId1495610997apiKey3*_&E#N@B1)O)-1Y

output: 1ecded2b66e152f0965adb96727d96b8f5db588a


Python

input: callerId1495610997apiKey3*_&E#N@B1)O)-1Y

output: bf11a12bbac84737a39152048e299fa54710d24e


C# (using first method provided in above description)

input: callerId1495611935​apiKey{[B{+%P)s;WD5&5x

output: 7e81e0d40ff83faf1173394930443654a2b39cb3


Python

input: callerId1495611935​apiKey{[B{+%P)s;WD5&5x

output: 512158bbdbc78b1f25f67e963fefdc8b6cbcd741

Markus Olsson
  • Please show a [mcve] - my guess is that there's a bug in your C# code, but as we can't see any of your C# code, it's hard to tell. – Jon Skeet May 24 '17 at 07:17
  • Done! @JonSkeet – Markus Olsson May 24 '17 at 07:24
  • Well that's still not a [mcve] - we don't know what your input is, or the output in Python, or the output from either C# method. It's getting close though... – Jon Skeet May 24 '17 at 07:25
  • Let me add some more then! :) – Markus Olsson May 24 '17 at 07:26
  • You didn't bother including the results of your *first* C# code, which is the same as your python results, just in upper case... – Jon Skeet May 24 '17 at 07:44
  • (SHA-1 and SHA-1-HMAC are not the same thing.) – Jon Skeet May 24 '17 at 07:44
  • Added .ToLower() in the method and printed results! – Markus Olsson May 24 '17 at 07:49
  • Sorry for the lack of information, thought and hoped that the answer would just be to use a specific class/method to get the same type of hash. – Markus Olsson May 24 '17 at 07:55
  • Why aren't you using the same input for all cases? You're making it pointlessly tricky... – Jon Skeet May 24 '17 at 08:08
  • And *neither* your Python nor C# results are reproducible for me for input "callerId1495611935​apiKey{[B{+%P)s;WD5&5x" - with both Python and C# the result is "780bc18524b24644c09cd6348fd0a5d0894a8c18" on my box. If you provided a genuinely *complete* piece of code for both Python and C#, with the hard-coded data, it would be easier to reproduce your results. (I don't think you're hashing what you think you are...) – Jon Skeet May 24 '17 at 08:10
  • I feel like the input string is unimportant as long as it's the same one used in both C# and Python. – Markus Olsson May 24 '17 at 08:11
  • Except it isn't, because you've got two of them for no obvious reason... and it's not clear that the inputs you're claiming are the actual inputs anyway, given that the results you're getting are inconsistent with the results I'm seeing for supposedly the same code. This is where it's important to have a genuinely [mcve] that we can copy/paste/compile/run with no other input. – Jon Skeet May 24 '17 at 08:13
  • (I'd advise using `ToLowerInvariant` though in the C# code, to avoid issues in Turkey...) – Jon Skeet May 24 '17 at 08:13
  • As noted in comments on Federico's answer, you've got a non-ASCII character in the input for the bottom two examples. I suspect that's not meant to be there, is it? – Jon Skeet May 24 '17 at 08:23
  • @JonSkeet Correct! I erased the string and retyped it manually and it worked. Thanks for helping! :) – Markus Olsson May 24 '17 at 09:22
  • Please take note that if you'd provided a [mcve] from the start, you may well have found the problem before even hitting "Post". Worth remembering for next time... – Jon Skeet May 24 '17 at 09:23
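
To illustrate the comment above that SHA-1 and HMAC-SHA1 are not the same thing, here is a minimal Python sketch. The key below is hypothetical and does not come from the question; the message is the first input from the question, typed as plain ASCII. As far as I know, .NET's parameterless HMACSHA1() constructor generates a random key, which would make the second C# method return a different value for each new instance; the two random keys at the end mimic that behaviour.

```python
import hashlib
import hmac
import os

message = b"callerId1495610997apiKey3*_&E#N@B1)O)-1Y"
key = b"hypothetical-secret-key"  # assumption: illustrative key, not from the question

# Plain SHA-1: depends only on the message bytes.
plain = hashlib.sha1(message).hexdigest()

# HMAC-SHA1: mixes a secret key into the computation, so the result differs.
keyed = hmac.new(key, message, hashlib.sha1).hexdigest()

# Two random keys give two different HMAC values for the same message.
keyed_a = hmac.new(os.urandom(64), message, hashlib.sha1).hexdigest()
keyed_b = hmac.new(os.urandom(64), message, hashlib.sha1).hexdigest()

print(plain == keyed)      # False
print(keyed_a == keyed_b)  # False
```

Only the plain SHA-1 variant corresponds to hashlib.sha1(...).hexdigest() in the API creator's snippet.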

1 Answer

C#:

public static string Hash(string input)
{
    using (SHA1Managed sha1 = new SHA1Managed())
    {
        var hash = sha1.ComputeHash(Encoding.UTF8.GetBytes(input));
        var sb = new StringBuilder(hash.Length * 2);

        foreach (byte b in hash)
        {
            sb.Append(b.ToString("x2")); // "x2" emits lowercase hex, matching Python's hexdigest()
        }

        return sb.ToString(); // already lowercase, so no ToLower() call is needed
    }
}

public static void Main()
{
    var x  ="callerId1495611935​apiKey{[B{+%P)s;WD5&5x";
    Console.WriteLine(Hash(x)); // prints 7e81e0d40ff83faf1173394930443654a2b39cb3
}

Python

import hashlib
s = u'callerId1495611935​apiKey{[B{+%P)s;WD5&5x'
enc = s.encode('utf-8') # encode in utf8
h = hashlib.sha1(enc)  # named h to avoid shadowing the built-in hash()
formatted = h.hexdigest()
print(formatted) # prints 7e81e0d40ff83faf1173394930443654a2b39cb3

Your main problem is that you are using different encodings for the same string in C# and Python. Use UTF-8 in both languages and the same letter case for the hex output; the results then match.
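
As a rough illustration of why the encoding matters, the sketch below hashes the same text under different encodings. Both strings are illustrative values rather than the inputs from the question. For ASCII-only text, UTF-8 and ASCII produce identical bytes and therefore identical hashes; once a non-ASCII character is present, the bytes and the hash diverge.

```python
import hashlib

def sha1_hex(data: bytes) -> str:
    """Lowercase hex SHA-1 digest of raw bytes."""
    return hashlib.sha1(data).hexdigest()

ascii_text = "callerId1495610997apiKey"  # illustrative, ASCII only
accented_text = "caf\u00e9"              # illustrative, contains one non-ASCII character

# ASCII-only text: UTF-8 and ASCII yield the same bytes, so the same hash.
same_for_ascii = (
    sha1_hex(ascii_text.encode("utf-8")) == sha1_hex(ascii_text.encode("ascii"))
)

# Non-ASCII text: UTF-8 and Latin-1 yield different bytes, so different hashes.
differs_for_non_ascii = (
    sha1_hex(accented_text.encode("utf-8")) != sha1_hex(accented_text.encode("latin-1"))
)

print(same_for_ascii)         # True
print(differs_for_non_ascii)  # True
```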

Note that inside your input string (between callerId1495611935 and apiKey{[B{+%P)s;WD5&5x) there is a hidden \u200b (zero-width space) character. That is why encoding your string in UTF-8 gives a different result than encoding it as ASCII. Is that character supposed to be in your string?
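
One way to spot such a character is to scan the string for code points outside the ASCII range. The sketch below is only an illustration: find_non_ascii and sha1_hex are hypothetical helper names, and the zero-width space is written as an escape sequence so that it is visible in the source.

```python
import hashlib
import unicodedata

def find_non_ascii(text):
    """Return (index, code point, Unicode name) for every non-ASCII character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) > 127
    ]

def sha1_hex(text):
    """Lowercase hex SHA-1 of the UTF-8 bytes of text."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

# Input as pasted (with the hidden character) and as retyped by hand (without it).
pasted = "callerId1495611935\u200bapiKey{[B{+%P)s;WD5&5x"
retyped = pasted.replace("\u200b", "")

print(find_non_ascii(pasted))   # [(18, 'U+200B', 'ZERO WIDTH SPACE')]
print(find_non_ascii(retyped))  # []
print(sha1_hex(pasted) == sha1_hex(retyped))  # False
```

Running a check like this on the exact string passed to the hash function, in both languages, shows whether the two programs are really hashing the same input.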

Federico Dipuma
  • I'm all for being explicit in general, but on my box at least it makes no difference (which is what I'd expect, given that the input is all-ASCII). `hashlib.sha1(s).hexdigest()` gives the same result. – Jon Skeet May 24 '17 at 08:12
  • @JonSkeet If you try to copy and paste the OP's original input string, you'll notice that between `callerId1495611935` and `apiKey{[B{+%P)s;WD5&5x` there is a hidden `\u200b` character (don't ask me why). – Federico Dipuma May 24 '17 at 08:21
  • Urgh - well spotted. That would explain things. I suspect that *shouldn't* be in the input. – Jon Skeet May 24 '17 at 08:22
  • @JonSkeet I agree. – Federico Dipuma May 24 '17 at 08:23
  • Thanks so much for this, very good catch! And thanks JonSkeet for your help as well, I'll try to fix this. :) – Markus Olsson May 24 '17 at 08:39
  • I can confirm that this was the issue. To fix this I erased the strings and retyped them manually back and everything went fine. Amazing that you found this issue, really. Thank you so much!! @FedericoDipuma – Markus Olsson May 24 '17 at 09:16