2

I have a bit of a strange problem. I'm running a web API on ASP.NET Core 3.1. One thing the API does is to generate a SHA256 from an input string

private static HashAlgorithm algorithm = SHA256.Create();
public static byte[] GetSha256Hash(string inputString)
{
    return algorithm.ComputeHash(Encoding.UTF8.GetBytes(inputString));
}

public static string GetSha256HashString(string inputString)
{
    var result = GetSha256Hash(inputString);
    // Return as hexadecimal string
    return string.Join(
        string.Empty,
        result.Select(x => x.ToString("x2")));
}


player.Id = GetSha256HashString($"{player.Name}-{player.Region}-{player.Country}");

I check that the fields player.Name, region and country and not null before I do the hashing. But sometimes, mostly seems to happen when the server is under heavy load (~100 requests/second), for a few requests (about ~0.5%) it generates

0000000000000000000000000000000000000000000000000000000000000000 for the player.id. Usually they look something like this a7da37b8a57ea714ec11f92c0025d6e654a483144eb68b829d388ea91eb81c79

Even if all the fields would be empty, the three dashes alone would ensure that this should not happen.

I do not see any exceptions and I am a bit out of ideas how to debug this. Any help is welcome! Let me know if I need to add any more information.

Env: Running in Docker on Linux.

silent
  • 14,494
  • 4
  • 46
  • 86
  • Could you specify at least an approximation for what you mean by "heavy load" ? – Pac0 Feb 14 '20 at 13:35
  • something around 100 requests/second – silent Feb 14 '20 at 13:36
  • But I'm not even sure if it is related to the load. I just observe it during my load test – silent Feb 14 '20 at 13:36
  • 1
    Yes. Even though technically 000....000 could be a hash, it looks more like a bug somewhere (either in the libraries used or from your program), since the probability would be very low for this specific hash. – Pac0 Feb 14 '20 at 13:38
  • yes, thats my assumption as well – silent Feb 14 '20 at 13:38
  • 2
    100rq/s are not heavy load, they are non measurable. Can you log the input string when sha256 is all 0's? – TomTom Feb 14 '20 at 13:38
  • 1
    yes I log the string. It looks all good as it should – silent Feb 14 '20 at 13:39
  • Try feeding that input string via this SHA-generating code with a debugger session connected and see if that's repeateable result for this specific input – quetzalcoatl Feb 14 '20 at 13:40
  • Maybe something else could overwrite Player.id after the hash is calculated ? Some race condition could be nasty here, and the "heavyload" would make the race condition more probable. (for instance, I'm thinking about some `async` Tasks not `await`ed) – Pac0 Feb 14 '20 at 13:40
  • @quetzalcoatl I did that already. When I run it again, all works fine – silent Feb 14 '20 at 13:41
  • @Pac0 no, the very next thing I do (by now) is to check if the string is all zeros – silent Feb 14 '20 at 13:41
  • 4
    I suppose your problem is a duplicate of https://stackoverflow.com/questions/26592596/why-does-sha1-computehash-fail-under-high-load-with-many-threads - in short, these objects _**are not thread safe**_ and you've got them cached as `static`. Separate those objects between threads, and the problem will probably be gone. Please let us know if it indeed was this issue. We can then close it as a duplicate. – quetzalcoatl Feb 14 '20 at 13:41
  • Don't use a static field to hold this, construct it locally when needed. These objects are not meant to be used by multiple threads. If you need to cache them, use a pooling system and grab one *exclusively* when needed and return it to the pool when you're done with it. – Lasse V. Karlsen Feb 14 '20 at 13:42
  • ah that makes sense. I kinda suspected that. I'll give that a try and report back. thanks! – silent Feb 14 '20 at 13:43
  • HA! good catch, I was confused by all the static here, but the hash is calculated with a NON-THREAD SAFE instance, not by a static method from cryptography. – Pac0 Feb 14 '20 at 13:43
  • And the static field means it is shared between multiple threads. – Lasse V. Karlsen Feb 14 '20 at 13:44
  • @LasseV.Karlsen which part should not be static? only this `private static HashAlgorithm algorithm`? I.e. the methods can be static?! – silent Feb 14 '20 at 13:44
  • Yes, the hash algo should be per-thread (or you can synchronize access to that object, but I suppose it's heavier than having separate instances) – quetzalcoatl Feb 14 '20 at 13:44
  • 3
    It does not matter how you store it. You cannot reuse the same instance on two different threads at the same time. Removing `static` will not necessarily fix anything, it's not that keyword that is the problem, it is that an instance is used by more than one thread at the same time. Any system you use to make sure you don't do that will do. You can go all the way from constructing a new, fresh, local, instance on-demand every time up to some form of reuse system, as long as each instance is used exclusively by only one thread at any given time. – Lasse V. Karlsen Feb 14 '20 at 13:44
  • got it. Any quick idea if this `using (var algorithm = SHA256.Create()) return algorithm.ComputeHash(...)` is a good idea performance-wise or is the Create() operation so heavy that it's better to re-use in a thread-safe way? – silent Feb 14 '20 at 13:47
  • @LasseV.Karlsen quick tests confirms that this indeed solved it! Thanks so much. If you want to make an answer out of it, I'll accept it – silent Feb 14 '20 at 13:59
  • 1
    ah, actually it was @quetzalcoatl who found it :) – silent Feb 14 '20 at 14:00

2 Answers2

4

The object you are setting as:

private static HashAlgorithm algorithm = SHA256.Create();

is not thread safe.

In your case it really depends on the accessor, instance or static and also the runtime (OS, netcore, etc.)

To be sure that it is really working without having troubles and not creating different / wrong hashes for the same source you have to create a new SHA256.Create() all the time you want to create a hash.

public static string GenerateSHA256(string input)
{
    var bytes = Encoding.Unicode.GetBytes(input);
    using (var hashEngine = SHA256.Create())
    {
        var hashedBytes = hashEngine.ComputeHash(bytes, 0, bytes.Length);
        var sb = new StringBuilder();
        foreach (var b in hashedBytes)
        {
            var hex = b.ToString("x2");
            sb.Append(hex);
        }
        return sb.ToString();
    }
}

You can also find this in some documentations of Microsoft.

Any public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe.

.NET Fiddle

SO: instance-members-thread-unsafe-vs-public-static

Git: Discussion about this topic

Martin
  • 3,096
  • 1
  • 26
  • 46
-1
public static string GetSha256HashString(string inputString)
{
    SHA256CryptoServiceProvider sha256 = new SHA256CryptoServiceProvider();
    byte[] B1;
    byte[] B2;
    B1 = UTF8Encoding.UTF8.GetBytes(inputString.Trim());
    B2 = sha256.ComputeHash(B1);
    string HashString = BitConverter.ToString(B2);
    return HashString;
}
  • 7
    Code-only answers without explanation is discouraged here on Stack Overflow. Please provide a textual description of what you have done and why as well. – Lasse V. Karlsen Feb 14 '20 at 14:04
  • 5
    `SHA256CryptoServiceProvider` implements `IDisposable` so you should use a `using` block to ensure it is disposed correctly. – Alsty Feb 14 '20 at 14:08
  • Yup. What Alsty said! Especially as it may contain internal reference to CAPI or other unmanaged resources. – quetzalcoatl Feb 14 '20 at 14:12
  • @homayon you did not participate in the discussion above. So I’m sorry, but I won’t accept your answer – silent Feb 14 '20 at 14:43
  • 2
    Welcome to Stack Overflow! While this code may solve the question, [including an explanation](//meta.stackexchange.com/q/114762) of how and why this solves the problem would really help to improve the quality of your post, and probably result in more up-votes. Remember that you are answering the question for readers in the future, not just the person asking now. Please [edit] your answer to add explanations and give an indication of what limitations and assumptions apply. [From Review](https://stackoverflow.com/review/low-quality-posts/25356651) – double-beep Feb 14 '20 at 20:40