1666

In .NET, the GetHashCode method is used in a lot of places throughout the .NET base class libraries. Implementing it properly is especially important for finding items quickly in a collection and when determining equality.

Is there a standard algorithm or best practice on how to implement GetHashCode for my custom classes so I don't degrade performance?

poke
bitbonk
  • 48
    After reading this question and the article below, I could implement an override of `GetHashCode`. I hope it will be helpful for others. [Guidelines and rules for GetHashCode written by Eric Lippert](http://blogs.msdn.com/b/ericlippert/archive/2011/02/28/guidelines-and-rules-for-gethashcode.aspx) – rene Mar 22 '12 at 21:59
  • 6
    "or to determine equality": no! Two objects with the same hashcode are not necessarily equal. – Thomas Levesque Sep 02 '15 at 22:03
  • 4
    @ThomasLevesque You are right, two objects with the same hash code are not necessarily equal. But still `GetHashCode()` is used in very many implementations of `Equals()`. That's what I meant with that statement. `GetHashCode()` inside `Equals()` is often used as a shortcut to determine *inequality*, because if two objects have a *different* hash code they have to be objects that are not equal and the rest of the equality check doesn't have to be executed. – bitbonk Sep 02 '15 at 22:27
  • 7
    @bitbonk Usually, both `GetHashCode()` and `Equals()` need to look at all fields of both objects (Equals has to do this if the hashcodes are equal or not checked). Because of this, a call to `GetHashCode()` inside `Equals()` is often redundant and could reduce performance. `Equals()` may also be able to short circuit, making it much faster - however in some cases the hashcodes may be cached, making the `GetHashCode()` check faster and so worthwhile. See [this question](http://stackoverflow.com/questions/37724752/) for more. – NotEnoughData Apr 02 '17 at 03:52
  • 12
    UPDATE JAN 2020: Eric Lippert's blog located at: https://learn.microsoft.com/en-us/archive/blogs/ericlippert/guidelines-and-rules-for-gethashcode – Rick Davin Jan 15 '20 at 14:06
  • 3
    UPDATE MAR 2020: The link from @RickDavin is correct, but the article at learn.microsoft.com has bad formatting. Here's the same article on Eric's blog. https://ericlippert.com/2011/02/28/guidelines-and-rules-for-gethashcode/ – zumalifeguard Mar 19 '20 at 10:29
  • 3
    Now you can just use HashCode.Combine(field1, field2,...) – Ε Г И І И О Apr 23 '20 at 15:23

22 Answers

1799

I usually go with something like the implementation given in Josh Bloch's fabulous Effective Java. It's fast and creates a pretty good hash which is unlikely to cause collisions. Pick two different prime numbers, e.g. 17 and 23, and do:

public override int GetHashCode()
{
    unchecked // Overflow is fine, just wrap
    {
        int hash = 17;
        // Suitable nullity checks etc, of course :)
        hash = hash * 23 + field1.GetHashCode();
        hash = hash * 23 + field2.GetHashCode();
        hash = hash * 23 + field3.GetHashCode();
        return hash;
    }
}

As noted in comments, you may find it's better to pick a large prime to multiply by instead. Apparently 486187739 is good... and although most examples I've seen with small numbers tend to use primes, there are at least similar algorithms where non-prime numbers are often used. In the not-quite-FNV example later, for example, I've used numbers which apparently work well - but the initial value isn't a prime. (The multiplication constant is prime though. I don't know quite how important that is.)
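For instance, a sketch of the same pattern using the larger multiplier mentioned above (the field names are placeholders, just as in the snippet above):

public override int GetHashCode()
{
    unchecked // Overflow is fine, just wrap
    {
        int hash = 17;
        hash = hash * 486187739 + field1.GetHashCode();
        hash = hash * 486187739 + field2.GetHashCode();
        hash = hash * 486187739 + field3.GetHashCode();
        return hash;
    }
}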

This is better than the common practice of XORing hashcodes for two main reasons. Suppose we have a type with two int fields:

XorHash(x, x) == XorHash(y, y) == 0 for all x, y
XorHash(x, y) == XorHash(y, x) for all x, y

By the way, the earlier algorithm is the one currently used by the C# compiler for anonymous types.
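To make the XOR pitfall concrete, here's a quick sketch (the Point type is purely illustrative; Equals is omitted for brevity):

// Hypothetical type using the XOR approach
struct Point
{
    public int X, Y;
    public override int GetHashCode() => X ^ Y;   // XorHash
}

// Problem 1: any (x, x) pair hashes to 0
//   new Point { X = 5, Y = 5 }.GetHashCode()   // 0
// Problem 2: swapping the fields gives the same hash
//   new Point { X = 1, Y = 2 }.GetHashCode()   // 3
//   new Point { X = 2, Y = 1 }.GetHashCode()   // 3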

This page gives quite a few options. I think for most cases the above is "good enough" and it's incredibly easy to remember and get right. The FNV alternative is similarly simple, but uses different constants and XOR instead of ADD as a combining operation. It looks something like the code below, but the normal FNV algorithm operates on individual bytes, so this would require modifying to perform one iteration per byte, instead of per 32-bit hash value. FNV is also designed for variable lengths of data, whereas the way we're using it here is always for the same number of field values. Comments on this answer suggest that the code here doesn't actually work as well (in the sample case tested) as the addition approach above.

// Note: Not quite FNV!
public override int GetHashCode()
{
    unchecked // Overflow is fine, just wrap
    {
        int hash = (int) 2166136261;
        // Suitable nullity checks etc, of course :)
        hash = (hash * 16777619) ^ field1.GetHashCode();
        hash = (hash * 16777619) ^ field2.GetHashCode();
        hash = (hash * 16777619) ^ field3.GetHashCode();
        return hash;
    }
}

One thing to be aware of: ideally, you should prevent your equality-sensitive (and thus hashcode-sensitive) state from changing after adding the object to a collection that depends on the hash code.

As per the documentation:

You can override GetHashCode for immutable reference types. In general, for mutable reference types, you should override GetHashCode only if:

  • You can compute the hash code from fields that are not mutable; or
  • You can ensure that the hash code of a mutable object does not change while the object is contained in a collection that relies on its hash code.
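As an illustration of that rule, here's a small sketch (the MutableKey type is made up) of the failure mode it guards against:

using System;
using System.Collections.Generic;

class MutableKey
{
    public int Value { get; set; }
    public override int GetHashCode() => Value;
    public override bool Equals(object obj) => obj is MutableKey other && other.Value == Value;
}

class Demo
{
    static void Main()
    {
        var key = new MutableKey { Value = 1 };
        var set = new HashSet<MutableKey> { key };

        key.Value = 2; // hash-sensitive state changes while the key is in the set

        // Typically prints False: the set looks in the bucket for the new hash code,
        // but the entry was stored under the old one.
        Console.WriteLine(set.Contains(key));
    }
}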

The link to the FNV article is broken but here is a copy in the Internet Archive: Eternally Confuzzled - The Art of Hashing

Jalal
Jon Skeet
  • 9
    The algorithm described in the book you mention is in fact a little more detailed; it especially describes what to do for different data types of the fields. E.g. for fields of type long, use (int)(field ^ (field >>> 32)) instead of simply calling GetHashCode. Is long.GetHashCode implemented that way? – bitbonk Nov 04 '08 at 21:44
  • 15
    Yup, Int64.GetHashCode does exactly that. In Java that would require boxing, of course. That reminds me - time to add a link to the book... – Jon Skeet Nov 04 '08 at 21:51
  • 85
    23 is no good choice, since (as of .NET 3.5 SP1) `Dictionary` assumes good distribution modulo certain primes. And 23 is one of them. So if you have a dictionary with capacity 23, only the last contribution to `GetHashCode` influences the compound hashcode. So I'd rather use 29 instead of 23. – CodesInChaos Nov 21 '10 at 22:41
  • @Ani: Your implementation allocates several new objects on the heap, so the performance might be lower than with a manual implementation. Whether that's acceptable depends on your type and usage. Check some of the other answers for helpers using generics which avoid this problem. – CodesInChaos Nov 21 '10 at 22:43
  • 28
    @CodeInChaos: Only the last contribution influences the bucket - so it might, at worst, have to look through *all 23* entries in the dictionary. It's still going to check the actual hash code of each entry, which will be cheap. If you've got a dictionary that small, it's unlikely to matter much. – Jon Skeet Nov 21 '10 at 23:14
  • 3
    @Jon: I have to ask, despite already having opened [my own question](http://stackoverflow.com/q/4654227/482691) on the topic, but what's a good VB version of this since VB lacks the `checked` and `unchecked` keywords? I tried making tmpHash an Int64 and AND'ing the lower 8 bits (per the [accepted answer](http://stackoverflow.com/questions/4654227/overriding-gethashcode-in-vb-without-checked-unchecked-keyword-support/4656890#4656890) to my question), but on a sufficiently large set of fields, it somehow caused the calculation to wrap to 0 for the remainder of the loop. – Kumba Jan 18 '11 at 05:05
  • 1
    @Kumba: I don't know how I'd do this in VB, I'm afraid. Is arithmetic *always* checked in VB? Could you have a separate class library which you could delegate the arithmetic to, either written in C# or with checked arithmetic turned off project-wide? – Jon Skeet Jan 18 '11 at 07:08
  • 3
    @Jon: VB apparently checks a *lot* of things. It's got a fetish for demanding that unsigned numerics get converted into signed numerics before letting you left or right shift them. Which is driving me up the wall and across the ceiling. I'm trying to implement the Jenkins hash to work around the lack of checked/unchecked (a rotating hash also does the trick, but I'm worried about hash collisions with input). Using a separate, C# library is something I'd like to avoid, because it essentially admits defeat. If I get to that point, I should just re-write the entire project in C#. – Kumba Jan 18 '11 at 07:28
  • 3
    isn't the 'unchecked' unnecessary b/c CLR will happily overflow by default? – pomeroy Jan 18 '11 at 15:22
  • 8
    @pomeroy: It depends on what the project settings are. Basically you give the assembly a default context of checked or unchecked. – Jon Skeet Jan 18 '11 at 15:47
  • 3
    @pomeroy: VB isn't as granular as C# is. Because it lacks the above two keywords, your only options are to remove integer overflow checks for the entire project or not. I guess if your project is complete and generally well tested, removing the overflow checks is a safe thing to do. However, while building it and debugging, those checks are good because they'll highlight bugs to fix. I opened Connect Ticket # 636564 with Microsoft to recommend including checked/unchecked keyword support in the next .NET release. Uncertain if they'll do so, though. – Kumba Jan 18 '11 at 20:38
  • I'll add that I'll have to go with the rotating hash algorithm linked to in Jon's answer above. It doesn't overflow, even on an Int32, doesn't (so far) wrap to 0 on a large number of fields in the calculation, and is simple and rather speedy. The Jenkins hash didn't work out...Even that overflows at random, depending on the input. Also, the forcing of bitshifts happening in signed math get in the way of a lot of things. I might open another bug on that, unless that's intended behavior somehow. – Kumba Jan 18 '11 at 20:41
  • Don't you need `override` in your method declaration? Would also be good to put in null checks since this is such a well-used example. – Rory Feb 05 '11 at 10:16
  • @Rory: I've added the override, thanks - I'm not going to put in the null checks, as I feel that would obscure the important points. The comment suffices IMO. – Jon Skeet Feb 05 '11 at 10:23
  • 1
    Why start with a prime instead of just zero? does `int hash = 17;` have any theoretically supported benefits? – fredoverflow Feb 06 '11 at 14:41
  • 2
    @FredOverflow: I don't know the exact details of all the reasons behind it, but starting with 0 would mean that the hash would stay as zero if the individual field hashes were zero... and that's probably not uncommon (e.g. an integer of value zero will probably hash to zero). Just a guess, but I suspect that having a constant which gets propagated with each field is useful. This is really just copied from Effective Java though :) – Jon Skeet Feb 06 '11 at 16:59
  • @JonSkeet How collision-safe would this algorithm be for a complex object graph consisting of, let's say, 500 objects each having 10 properties? Related question: http://stackoverflow.com/questions/5308057/generating-an-safe-hashcode-for-an-objectgraph/5308237 – bitbonk Mar 15 '11 at 14:04
  • @bitbonk: The chances of a collision on any single change would be pretty low... but in the question you're talking about I'd probably use a cryptographic hash instead. – Jon Skeet Mar 15 '11 at 14:09
  • Then the questions stands, how do I create a cryptographic hash on an object model? – bitbonk Mar 15 '11 at 16:51
  • 1
    @bitbonk: I would strongly consider using a "normal" cryptographic hash on the result of binary serialization of form. – Jon Skeet Mar 15 '11 at 17:55
  • 1
    This algorithm is basically the DJB2 string hashing algorithm, for which the constants 5381 and 33 are recommended (http://www.cse.yorku.ca/~oz/hash.html). To be honest I'm not sure the constant makes much difference, but the multiplier is important. – yoyo Dec 16 '11 at 18:56
  • @JonSkeet I realize I'm raising the dead here, but implementing hashes is a new thing for me. In your implementation, what fields am I including in the hash? Only immutable ones, or are any fields good? – KChaloux Oct 25 '12 at 22:06
  • 2
    @KChaloux: That entirely depends on what you want equality to mean. Normally it's a bad idea to include mutable data though. – Jon Skeet Oct 25 '12 at 22:08
  • 2
    How would you handle nullity? If simply ignoring that field, then for A = null, B = "ss" and for A = "ss", B = null we would have collisions. Isn't it better to multiply each field by a different prime number? – Vajda Jan 22 '13 at 16:33
  • 25
    @Vajda: I usually use 0 as the effective hash code for `null` - which isn't the same as ignoring the field. – Jon Skeet Jan 22 '13 at 16:49
  • @jnm2: I don't follow your argument, to be honest. In particular, I've just tried this effectively hashing 10 fields - and changing the value of *just* the first field still changed the hash, contradicting your claim that "every bit of the first hash codes will be lost". – Jon Skeet Nov 20 '13 at 14:50
  • You can demonstrate this gives a poor distribution quite simply. Take this variant of FNV and apply it to strings (use unsafe pointer manipulation to get integers at a time to give it a fair chance). Use it to add strings to a power-of-two based hash table. With the one I'm working on now, if I generate "1", "2", ... "999999" and add them in, it takes about 34 seconds. Now take this same hash method and re-hash the result with a well-distributed hash. With a good hash this can only make things worse (more time spent, and we could introduce new collisions but never remove them). With the... – Jon Hanna Jan 14 '14 at 10:23
  • ... same hash table I'm working on, the same code to generate "1"..."999999" and add them takes 1 second. The effect is less pronounced with prime-number based hashes, so in that case the extra time spent re-hashing (and perhaps the reduction in possible outcomes, though that's unlikely) doesn't gain anything, but the poor performance on power-of-two tables demonstrates poor distribution generally. – Jon Hanna Jan 14 '14 at 10:31
  • @JonHanna: Thanks for that. Not sure what you mean by "to get integers at a time" but I'll try to have a closer look. I still like this as a first approximation for a hash, but if you have another hash which is as simple to remember and get right but with a better distribution, I'd be very happy to change my practice :) – Jon Skeet Jan 14 '14 at 11:10
  • I meant I used `fixed(char* ptr = str){int* iPtr = (int*)ptr;...` but I also tried just doing `foreach(char c in str)` and casting each `char` to `int`, and the same applies. The relative weakness became apparent to me when I had reason to use power-of-two tables and was getting poor results (I used to use much the same as above, myself). The solution I finally hit upon is to forget about being easy to remember, and to build a hard-to-remember method once, and then make it easy to use and put it the code at https://www.nuget.org/packages/SpookilySharp I'll add a full answer here at lunchtime. – Jon Hanna Jan 14 '14 at 11:24
  • @JonSkeet and now answered. – Jon Hanna Jan 14 '14 at 14:33
  • @JonHanna: Thanks for that. Will have to look in more detail when I have a bunch of time :) – Jon Skeet Jan 14 '14 at 14:51
  • 1
    I think it is important to point out that we should be careful with the hash code changing at run-time. We had a bug in my project because the previous developer had implemented a GetHashCode algorithm based on this answer. But in his implementation he had a list of objects, and he used the hash of each item in the collection to generate the object hash code. Therefore, when the collection changed, the hash code changed also. That was causing binding problems in WPF. And if you had the object in a dictionary, for example, you would get errors also. – Dzyann Feb 14 '14 at 16:11
  • 1
    @Dzyann: Yes, mutating the key in a way which affects equality - and thus the hash code - is always a bad idea. I'll add a note. – Jon Skeet Feb 14 '14 at 16:59
  • @JonSkeet you are right, and it can lead to very hard to track bugs. Like in this case with the bindings of WPF. It took ages until one of my coworkers found the cause of it and solved it. Since It wasn't our code it was very challenging. – Dzyann Feb 14 '14 at 18:32
  • I would suggest that you change 17 and 23 to the constants [here](http://eternallyconfuzzled.com/tuts/algorithms/jsw_tut_hashing.aspx#fnv). (Thanks for the link.) It gave a simple dictionary lookup much higher performance, in my case, ~60% better. – jnm2 Apr 23 '14 at 17:04
  • @jnm2: That's not the same algorithm to start with - it's using XOR rather than ADD. I'll stick with these constants for this answer, but perhaps you should add your own answer? – Jon Skeet Apr 23 '14 at 17:07
  • Actually, I was going to suggest that xoring rather than adding wouldn't diminish the simplicity as a go-to hash algorithm. What do you think? – jnm2 Apr 23 '14 at 17:08
  • XOR ends up making GetHashCode() faster by 12% in my case. – jnm2 Apr 23 '14 at 17:17
  • 1
    @jnm2: Well it wouldn't diminish that simplicity - but it's not what I've done for the past several years. I'll add FNV as an alternative though. – Jon Skeet Apr 23 '14 at 17:38
  • `int hash = 2166136261;` Is there a cast missing? Compiler says that `2166136261` is a `uint`... I changed it to `int hash = (int)2166136261;` – Roman Ganz Apr 24 '14 at 12:52
  • I tried to implement this approach for [ValueUtils](https://github.com/EamonNerbonne/ValueUtils), but in my testing this variant of FNV caused considerable collisions (24%) in certain symmetrical datasets. And perhaps that's because this is NOT actually the FNV hash? Traditional FNV hashes per octet (byte), not per 32-bit word. That gives this variant less opportunity to mix the bits around... – Eamon Nerbonne Jun 01 '14 at 14:15
  • @EamonNerbonne: What do you mean by "this approach"? The answer now contains two different versions... – Jon Skeet Jun 01 '14 at 15:53
  • I mean this variant of FNV - it's not quite FNV, and I'm pretty sure that makes matters worse. Incidentally, I also tried the `h=prime; repeat h=h*prime + ?` recipe; that seems to vary; it does quite well for larger primes, especially if your intermediate is 64-bit wide. – Eamon Nerbonne Jun 01 '14 at 19:27
  • @Eamon: I don't know enough about the theory to comment further, I'm afraid :( – Jon Skeet Jun 01 '14 at 19:50
  • Yeah, the theory behind it is not at all obvious to me. However - this answer suggests that this implementation is FNV, a well-known good hash. But that's not really true, since this is *not* FNV. Also, FNV is a string hash algorithm, which needs to satisfy much trickier requirements in that it needs to work for variable length potentially long strings. But again, the algorithm currently in the answer is not FNV - it mixes the bits much less well. – Eamon Nerbonne Jun 01 '14 at 19:58
  • @EamonNerbonne: Okay. I'll edit to indicate that it's a modification, and that it doesn't work as well in at least some cases. – Jon Skeet Jun 01 '14 at 19:59
  • @EamonNerbonne: What are the best coefficients that you know of? – jnm2 Jun 03 '14 at 12:09
  • @jnm2 In my experiments, the offset matters little, and the trend is that larger primes perform better, with the caveat that this is all tricky to test because it's slow (really slow) to be thorough, and it depends on the way in which your dataset is "messed up". If your fields have perfectly randomly distributed hashcodes - none of this matters, but of course, in the real world, those hash codes aren't random, and fields are correlated. There's a fairly good reason why large primes would be better too - they mix bits better, especially if your data consists largely of small numbers. – Eamon Nerbonne Jun 03 '14 at 13:38
  • 1
    @jnm2, so I'd pick a largish prime (on the order of 2^16, say) and to tune for .NET's dictionary implementation, one that's NOT used by Dictionary<,>: http://referencesource.microsoft.com/#mscorlib/system/collections/hashtable.cs#19337ead89202585 – Eamon Nerbonne Jun 03 '14 at 13:45
  • 1
    @jnm2 I came across these two questions further exploring this issue: http://stackoverflow.com/questions/1835976/what-is-a-sensible-prime-for-hashcode-calculation and http://stackoverflow.com/questions/1145217/why-should-hash-functions-use-a-prime-number-modulus and both conclude: use any old large prime. The accepted answer in the first question mentions two chosen in a principled way - but it's not likely a principle that really relates to the real world, so it still recommends the basic idea: pick a large prime, NOT 23 or 31. – Eamon Nerbonne Jun 04 '14 at 15:48
  • BTW: note that the offset is (as far as I can tell) completely pointless. Distributive laws also hold under modulo, and that means it's simply an identical offset all objects will share - that certainly has no impact on any hashtable I know. – Eamon Nerbonne Jun 04 '14 at 15:52
  • @EamonNerbonne: I guess that's true if all the objects are of the same type. If you've got a dictionary where some keys are subclasses of other keys, that could make a difference... although only when the extra field values are 0 anyway. Again, this is mostly just habit for me :( – Jon Skeet Jun 04 '14 at 16:01
  • @JonSkeet Yeah, if you have objects of differing type and you use differing offsets, you'll have some advantage. No reason to be prime, though, I think... In any case, an addition is so cheap, there's not much reason to avoid it either. – Eamon Nerbonne Jun 04 '14 at 16:04
  • I used this algorithm for a pseudo-random generator, and it behaves a little strangely: http://stackoverflow.com/questions/26847262/why-doesnt-this-repeatable-random-algorithm-work – Max Yankov Nov 10 '14 at 15:35
  • 4
    If you got the number 486187739 from http://stackoverflow.com/a/2816747/21499 - I actually intended to recommend 92821. – Dr. Hans-Peter Störr Apr 01 '15 at 11:35
  • As each instance of class 'object' has a unique hash code, this idea came to my mind that it would be good if we use base.GetHashCode() as a seed or something to produce our hash code for the object. – Ahmad Siavosh Aug 05 '15 at 08:06
  • 4
    @AhmadSiavosh: No, that's a *bad* idea, because you want distinct but equal objects to have the same hash code. (I don't think object.GetHashCode is guaranteed to be unique, either. It may well be "very unlikely to collide" but that's not the same thing.) – Jon Skeet Aug 05 '15 at 08:13
  • If a `fieldL` is a `List`, will it work by just doing `hash = ... ^ fieldL.GetHashCode()`, or do I have to go through the items like `foreach(){hash = ... ^ item.GetHashCode()}`? – Jaider Feb 12 '16 at 16:14
  • @Jaider: It won't do either. `List` doesn't override `Equals` or `GetHashCode`. – Jon Skeet Feb 12 '16 at 16:29
  • I tried this code for 3 doubles, and I get an enormous amount of collisions. I need to get hashcodes for 4194304 tuples. Is there a better way? Using some larger primes helped a little bit, but I am still getting collisions. – dmarra Feb 16 '16 at 00:58
  • @user984444: Well you should expect quite a few collisions with that many entries. Just how many are you getting? – Jon Skeet Feb 16 '16 at 06:07
  • @JonSkeet It's hard to say. I am using this to cache the output of some perlin noise, and the indicator of collision is some "interesting" output in my image; it has the appearance of... when you win solitaire. This gets mitigated (and the pattern changes) with larger primes. Very unhelpful I know. I have changed my struct (tuple of doubles as the key) into a class so that .NET takes care of the hashcode for me, and it has no collisions anymore. – dmarra Feb 16 '16 at 15:43
  • @user984444: Um, that way equal keys won't be equal though, unless you've overridden `GetHashCode` in your class, in which case you've got the same problem. Maybe it would be worth you asking a new question with all the details... – Jon Skeet Feb 16 '16 at 15:44
  • @JonSkeet: Not true; the default implementation of GetHashCode is working perfectly well (If it wasn't, it would be incredibly obvious in my end result). It works for a struct as well, but is WOEFULLY SLOW. I wanted to use structs, but using a class seems to work just fine for my use case. – dmarra Feb 16 '16 at 20:02
  • 1
    @user984444: Unless you override `GetHashCode` and `Equals` yourself, or inherit from another class which does so, you will get reference equality. That's *not* what the struct will give you. It really, really sounds like we need a new post here with details. – Jon Skeet Feb 16 '16 at 20:43
  • @JonSkeet: I consider my particular problem solved because I am getting the desired result, but if I get a chance, I'll post a question with details so you can see what is going on. – dmarra Feb 17 '16 at 02:04
  • 1
    Being really picky, StyleCop's default settings generate a warning for this code (SA1407) as you have not used parentheses to determine the precedence of the arithmetic operators, even though it is clear to any dev reading the code, and to the compiler, as we all know the BODMAS rule. – MikeW Mar 30 '16 at 09:32
  • 1
    @MikeW: I don't think BODMAS includes XOR :) I think the final piece of code would be clearer with parentheses - will add them now. I agree that the multiply-and-add version doesn't need them. – Jon Skeet Mar 30 '16 at 09:34
  • 4
    For future readers: Consider using [`HashCode.Combine()`](https://github.com/dotnet/coreclr/pull/14863) – RobIII Nov 23 '17 at 20:45
  • @JonSkeet any idea how to go about this in T-SQL? I need a C# hash of a guid series to match a T-SQL hash of a uniqueidentifier series, but AFAIK in T-SQL it's not possible to wrap results of integer arithmetic. – BaltoStar Mar 01 '18 at 22:22
  • @BaltoStar: I don't know anything about hashing in T-SQL. If it already provides well-defined hashing for GUID values, I'd probably try to emulate that in C# rather than the other way round. – Jon Skeet Mar 02 '18 at 06:57
  • @JonSkeet in C# why not just MD5 hash the ordered concatenation of the GUIDs ? – BaltoStar Mar 02 '18 at 12:41
  • 1
    @JamesKo: I'll add a link to `HashCode.Combine` when .NET Core 2.1 has actually been released, and I can link to docs. I don't think it'll be useful to many people until that time. – Jon Skeet Mar 16 '18 at 07:21
  • @JonSkeet Sure. – James Ko Mar 16 '18 at 22:07
  • 1
    I'm unsure how to handle the nulls here. None of the answers really seem to touch that topic, just assuming we are all experts on the topic. @JonSkeet Mentions in these comments "I usually use 0 as the effective hash code for null - which isn't the same as ignoring the field." How that is actually implemented though, I have questions about. It sounds like you are saying that a null property should zero out the current hash value, but that seems like odd behavior. It might be obvious to some what to do, but I'd appreciate an example showing how to handle the nulls, or a better explanation. – crush May 07 '18 at 14:08
  • 1
    After reading a few other Q&A's on the topic, I realized that I didn't comprehend what @JonSkeet was saying very well. I misunderstood him to be saying that I should substitute 0 as the hashing constant when the property is null. After seeing an example [here](https://stackoverflow.com/a/31220250/1195273), I realize he was merely stating that I should substitute 0 as the property's hash code, which seems so obvious now...considering that is exactly what he said. – crush May 07 '18 at 14:18
  • Does it really need to use primes like 17 or 23 if my object's hash only depends on one `int32` property? Can I just return `MyProperty.GetHashCode()`? – dragonfly02 May 14 '18 at 17:03
  • @stt106: For just a single property, I'd just return that property's hash code, yes. – Jon Skeet May 14 '18 at 17:04
  • 4
    FYI, Visual Studio 2017 can generate `GetHashCode()` without ReSharper. https://learn.microsoft.com/en-us/visualstudio/ide/reference/generate-equals-gethashcode-methods?view=vs-2017 – cactuaroid Oct 27 '18 at 13:59
  • Why multiply hash on every line? Why: `int hash = 17; hash = hash * 23 + ...` ? Why not just use the product explicitly, as in, `hash = 391 + field1.GetHashCode();` ? Since order of operations will do the multiplication first anyway? – emery.noel Oct 08 '19 at 13:06
  • 2
    @emery.noel: That won't make any difference after the first line (you'd still need to multiply, to include the previous hash) and there's a big benefit IMO in making every line consistent. – Jon Skeet Oct 08 '19 at 13:58
  • 1
    Not much notice has been taken of an important point. It is important that the returned hash code NOT CHANGE if the object is mutable and the object changes. This is because the hash code is used (for example) to place objects in dictionaries. If a mutable object changes after insertion in a dictionary then the object is not found when you go to look it up. The above should cache the hash the first time it is computed and always return the original value. Otherwise you will have strange bugs. – Tb. Apr 08 '20 at 17:11
  • 2
    @Tb.: Or you document it, as per the docs: "If you do choose to override GetHashCode() for a mutable reference type, your documentation should make it clear that users of your type should not modify object values while the object is stored in a hash table." Often that's a useful thing to do, as you might choose to construct an object but not mutate it afterwards. It's not "squeaky clean" but it can be perfectly practical. – Jon Skeet Apr 08 '20 at 17:46
  • 2
    The link to the FNV article is broken but I found it in the archive: https://archive.vn/KJeJy – Jalal Feb 19 '21 at 07:54
  • @JonSkeet Which version of GetHashCode() is better, the one using 17/23 or the one using long numbers? Can anyone help out? – Pingpong Aug 20 '21 at 00:11
  • @Pingpong: "Better" has any number of axes, and will depend on the source data. I don't even know which one you mean by "the one using long numbers" (there are likely to be several, and I'm not going to trawl over ever answer here finding candidates). I would personally just start with the simple one here - or use `HashCode.Combine` if you can. – Jon Skeet Aug 20 '21 at 06:06
547

ValueTuple - Update for C# 7

As @cactuaroid mentions in the comments, a value tuple can be used. This saves a few keystrokes and, more importantly, executes purely on the stack (no garbage):

(PropA, PropB, PropC, PropD).GetHashCode();

(Note: The original technique using anonymous types seems to create an object on the heap, i.e. garbage, since anonymous types are implemented as classes, though this might be optimized out by the compiler. It would be interesting to benchmark these options, but the tuple option should be superior.)
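For example, a minimal sketch of the tuple approach inside a class (the Order type and its properties are made up for illustration):

public class Order
{
    public int Id { get; set; }
    public string Customer { get; set; }

    // ValueTuple combines the element hash codes without allocating on the heap.
    public override int GetHashCode() => (Id, Customer).GetHashCode();

    // Keep Equals consistent with GetHashCode; tuple equality compares element-wise.
    public override bool Equals(object obj) =>
        obj is Order other && (Id, Customer).Equals((other.Id, other.Customer));
}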

Anonymous Type (Original Answer)

Microsoft already provides a good generic HashCode generator: Just copy your property/field values to an anonymous type and hash it:

new { PropA, PropB, PropC, PropD }.GetHashCode();

This will work for any number of properties. It does not use boxing. It just uses the algorithm already implemented in the framework for anonymous types.

Rick Love
  • 90
    Yes, the anonymous type's `GetHashCode` implementation is very effective (BTW it's the same as the one in Jon Skeet's answer), but the only problem with this solution is that you generate a new instance on every `GetHashCode` call. It can be a bit of an overhead, in particular in case of intensive access to big hashed collections... – digEmAll Jan 08 '11 at 09:50
  • 3
    This works in VB w/ .NET 4.0, but looking over the IL, it's using `box` calls since the type uses generics. No unboxing, but from my reading here, the mere presence of boxing suggests this can be a little inefficient. Seems like the only choice for VB, though, since there is no equivalent of `checked`/`unchecked`. – Kumba Jan 11 '11 at 09:37
  • 5
    @digEmAll Good point, I didn't think about the overhead of creating an new object. Jon Skeet's answer is the most efficient and won't use boxing. (@Kumba To solve the unchecked in VB, just use a Int64 (long) and truncate it after the calculations.) – Rick Love Apr 02 '11 at 17:30
  • 2
    In VB.Net: `New With {PropA, PropB, PropC, PropD}.GetHashCode()` – mwolfe02 Apr 16 '13 at 15:40
  • 20
    VB.NET must use Key in anonymous type creation: `New With {Key PropA}.GetHashCode()` Otherwise GetHashCode will not return the same hashcode for different objects with the same 'identifying' properties. – David Osborne Aug 20 '14 at 15:58
  • 2
    Don't forget to enumerate your IEnumerables or bad things will happen. `new { PropA, PropB, C = PropC.ToList() }.GetHashCode()` – Keith Oct 19 '15 at 20:16
  • 4
    @Keith in that case, I would consider saving the IEnumerable as a list value somewhere instead of enumerating it each time the hashcode is calculated. Calculating ToList each time inside GetHashCode could hurt performance in many situations. – Rick Love Oct 20 '15 at 20:40
  • And don't forget that private property/fields are not needed in this case ;). – shA.t Aug 29 '17 at 08:22
  • 2
    @Keith: A hash code does not have to be affected by *all* properties of an object. A hash code merely needs to give *good enough* distribution of your objects. And should be *fast* to compute. Leave out the enumerable. And if you have a list, don't include the whole list. Use `Count`, and perhaps first element (use zero if no elements). unless your class doesn't have much variation other than the list; in that case, caching a hash of the list, as Rick suggests, is best. Recall that, by definition, an object's hash must always be the same. If collection changes, DO NOT include it in hash calc. – ToolmakerSteve Mar 01 '18 at 04:15
  • 8
    For those who like this, `(PropA, PropB, PropC, PropD).GetHashCode()` is now available on C#7 without GC pressure @digEmAll concerns. [Quick and Simple Hash Code Combinations](https://stackoverflow.com/questions/1646807/quick-and-simple-hash-code-combinations/47480816#47480816) – cactuaroid Aug 16 '18 at 11:59
  • @cactuaroid Nice! So using a tuple (which is a struct) instead of the anonymous type (a class). Does it still use the same calculation behind the scenes for the Tuple GetHashcode()? – Rick Love Aug 16 '18 at 13:55
  • 1
    @RickLove I'm not sure the mathematics. Tuple.GetHashCode() and ValueTuple.GetHashCode() looks similar. ValueTuple.GetHashCode() calls [HashHelper](https://github.com/dotnet/corefx/blob/master/src/Common/src/System/Numerics/Hashing/HashHelpers.cs). Tuple.GetHashCode() calls [Tuple.CombineHashCodes](https://referencesource.microsoft.com/#mscorlib/system/tuple.cs,49b112811bc359fd). For anonymous type, [How are Equals and GetHashCode implemented on anonymous types?](https://stackoverflow.com/questions/38558795/how-are-equals-and-gethashcode-implemented-on-anonymous-types) – cactuaroid Aug 16 '18 at 15:03
  • I apologize that @Timo has already posted about ValueTuple.GetHashCode() [below](https://stackoverflow.com/questions/263400/what-is-the-best-algorithm-for-an-overridden-system-object-gethashcode/50349672#50349672). – cactuaroid Aug 17 '18 at 14:08
  • If nullable, do we need to check PropA, B, etc for null and if null pass in 0? – ScubaSteve Jun 28 '21 at 18:25
  • How would you do a nullcheck on an optional property here? – KarmaEDV Oct 14 '21 at 09:07
144

Using System.HashCode

If you are using .NET Standard 2.1 or above, you can use the System.HashCode struct. On earlier frameworks it is available from the Microsoft.Bcl.HashCode package. There are two methods of using it:

HashCode.Combine

The Combine method can be used to create a hash code, given up to eight objects.

public override int GetHashCode() => HashCode.Combine(this.object1, this.object2);
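A slightly fuller sketch pairing it with Equals (the Person type here is just an example):

using System;

public sealed class Person : IEquatable<Person>
{
    public string Name { get; }
    public int Age { get; }

    public Person(string name, int age)
    {
        Name = name;
        Age = age;
    }

    public bool Equals(Person other) =>
        other != null && Name == other.Name && Age == other.Age;

    public override bool Equals(object obj) => Equals(obj as Person);

    // HashCode.Combine mixes the arguments' hash codes for us.
    public override int GetHashCode() => HashCode.Combine(Name, Age);
}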

HashCode.Add

The Add method helps you to deal with collections:

public override int GetHashCode()
{
    var hashCode = new HashCode();
    hashCode.Add(this.object1);
    foreach (var item in this.collection)
    {
        hashCode.Add(item);
    }
    return hashCode.ToHashCode();
}

GetHashCode Made Easy

An alternative to System.HashCode that is super easy to use while still being fast. You can read the full blog post 'GetHashCode Made Easy' for more details and comments.

Usage Example

public class SuperHero
{
    public int Age { get; set; }
    public string Name { get; set; }
    public List<string> Powers { get; set; }

    public override int GetHashCode() =>
        HashCode.Of(this.Name).And(this.Age).AndEach(this.Powers);
}

Implementation

public struct HashCode : IEquatable<HashCode>
{
    private const int EmptyCollectionPrimeNumber = 19;
    private readonly int value;

    private HashCode(int value) => this.value = value;

    public static implicit operator int(HashCode hashCode) => hashCode.value;

    public static bool operator ==(HashCode left, HashCode right) => left.Equals(right);

    public static bool operator !=(HashCode left, HashCode right) => !(left == right);

    public static HashCode Of<T>(T item) => new HashCode(GetHashCode(item));

    public static HashCode OfEach<T>(IEnumerable<T> items) =>
        items == null ? new HashCode(0) : new HashCode(GetHashCode(items, 0));

    public HashCode And<T>(T item) => 
        new HashCode(CombineHashCodes(this.value, GetHashCode(item)));

    public HashCode AndEach<T>(IEnumerable<T> items)
    {
        if (items == null)
        {
            return new HashCode(this.value);
        }

        return new HashCode(GetHashCode(items, this.value));
    }

    public bool Equals(HashCode other) => this.value.Equals(other.value);

    public override bool Equals(object obj)
    {
        if (obj is HashCode)
        {
            return this.Equals((HashCode)obj);
        }

        return false;
    }

    public override int GetHashCode() => this.value.GetHashCode();

    private static int CombineHashCodes(int h1, int h2)
    {
        unchecked
        {
            // Code copied from System.Tuple; a good way to combine hashes.
            return ((h1 << 5) + h1) ^ h2;
        }
    }

    private static int GetHashCode<T>(T item) => item?.GetHashCode() ?? 0;

    private static int GetHashCode<T>(IEnumerable<T> items, int startHashCode)
    {
        var temp = startHashCode;

        var enumerator = items.GetEnumerator();
        if (enumerator.MoveNext())
        {
            temp = CombineHashCodes(temp, GetHashCode(enumerator.Current));

            while (enumerator.MoveNext())
            {
                temp = CombineHashCodes(temp, GetHashCode(enumerator.Current));
            }
        }
        else
        {
            temp = CombineHashCodes(temp, EmptyCollectionPrimeNumber);
        }

        return temp;
    }
}

What Makes a Good Algorithm?

Performance

The algorithm that calculates a hash code needs to be fast. A simple algorithm is usually going to be a faster one. One that does not allocate extra memory will also reduce the need for garbage collection, which in turn improves performance.

In C# hash functions specifically, you often use the unchecked keyword, which turns off overflow checking to improve performance.
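As a small illustration (a hypothetical helper; by default C# only throws on overflow when the project enables checked arithmetic, and unchecked guarantees wrapping either way):

static int WrapOnOverflow(int h)
{
    unchecked
    {
        // Multiplying by a large prime routinely overflows Int32;
        // inside 'unchecked' the value simply wraps instead of throwing.
        return (h * 486187739) + 17;
    }
}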

Deterministic

The hashing algorithm needs to be deterministic i.e. given the same input it must always produce the same output.

Reduce Collisions

The algorithm that calculates a hash code needs to keep hash collisions to a minimum. A hash collision is a situation that occurs when two calls to GetHashCode on two different objects produce identical hash codes. Note that collisions are allowed (some have the misconception that they are not) but they should be kept to a minimum.

A lot of hash functions contain magic numbers like 17 or 23. These are special prime numbers which due to their mathematical properties help to reduce hash collisions as compared to using non-prime numbers.

Hash Uniformity

A good hash function should map the expected inputs as evenly as possible over its output range i.e. it should output a wide range of hashes based on its inputs that are evenly spread. It should have hash uniformity.

Prevents DoS

In .NET Core, each time you restart an application you will get different hash codes. This is a security feature to prevent denial-of-service (DoS) attacks. For .NET Framework you should enable this feature by adding the following to your App.config file:

<?xml version ="1.0"?>  
<configuration>  
   <runtime>  
      <UseRandomizedStringHashAlgorithm enabled="1" />  
   </runtime>  
</configuration>

Because of this feature, hash codes should never be used outside of the application domain in which they were created, should never be used as key fields in a collection, and should never be persisted.

Read more about this here.
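A quick way to observe this randomization (on .NET Core / .NET 5+, the program below prints a different value on each run):

using System;

class Program
{
    static void Main()
    {
        // String hash codes are randomized per process on .NET Core,
        // so this value changes between runs and must never be persisted.
        Console.WriteLine("hello".GetHashCode());
    }
}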

Cryptographically Secure?

The algorithm does not have to be a cryptographic hash function, meaning it does not have to satisfy the following conditions:

  • It is infeasible to generate a message that yields a given hash value.
  • It is infeasible to find two different messages with the same hash value.
  • A small change to a message should change the hash value so extensively that the new hash value appears uncorrelated with the old hash value (avalanche effect).
Muhammad Rehan Saeed
  • 4
    This is a very good answer. As an addition, you could consider changing "speed" to "performance" and adding the property of being allocation-free. The built-in `HashCode` type satisfies that too. – Timo Jul 10 '20 at 15:22
  • How does this compare to the `ValueTuple.GetHashCode()` answer recently updated by @ricklove above? – Thiago Silva Feb 18 '21 at 03:10
  • 3
    `HashCode.Combine` is a static method which will not allocate anything, while `ValueTuple` will start by allocating on the stack. – Muhammad Rehan Saeed Feb 18 '21 at 08:35
  • 3
    `HashCode.Of(this.Name).And(this.Age).AndEach(this.Powers)` - that is nice syntax :) – Amos Egel Mar 09 '21 at 08:14
  • `they should never be used as key fields in a collection`, Isn't that the whole point of hash codes though? And the existence of hash tables, hash sets, dictionaries? – maraaaaaaaa Dec 30 '21 at 20:54
  • Very well done, but two points currently not mentioned: First, the collection comparer respects order (so [1,2,3] has a different hash code than [3,2,1]), this can be desired, but not always, so it should be made clear at least in the documentation. Second, the enumerable will be completely iterated and the elements are materialized. Can lead to problems if the source is an endless enumeration or elements are created on the fly. Same here, should be mentioned in the docs. – Oliver Nov 18 '22 at 07:26
111

Here is my hashcode helper.
Its advantage is that it uses generic type arguments and therefore will not cause boxing:

public static class HashHelper
{
    public static int GetHashCode<T1, T2>(T1 arg1, T2 arg2)
    {
         unchecked
         {
             return 31 * arg1.GetHashCode() + arg2.GetHashCode();
         }
    }

    public static int GetHashCode<T1, T2, T3>(T1 arg1, T2 arg2, T3 arg3)
    {
        unchecked
        {
            int hash = arg1.GetHashCode();
            hash = 31 * hash + arg2.GetHashCode();
            return 31 * hash + arg3.GetHashCode();
        }
    }

    public static int GetHashCode<T1, T2, T3, T4>(T1 arg1, T2 arg2, T3 arg3, 
        T4 arg4)
    {
        unchecked
        {
            int hash = arg1.GetHashCode();
            hash = 31 * hash + arg2.GetHashCode();
            hash = 31 * hash + arg3.GetHashCode();
            return 31 * hash + arg4.GetHashCode();
        }
    }

    public static int GetHashCode<T>(T[] list)
    {
        unchecked
        {
            int hash = 0;
            foreach (var item in list)
            {
                hash = 31 * hash + item.GetHashCode();
            }
            return hash;
        }
    }

    public static int GetHashCode<T>(IEnumerable<T> list)
    {
        unchecked
        {
            int hash = 0;
            foreach (var item in list)
            {
                hash = 31 * hash + item.GetHashCode();
            }
            return hash;
        }
    }

    /// <summary>
    /// Gets a hashcode for a collection for which the order of items
    /// does not matter.
    /// So {1, 2, 3} and {3, 2, 1} will get the same hash code.
    /// </summary>
    public static int GetHashCodeForOrderNoMatterCollection<T>(
        IEnumerable<T> list)
    {
        unchecked
        {
            int hash = 0;
            int count = 0;
            foreach (var item in list)
            {
                hash += item.GetHashCode();
                count++;
            }
            return 31 * hash + count.GetHashCode();
        }
    }

    /// <summary>
    /// An alternative way to get a hashcode is to use a fluent 
    /// interface like this:<br />
    /// return 0.CombineHashCode(field1).CombineHashCode(field2).
    ///     CombineHashCode(field3);
    /// </summary>
    public static int CombineHashCode<T>(this int hashCode, T arg)
    {
        unchecked
        {
            return 31 * hashCode + arg.GetHashCode();   
        }
    }
}

It also has an extension method to provide a fluent interface, so you can use it like this:

public override int GetHashCode()
{
    return HashHelper.GetHashCode(Manufacturer, PartN, Quantity);
}

or like this:

public override int GetHashCode()
{
    return 0.CombineHashCode(Manufacturer)
        .CombineHashCode(PartN)
        .CombineHashCode(Quantity);
}
casperOne
nightcoder
  • 5
    No need for `T[]` separately as it is already `IEnumerable` – nawfal Apr 14 '13 at 12:43
  • 5
    You could refactor those methods and restrict the core logic to one function – nawfal Apr 14 '13 at 13:06
  • 14
    Incidentally, 31 is a shift and subtract on the CPU, which is exceedingly fast. – Chui Tey Aug 22 '13 at 23:14
  • 1
    An extension method on int is nasty namespace pollution - @safak-gur 's [answer below](http://stackoverflow.com/a/18613926/42921) nicely side-steps that problem. – Eamon Nerbonne Jun 01 '14 at 19:32
  • 5
    @nightcoder you could use [params](https://msdn.microsoft.com/en-us/library/w5zay9db.aspx). – ANeves Feb 09 '15 at 13:54
  • 6
    @ChuiTey This is something all [Mersenne Primes](http://en.wikipedia.org/wiki/Mersenne_prime) have in common. – Pharap Jun 12 '15 at 03:11
  • shouldn't the `hash` variable start with a non-zero? https://stackoverflow.com/a/113600/9638388 – geekley Jul 20 '18 at 18:11
  • 1
    Just because it's cool, you can also do this with a one-liner: ```source?.Aggregate(0, (current, item) => unchecked(current * 31 + (item?.GetHashCode() ?? 0))) ?? 0;``` – KnorxThieus Mar 09 '19 at 15:54
  • 1
    @ANeves I suggest you don't use `params` if it is meant for wider use (like a public library). `params` involves an array allocation (plus the O(n) cost of populating the array), which is bad for perf-sensitive situations. `params object[]` is doubly bad now that you introduce boxing cost as well for value types. – nawfal May 06 '20 at 15:48
67

I have a Hashing class in a helper library that I use for this purpose.

/// <summary> 
/// This is a simple hashing function from Robert Sedgwick's Hashing in C book.
/// Also, some simple optimizations to the algorithm in order to speed up
/// its hashing process have been added. From: www.partow.net
/// </summary>
/// <param name="input">array of objects, parameters combination that you need
/// to get a unique hash code for them</param>
/// <returns>Hash code</returns>
public static int RSHash(params object[] input)
{
    const int b = 378551;
    int a = 63689;
    int hash = 0;

    // If it overflows then just wrap around
    unchecked
    {
        for (int i = 0; i < input.Length; i++)
        {
            if (input[i] != null)
            {
                hash = hash * a + input[i].GetHashCode();
                a = a * b;
            }
        }
    }

    return hash;
}

Then you can simply use it as:

public override int GetHashCode()
{
    return Hashing.RSHash(_field1, _field2, _field3);
}

I didn't assess its performance, so any feedback is welcome.

Mike Chamberlain
Wahid Shalaly
  • 29
    Well, it will cause boxing, if fields are value types. – nightcoder Apr 04 '10 at 15:39
  • 7
    "can be enhanced later by catching the OverflowException" The whole point of the `unchecked` is to avoid exceptions on overflow which is desired on `GetHashCode`. So it's not incorrect if the value overflows `int` and it does not hurt at all. – Tim Schmelter Feb 24 '14 at 13:06
  • 2
    One issue with this algorithm is that any array full of nulls will always return 0, regardless of its length – Nathan Adams Apr 17 '15 at 12:12
  • 3
    This helper method also allocates a new object[] – James Newton-King Jul 20 '16 at 12:35
  • 2
    As @NathanAdams mentions, the fact that `null` is skipped entirely could give you unexpected results. Instead of skipping them, you should just use some constant value instead of `input[i].GetHashCode()` when `input[i]` is null. – David Schwartz Oct 28 '16 at 19:04
59

Here's my helper class using Jon Skeet's implementation.

public static class HashCode
{
    public const int Start = 17;

    public static int Hash<T>(this int hash, T obj)
    {
        var h = EqualityComparer<T>.Default.GetHashCode(obj);
        return unchecked((hash * 31) + h);
    }
}

Usage:

public override int GetHashCode()
{
    return HashCode.Start
        .Hash(_field1)
        .Hash(_field2)
        .Hash(_field3);
}

If you want to avoid writing an extension method for System.Int32:

public readonly struct HashCode
{
    private readonly int _value;

    public HashCode(int value) => _value = value;

    public static HashCode Start { get; } = new HashCode(17);

    public static implicit operator int(HashCode hash) => hash._value;

    public HashCode Hash<T>(T obj)
    {
        var h = EqualityComparer<T>.Default.GetHashCode(obj);
        return unchecked(new HashCode((_value * 31) + h));
    }

    public override int GetHashCode() => _value;
}

It still avoids any heap allocation and is used exactly the same way:

public override int GetHashCode()
{
    // This time `HashCode.Start` is not an `Int32`, it's a `HashCode` instance.
    // And the result is implicitly converted to `Int32`.
    return HashCode.Start
        .Hash(_field1)
        .Hash(_field2)     
        .Hash(_field3);
}

Edit (May 2018): EqualityComparer<T>.Default getter is now a JIT intrinsic - the pull request is mentioned by Stephen Toub in this blog post.

Şafak Gür
  • 1
    I would change the line with the ternary operator to be: `var h = Equals(obj, default(T)) ? 0 : obj.GetHashCode();` – Bill Barry Sep 05 '14 at 17:12
  • I believe that the ternary operator with `obj != null` will compile to a `box` instruction which will allocate memory if `T` is a value type. Instead you can use `obj.Equals(null)` which will compile to a virtual call of the `Equals` method. – Martin Liversage Sep 13 '14 at 23:00
  • Because `this.hashCode != h`. It wouldn't return the same value. – Şafak Gür Jun 15 '15 at 08:01
  • Sorry, I managed to remove my comment instead of editing it. Is it more beneficial to create a new struct, then change the hashCode to non-readonly and do: "unchecked { this.hashCode ^= h * 397; } return this;" for example? – Erik Karlsson Jun 15 '15 at 08:28
  • Immutability has its benefits ([Why are mutable structs evil?](http://stackoverflow.com/a/441357/704144)). About performance, what I do is pretty cheap since it does not allocate any space in the heap. – Şafak Gür Jun 15 '15 at 10:35
  • There is no boxing if you call it like Hash(1) and not like Hash(myStruct). http://stackoverflow.com/questions/8823239 – user764754 Apr 11 '16 at 18:12
31

In most cases where Equals() compares multiple fields it doesn't really matter if your GetHash() hashes on one field or on many. You just have to make sure that calculating the hash is really cheap (No allocations, please) and fast (No heavy computations and certainly no database connections) and provides a good distribution.

The heavy lifting should be part of the Equals() method; the hash should be a very cheap operation to enable calling Equals() on as few items as possible.

And one final tip: Don't rely on GetHashCode() being stable over multiple application runs. Many .NET types don't guarantee their hash codes to stay the same after a restart, so you should only use the value of GetHashCode() for in-memory data structures.
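As a sketch of that advice (the Document type below is made up): hash only a cheap, well-distributed field and let Equals do the full comparison.

using System;

public sealed class Document
{
    public Guid Id { get; set; }       // cheap and well distributed
    public string Title { get; set; }
    public string Body { get; set; }   // potentially large

    // Cheap hash: one field, no allocations, no heavy work.
    public override int GetHashCode() => Id.GetHashCode();

    // Equals does the heavy lifting; hash-based collections only call it after a hash match.
    public override bool Equals(object obj) =>
        obj is Document other &&
        Id == other.Id &&
        Title == other.Title &&
        Body == other.Body;
}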

Bert Huijben
  • 12
    "In most cases where Equals() compares multiple fields it doesn't really matter if your GetHash() hashes on one field or on many." This is dangerous advice, because for objects which only differ in the un-hashed fields, you will get hash collisions. If this happens frequently, the performance of hash-based collections (HashMap, HashSet etc.) will degrade (up to O(n) in the worst case). – sleske Apr 15 '10 at 15:44
  • 12
    This actually happened in Java: In early versions of the JDK, String.hashCode() only considered the beginning of the string; this led to performance problems if you used Strings as keys in HashMaps which only differed at the end (which is common e.g. for URLs). The algorithm was therefore changed (in JDK 1.2 or 1.3, I believe). – sleske Apr 15 '10 at 15:51
  • 4
    If that one field 'provides a good distribution' (last part of my answer), then one field is enough.. If it **doesn't provide a good distribution**, then (and just then) you need another calculation. (E.g. just use another field that **does** provide a good distribution, or use multiple fields) – Bert Huijben Apr 16 '10 at 09:12
  • I don't think there's a problem with having `GetHashCode` perform memory allocations, *provided that it only does so the first time it's used* (with subsequent invocations simply returning a cached result). The important thing is not that one should go to great lengths to avoid collisions, but rather that one should avoid "systemic" collisions. If a type has two `int` fields `oldX` and `newX` which frequently differ by one, a hash value of `oldX^newX` would assign 90% of such records hash values of 1, 2, 4, or 8. Using `oldX+newX` [unchecked arithmetic] might generate more collisions... – supercat Sep 07 '13 at 21:02
  • 1
    ...than would a more sophisticated function, but a collection of 1,000,000 things that have 500,000 different hash values will do very well if each hash value has two associated things, and very badly if one hash value has 500,001 things and the others have one each. – supercat Sep 07 '13 at 21:04
27

Up until recently my answer would have been very close to Jon Skeet's here. However, I recently started a project which used power-of-two hash tables, that is hash tables where the size of the internal table is 8, 16, 32, etc. There's a good reason for favouring prime-number sizes, but there are some advantages to power-of-two sizes too.

And it pretty much sucked. So after a bit of experimentation and research I started re-hashing my hashes with the following:

public static int ReHash(int source)
{
  unchecked
  {
    ulong c = 0xDEADBEEFDEADBEEF + (ulong)source;
    ulong d = 0xE2ADBEEFDEADBEEF ^ c;
    ulong a = d += c = c << 15 | c >> -15;
    ulong b = a += d = d << 52 | d >> -52;
    c ^= b += a = a << 26 | a >> -26;
    d ^= c += b = b << 51 | b >> -51;
    a ^= d += c = c << 28 | c >> -28;
    b ^= a += d = d << 9 | d >> -9;
    c ^= b += a = a << 47 | a >> -47;
    d ^= c += b << 54 | b >> -54;
    a ^= d += c << 32 | c >> 32;
    a += d << 25 | d >> -25;
    return (int)(a >> 1);
  }
}

And then my power-of-two hash table didn't suck any more.
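(For illustration only, here's a sketch of how such a post-mix might be wired into a GetHashCode override; the fields are placeholders:)

public override int GetHashCode()
{
    unchecked
    {
        int hash = 17;
        hash = hash * 31 + field1.GetHashCode();
        hash = hash * 31 + field2.GetHashCode();
        return ReHash(hash);   // post-mix the combined value before the table sees it
    }
}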

This disturbed me though, because the above shouldn't work. Or more precisely, it shouldn't work unless the original GetHashCode() was poor in a very particular way.

Re-mixing a hashcode can't improve a great hashcode, because the only possible effect is that we introduce a few more collisions.

Re-mixing a hash code can't improve a terrible hash code, because the only possible effect is we change e.g. a large number of collisions on value 53 to a large number of collisions on value 183,487,291.

Re-mixing a hash code can only improve a hash code that did at least fairly well in avoiding absolute collisions throughout its range (2^32 possible values) but badly at avoiding collisions when modulo'd down for actual use in a hash table. While the simpler modulo of a power-of-two table made this more apparent, it was also having a negative effect with the more common prime-number tables; it just wasn't as obvious (the extra work in rehashing would outweigh the benefit, but the benefit would still be there).
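To make that concrete, here is a tiny illustration of my own (not from the project above): a power-of-two table typically computes the bucket index as hash & (size - 1), so only the low bits of the hash matter, and hashes that differ only in their high bits all collide even though the full 32-bit values are distinct.

using System;

int size = 16;                              // power-of-two table size
int[] hashes = { 0x10, 0x20, 0x30, 0x40 }; // distinct hashes, identical low 4 bits

foreach (int h in hashes)
    Console.WriteLine(h & (size - 1));      // prints 0 four times: all in one bucket

foreach (int h in hashes)
    Console.WriteLine(h % 17);              // a prime-sized table spreads them out: 16, 15, 14, 13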

Edit: I was also using open-addressing, which would also have increased the sensitivity to collision, perhaps more so than the fact it was power-of-two.

And well, it was disturbing how much the string.GetHashCode() implementations in .NET could be improved this way (on the order of tests running about 20-30 times faster due to fewer collisions) and more disturbing how much my own hash codes could be improved (much more than that).

All the GetHashCode() implementations I'd coded in the past, and indeed used as the basis of answers on this site, were much worse than I'd thought. Much of the time it was "good enough" for many of the uses, but I wanted something better.

So I put that project to one side (it was a pet project anyway) and started looking at how to produce a good, well-distributed hash code in .NET quickly.

In the end I settled on porting SpookyHash to .NET. Indeed the code above is a fast-path version of using SpookyHash to produce a 32-bit output from a 32-bit input.

Now, SpookyHash is not a nice, quick-to-remember piece of code. My port of it is even less so, because I hand-inlined a lot of it for better speed*. But that's what code reuse is for.

Then I put that project to one side, because just as the original project had produced the question of how to produce a better hash code, so that project produced the question of how to produce a better .NET memcpy.

Then I came back, and produced a lot of overloads to easily feed just about all of the native types (except decimal†) into a hash code.

It's fast, for which Bob Jenkins deserves most of the credit, because the original code I ported from is faster still, especially on the 64-bit machines the algorithm is optimised for‡.

The full code can be seen at https://bitbucket.org/JonHanna/spookilysharp/src but consider that the code above is a simplified version of it.

However, since it's now already written, one can make use of it more easily:

public override int GetHashCode()
{
  var hash = new SpookyHash();
  hash.Update(field1);
  hash.Update(field2);
  hash.Update(field3);
  return hash.Final().GetHashCode();
}

It also takes seed values, so if you need to deal with untrusted input and want to protect against Hash DoS attacks you can set a seed based on uptime or similar, and make the results unpredictable by attackers:

private static long hashSeed0 = Environment.TickCount;
private static long hashSeed1 = DateTime.Now.Ticks;
public override int GetHashCode()
{
  //produce different hashes every time this application is restarted
  //but remain consistent in each run, so attackers have a harder time
  //DoSing the hash tables.
  var hash = new SpookyHash(hashSeed0, hashSeed1);
  hash.Update(field1);
  hash.Update(field2);
  hash.Update(field3);
  return hash.Final().GetHashCode();
}

*A big surprise in this is that hand-inlining a rotation method that returned (x << n) | (x >> -n) improved things. I would have been sure that the jitter would have inlined that for me, but profiling showed otherwise.

†decimal isn't native from the .NET perspective, though it is from the C# perspective. The problem with it is that its own GetHashCode() treats precision as significant, while its own Equals() does not. Both are valid choices, but not mixed like that. In implementing your own version, you need to choose to do one or the other, but I can't know which you'd want.

‡By way of comparison: if used on a string, SpookyHash on 64 bits is considerably faster than string.GetHashCode() on 32 bits, which is slightly faster than string.GetHashCode() on 64 bits, which is considerably faster than SpookyHash on 32 bits, though that is still fast enough to be a reasonable choice.

Glenn Slayden
  • 17,543
  • 3
  • 114
  • 108
Jon Hanna
  • 110,372
  • 10
  • 146
  • 251
  • When combining multiple hash values into one, I tend to use `long` values for the intermediate results, and then munge the final result down to an `int`. Does that seem like a good idea? My concern is that one uses e.g. hash=(hash*31)+nextField, then pairs of matching values will only affect the upper 27 bits of the hash. Letting the calculation extend to a `long` and wrapping stuff in would minimize that danger. – supercat Apr 24 '14 at 21:31
  • @supercat it depends on the distribution of your final munging. The SpookilySharp library would ensure that the distribution was good, ideally (because it won't need object creation) by passing a pointer to a blittable type, or passing one of the enumerables it handles directly, but if you don't already have blittable data or a suitable enumeration, then calling `.Update()` with the multiple values as per the answer above will do the trick. – Jon Hanna Apr 24 '14 at 22:48
  • @JonHanna would you be willing to be more precise with the problematic behavior you encountered? I'm trying to implement a library that makes implementing value objects trivial ([ValueUtils](https://github.com/EamonNerbonne/ValueUtils)) and I'd love a testset demonstrating poor hash miscibility in power-of-two hashtables. – Eamon Nerbonne Jun 01 '14 at 14:19
  • @EamonNerbonne I don't really have anything more precise than "overall time was slower that way". As I added in an edit, the fact that I was using open-addressing may have been more important than the power-of-two factor. I do plan to do some test cases on a particular project where I'll be comparing a few different approaches, so I may have a better answer for you after that, though that's not a high-priority (a personal project with no pressing need, so I'll get to it when I get to it...) – Jon Hanna Jun 02 '14 at 14:01
  • @JonHanna: yeah I know how the personal project schedule goes - good luck! In any case, I see I didn't phrase that last comment well: I meant to ask for the problematic input, and not necessarily the details of the problems that resulted. I'd love to use that as a test set (or inspiration for a test set). In any case - good luck with your pet project :-). – Eamon Nerbonne Jun 02 '14 at 15:23
  • I'd bet that your `ReHash` is a big overkill. I guess it works well, but it may be even slower than cryptographic hashes, which are (sort of) proven to work perfectly. Java also uses power-of-two sized tables and there used to be [a rather complicated re-hashing](http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/HashMap.java#HashMap.hash%28int%29). [It got simplified](http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8-b132/java/util/HashMap.java#HashMap.hash%28java.lang.Object%29) since tree nodes for collisions were introduced. – maaartinus Nov 11 '17 at 22:01
  • @maaartinus in terms of speed and distribution, it's well demonstrated. I'm now of the opinion that it's more trouble than it's worth for small values. I'd still use the fuller implementation of SpookyHash when it comes to hashing very large values like file contents. – Jon Hanna Nov 11 '17 at 22:56
25

As of https://github.com/dotnet/coreclr/pull/14863, there is a new way to generate hash codes that is super simple! Just write

public override int GetHashCode()
    => HashCode.Combine(field1, field2, field3);

This will generate a quality hash code without you having to worry about the implementation details.
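If you have more fields than the Combine overloads accept (they take up to eight arguments), or you need to hash fields conditionally or in a loop, the same type also supports an incremental pattern. A minimal sketch, assuming fields named field1 to field3:

public override int GetHashCode()
{
    var hash = new HashCode();
    hash.Add(field1);
    hash.Add(field2);
    hash.Add(field3); // keep calling Add for as many fields as needed
    return hash.ToHashCode();
}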

James Ko
  • 32,215
  • 30
  • 128
  • 239
  • That looks like a sweet addition... any way to know what version of .NET Core that'll ship in? – Dan J Dec 14 '17 at 00:37
  • 1
    @DanJ What a happy coincidence, the `HashCode` changes for corefx were merged just a couple of hours before your comment :) The type is slated to ship in .NET Core 2.1. – James Ko Dec 14 '17 at 00:41
  • That is awesome - and quite the turnaround time. Upvoted. :) – Dan J Dec 14 '17 at 00:48
  • @DanJ Even better news-- it should be available right now on the nightly builds of CoreFX hosted on the dotnet-core MyGet feed. – James Ko Dec 15 '17 at 23:44
  • Sweet - that doesn't help me at work, since we're not quite *that* bleeding-edge, but good to know. Cheers! – Dan J Dec 17 '17 at 22:18
  • Here is a drop-in polyfill package that you can use for everything .NET 4.0+ (System.HashCode included): https://www.nuget.org/packages/Gapotchenko.FX – ogggre Mar 30 '19 at 12:49
13

This is a good one:

/// <summary>
/// Helper class for generating hash codes suitable 
/// for use in hashing algorithms and data structures like a hash table. 
/// </summary>
public static class HashCodeHelper
{
    private static int GetHashCodeInternal(int key1, int key2)
    {
        unchecked
        {
           var num = 0x7e53a269;
           num = (-1521134295 * num) + key1;
           num += (num << 10);
           num ^= (num >> 6);

           num = ((-1521134295 * num) + key2);
           num += (num << 10);
           num ^= (num >> 6);

           return num;
        }
    }

    /// <summary>
    /// Returns a hash code for the specified objects
    /// </summary>
    /// <param name="arr">An array of objects used for generating the 
    /// hash code.</param>
    /// <returns>
    /// A hash code, suitable for use in hashing algorithms and data 
    /// structures like a hash table. 
    /// </returns>
    public static int GetHashCode(params object[] arr)
    {
        int hash = 0;
        foreach (var item in arr)
            hash = GetHashCodeInternal(hash, item.GetHashCode());
        return hash;
    }

    /// <summary>
    /// Returns a hash code for the specified objects
    /// </summary>
    /// <param name="obj1">The first object.</param>
    /// <param name="obj2">The second object.</param>
    /// <param name="obj3">The third object.</param>
    /// <param name="obj4">The fourth object.</param>
    /// <returns>
    /// A hash code, suitable for use in hashing algorithms and
    /// data structures like a hash table.
    /// </returns>
    public static int GetHashCode<T1, T2, T3, T4>(T1 obj1, T2 obj2, T3 obj3,
        T4 obj4)
    {
        return GetHashCode(obj1, GetHashCode(obj2, obj3, obj4));
    }

    /// <summary>
    /// Returns a hash code for the specified objects
    /// </summary>
    /// <param name="obj1">The first object.</param>
    /// <param name="obj2">The second object.</param>
    /// <param name="obj3">The third object.</param>
    /// <returns>
    /// A hash code, suitable for use in hashing algorithms and data 
    /// structures like a hash table. 
    /// </returns>
    public static int GetHashCode<T1, T2, T3>(T1 obj1, T2 obj2, T3 obj3)
    {
        return GetHashCode(obj1, GetHashCode(obj2, obj3));
    }

    /// <summary>
    /// Returns a hash code for the specified objects
    /// </summary>
    /// <param name="obj1">The first object.</param>
    /// <param name="obj2">The second object.</param>
    /// <returns>
    /// A hash code, suitable for use in hashing algorithms and data 
    /// structures like a hash table. 
    /// </returns>
    public static int GetHashCode<T1, T2>(T1 obj1, T2 obj2)
    {
        return GetHashCodeInternal(obj1.GetHashCode(), obj2.GetHashCode());
    }
}

And here is how to use it:

private struct Key
{
    private Type _type;
    private string _field;

    public Type Type { get { return _type; } }
    public string Field { get { return _field; } }

    public Key(Type type, string field)
    {
        _type = type;
        _field = field;
    }

    public override int GetHashCode()
    {
        return HashCodeHelper.GetHashCode(_field, _type);
    }

    public override bool Equals(object obj)
    {
        if (!(obj is Key))
            return false;
        var tf = (Key)obj;
        return tf._field.Equals(_field) && tf._type.Equals(_type);
    }
}
Magnus
  • 45,362
  • 8
  • 80
  • 118
  • 1
    How are the Keys determined? GetHashCode() doesn't take any parameters, so it needs to call this one with two Keys that need to be determined somehow. Sorry, without further explanation this only looks clever, but not that good. – Michael Stum Oct 07 '10 at 17:28
  • And why do you need the generic overloads? The type is not important (and not used in your code) since *all* objects have a `GetHashCode()` method, so you can always use the method with the `params` array parameter. Or am I missing something here? – gehho Oct 08 '10 at 09:31
  • It's about performance: avoiding the loop when there are <= 4 fields. But I guess the generics could be skipped and object used instead. – Magnus Oct 08 '10 at 09:57
  • 4
    When you'd use object instead of generics you'd get boxing and memory allocations, which you don't want in GetHashCode. So generics are the way to go. – CodesInChaos Nov 21 '10 at 22:26
  • 1
    The trailing shift/xor steps (`h += (h << 10); h ^= (h >> 6); h += (h << 3); h ^= (h >> 11); h += (h << 15);`) have a code smell: they do not depend on any of the input and look awfully redundant to me. – sehe Apr 22 '11 at 19:54
  • @nawfal what speed considerations do you have? – Magnus Dec 24 '12 at 11:29
  • @Magnus nothing in particular other than the general rule that hashing must be fast. This can't be as fast as I would love it to be. But as I said this gives better distribution of values which may be suitable for some cases. – nawfal Dec 25 '12 at 10:41
  • @nawfal Running this 100 million times takes about 390 ms. Running the solution suggested by Jon Skeet 100 million times takes about 320 ms so it is not a huge difference. – Magnus Dec 25 '12 at 10:59
  • 1
    @Magnus yes right, I'll delete my original comment. Just a little note that this may not be as fast as some other solutions here, but as you say shouldn't matter. The distribution is great, better than most solutions here, so +1 from me! :) – nawfal Dec 25 '12 at 11:28
  • How does this compare in quality (distribution) and performance to simply using a `long` intermediate, with multiplication of each input by a large prime? E.g. for two values, something like this one liner: `return ((long)v1 * 805306457 + (long)v2 * 189783887).GetHashCode();` [The primes are chosen to avoid numeric overflow of long, in a checked environment, and to tend to set different bits.] – ToolmakerSteve Mar 01 '18 at 01:42
11

Here is another fluent implementation of the algorithm posted above by Jon Skeet, but which includes no allocations or boxing operations:

public static class Hash
{
    public const int Base = 17;

    public static int HashObject(this int hash, object obj)
    {
        unchecked { return hash * 23 + (obj == null ? 0 : obj.GetHashCode()); }
    }

    public static int HashValue<T>(this int hash, T value)
        where T : struct
    {
        unchecked { return hash * 23 + value.GetHashCode(); }
    }
}

Usage:

public class MyType<T>
{
    public string Name { get; set; }

    public string Description { get; set; }

    public int Value { get; set; }

    public IEnumerable<T> Children { get; set; }

    public override int GetHashCode()
    {
        return Hash.Base
            .HashObject(this.Name)
            .HashObject(this.Description)
            .HashValue(this.Value)
            .HashObject(this.Children);
    }
}

The compiler will ensure HashValue is not called with a class due to the generic type constraint. But there is no compiler support for HashObject since adding a generic argument also adds a boxing operation.

Community
  • 1
  • 1
Scott Wegner
  • 7,263
  • 2
  • 39
  • 55
8

Here is my simplistic approach. I am using the classic builder pattern for this. It is type-safe (no boxing/unboxing) and also compatible with .NET 2.0 (no extension methods etc.).

It is used like this:

public override int GetHashCode()
{
    HashBuilder b = new HashBuilder();
    b.AddItems(this.member1, this.member2, this.member3);
    return b.Result;
} 

And here is the actual builder class:

internal class HashBuilder
{
    private const int Prime1 = 17;
    private const int Prime2 = 23;
    private int result = Prime1;

    public HashBuilder()
    {
    }

    public HashBuilder(int startHash)
    {
        this.result = startHash;
    }

    public int Result
    {
        get
        {
            return this.result;
        }
    }

    public void AddItem<T>(T item)
    {
        unchecked
        {
            this.result = this.result * Prime2 + item.GetHashCode();
        }
    }

    public void AddItems<T1, T2>(T1 item1, T2 item2)
    {
        this.AddItem(item1);
        this.AddItem(item2);
    }

    public void AddItems<T1, T2, T3>(T1 item1, T2 item2, T3 item3)
    {
        this.AddItem(item1);
        this.AddItem(item2);
        this.AddItem(item3);
    }

    public void AddItems<T1, T2, T3, T4>(T1 item1, T2 item2, T3 item3, 
        T4 item4)
    {
        this.AddItem(item1);
        this.AddItem(item2);
        this.AddItem(item3);
        this.AddItem(item4);
    }

    public void AddItems<T1, T2, T3, T4, T5>(T1 item1, T2 item2, T3 item3, 
        T4 item4, T5 item5)
    {
        this.AddItem(item1);
        this.AddItem(item2);
        this.AddItem(item3);
        this.AddItem(item4);
        this.AddItem(item5);
    }        

    public void AddItems<T>(params T[] items)
    {
        foreach (T item in items)
        {
            this.AddItem(item);
        }
    }
}
bitbonk
  • 48,890
  • 37
  • 186
  • 278
  • You can avoid the object creation inside the GetHashCode function as in Magnus's answer. Just call the damn static hash functions (who cares about the starter hash). Also, you could use the `AddItems(params T[] items)` method more often in the helper class (than calling `AddItem(T)` each time). – nawfal Apr 14 '13 at 12:52
  • And what benefit do you find doing `this.result * Prime2 * item.GetHashCode()` when often used is `this.result * Prime2 + item.GetHashCode()`? – nawfal Apr 14 '13 at 12:54
  • I can't use `AddItems(params T[] items)` more often because `typeof(T1) != typeof(T2)` etc. – bitbonk Apr 15 '13 at 06:25
6

ReSharper users can generate GetHashCode, Equals, and others with ReSharper -> Edit -> Generate Code -> Equality Members.

// ReSharper's GetHashCode looks like this
public override int GetHashCode() {
    unchecked {
        int hashCode = Id;
        hashCode = (hashCode * 397) ^ IntMember;
        hashCode = (hashCode * 397) ^ OtherIntMember;
        hashCode = (hashCode * 397) ^ (RefMember != null ? RefMember.GetHashCode() : 0);
        // ...
        return hashCode;
    }
}
Charles Burns
  • 10,310
  • 7
  • 64
  • 81
6

If we have no more than 8 properties (hopefully), here is another alternative.

ValueTuple is a struct and appears to have a solid GetHashCode implementation.

That means we could simply do this:

// Yay, no allocations and no custom implementations!
public override int GetHashCode() => (this.PropA, this.PropB).GetHashCode();

Let's take a look at .NET Core's current implementation for ValueTuple's GetHashCode.

This is from ValueTuple:

    internal static int CombineHashCodes(int h1, int h2)
    {
        return HashHelpers.Combine(HashHelpers.Combine(HashHelpers.RandomSeed, h1), h2);
    }

    internal static int CombineHashCodes(int h1, int h2, int h3)
    {
        return HashHelpers.Combine(CombineHashCodes(h1, h2), h3);
    }

And this is from HashHelper:

    public static readonly int RandomSeed = Guid.NewGuid().GetHashCode();

    public static int Combine(int h1, int h2)
    {
        unchecked
        {
            // RyuJIT optimizes this to use the ROL instruction
            // Related GitHub pull request: dotnet/coreclr#1830
            uint rol5 = ((uint)h1 << 5) | ((uint)h1 >> 27);
            return ((int)rol5 + h1) ^ h2;
        }
    }

In English:

  • Left rotate (circular shift) h1 by 5 positions.
  • Add the result and h1 together.
  • XOR the result with h2.
  • Start by performing the above operation on { static random seed, h1 }.
  • For each further item, perform the operation on the previous result and the next item (e.g. h2).

It would be nice to know more about the properties of this ROL-5 hash code algorithm.

Regrettably, deferring to ValueTuple for our own GetHashCode may not be as fast as we would like and expect. This comment in a related discussion illustrates that directly calling HashHelpers.Combine is more performant. On the flip side, that one is internal, so we'd have to copy the code, sacrificing much of what we had gained here. Also, we'd be responsible for remembering to first Combine with the random seed. I don't know what the consequences are if we skip that step.
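As a side note (my own addition, not part of the ValueTuple internals above), the same tuple trick can keep Equals consistent with the hash code. A minimal sketch, assuming the containing class is called MyType and exposes the PropA/PropB properties used above:

public override bool Equals(object obj) =>
    obj is MyType other &&
    (this.PropA, this.PropB).Equals((other.PropA, other.PropB));

The tuple's Equals compares the components with EqualityComparer<T>.Default, so it stays in line with the tuple-based GetHashCode.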

Timo
  • 7,992
  • 4
  • 49
  • 67
  • 1
    Assuming `h1 >> 27` is 0 to ignore it, `h1 << 5` equals `h1 * 32` therefore it is same as `h1 * 33 ^ h2`. According to [this page](http://eternallyconfuzzled.com/tuts/algorithms/jsw_tut_hashing.aspx), it is called "Modified Bernstein". – cactuaroid Aug 17 '18 at 14:28
4

Most of my work is done with database connectivity which means that my classes all have a unique identifier from the database. I always use the ID from the database to generate the hashcode.

// Unique ID from database
private int _id;

...    
{
  return _id.GetHashCode();
}
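If instances of different types can end up in the same hash-based collection, one way to reduce the cross-type collisions this can cause is to fold the runtime type into the hash as well. A sketch of my own, not part of the original approach:

public override int GetHashCode()
{
    unchecked
    {
        // Combine the CLR type with the database ID so that e.g. Person(1) and Account(1) differ.
        return (GetType().GetHashCode() * 31) ^ _id.GetHashCode();
    }
}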
Mark G
  • 368
  • 1
  • 6
  • 1
    That means that if you have objects Person and Account and they both have an ID = 1, they will have the same hash code. And that is not ok. – pero Mar 22 '10 at 15:28
  • 18
    Actually the comment above is incorrect. There will always be the possibility of hash-code collisions (a hash code only locates the bucket, not the individual object). So such an implementation - for a hashcode containing mixed objects - would lead to a lot of collisions, which is undesirable, but it would be absolutely fine if you only ever had objects of a single type in your hashtables. Also it doesn't distribute evenly, however neither does the base implementation on system.object, so I wouldn't worry about it too much... – piers7 Mar 29 '10 at 02:14
  • 3
    The hash code can just be the id, since the id is an integer. There's no need to call GetHashCode on an integer (it's an identity function) – Darrel Lee Nov 23 '12 at 19:18
  • 3
  • @DarrelLee but tomorrow his _id could be a Guid. It's a good coding practice to do `_id.GetHashCode` as the intent is clear. – nawfal Apr 14 '13 at 12:57
  • @DarrelLee, it is not a good option because sequential IDs from the database don't provide a good distribution – Trident D'Gao Jun 28 '13 at 20:04
  • 3
    @1224 depending on use patterns it can be horrible for the reason you give, but it can also be great; if you've a sequence of such numbers with no holes, then you've a perfect hash, better than any algorithm can produce. If you know that's the case you can even count on it and skip the equality check. – Jon Hanna Jan 14 '14 at 18:29
3

Pretty much similar to nightcoder's solution except it's easier to raise primes if you want to.

PS: This is one of those times where you puke a little in your mouth, knowing that this could be refactored into one method with 9 defaults, but it would be slower, so you just close your eyes and try to forget about it.

/// <summary>
/// Try not to look at the source code. It works. Just rely on it.
/// </summary>
public static class HashHelper
{
    private const int PrimeOne = 17;
    private const int PrimeTwo = 23;

    public static int GetHashCode<T1, T2, T3, T4, T5, T6, T7, T8, T9, T10>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6, T7 arg7, T8 arg8, T9 arg9, T10 arg10)
    {
        unchecked
        {
            int hash = PrimeOne;
            hash = hash * PrimeTwo + arg1.GetHashCode();
            hash = hash * PrimeTwo + arg2.GetHashCode();
            hash = hash * PrimeTwo + arg3.GetHashCode();
            hash = hash * PrimeTwo + arg4.GetHashCode();
            hash = hash * PrimeTwo + arg5.GetHashCode();
            hash = hash * PrimeTwo + arg6.GetHashCode();
            hash = hash * PrimeTwo + arg7.GetHashCode();
            hash = hash * PrimeTwo + arg8.GetHashCode();
            hash = hash * PrimeTwo + arg9.GetHashCode();
            hash = hash * PrimeTwo + arg10.GetHashCode();

            return hash;
        }
    }

    public static int GetHashCode<T1, T2, T3, T4, T5, T6, T7, T8, T9>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6, T7 arg7, T8 arg8, T9 arg9)
    {
        unchecked
        {
            int hash = PrimeOne;
            hash = hash * PrimeTwo + arg1.GetHashCode();
            hash = hash * PrimeTwo + arg2.GetHashCode();
            hash = hash * PrimeTwo + arg3.GetHashCode();
            hash = hash * PrimeTwo + arg4.GetHashCode();
            hash = hash * PrimeTwo + arg5.GetHashCode();
            hash = hash * PrimeTwo + arg6.GetHashCode();
            hash = hash * PrimeTwo + arg7.GetHashCode();
            hash = hash * PrimeTwo + arg8.GetHashCode();
            hash = hash * PrimeTwo + arg9.GetHashCode();

            return hash;
        }
    }

    public static int GetHashCode<T1, T2, T3, T4, T5, T6, T7, T8>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6, T7 arg7, T8 arg8)
    {
        unchecked
        {
            int hash = PrimeOne;
            hash = hash * PrimeTwo + arg1.GetHashCode();
            hash = hash * PrimeTwo + arg2.GetHashCode();
            hash = hash * PrimeTwo + arg3.GetHashCode();
            hash = hash * PrimeTwo + arg4.GetHashCode();
            hash = hash * PrimeTwo + arg5.GetHashCode();
            hash = hash * PrimeTwo + arg6.GetHashCode();
            hash = hash * PrimeTwo + arg7.GetHashCode();
            hash = hash * PrimeTwo + arg8.GetHashCode();

            return hash;
        }
    }

    public static int GetHashCode<T1, T2, T3, T4, T5, T6, T7>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6, T7 arg7)
    {
        unchecked
        {
            int hash = PrimeOne;
            hash = hash * PrimeTwo + arg1.GetHashCode();
            hash = hash * PrimeTwo + arg2.GetHashCode();
            hash = hash * PrimeTwo + arg3.GetHashCode();
            hash = hash * PrimeTwo + arg4.GetHashCode();
            hash = hash * PrimeTwo + arg5.GetHashCode();
            hash = hash * PrimeTwo + arg6.GetHashCode();
            hash = hash * PrimeTwo + arg7.GetHashCode();

            return hash;
        }
    }

    public static int GetHashCode<T1, T2, T3, T4, T5, T6>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6)
    {
        unchecked
        {
            int hash = PrimeOne;
            hash = hash * PrimeTwo + arg1.GetHashCode();
            hash = hash * PrimeTwo + arg2.GetHashCode();
            hash = hash * PrimeTwo + arg3.GetHashCode();
            hash = hash * PrimeTwo + arg4.GetHashCode();
            hash = hash * PrimeTwo + arg5.GetHashCode();
            hash = hash * PrimeTwo + arg6.GetHashCode();

            return hash;
        }
    }

    public static int GetHashCode<T1, T2, T3, T4, T5>(T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5)
    {
        unchecked
        {
            int hash = PrimeOne;
            hash = hash * PrimeTwo + arg1.GetHashCode();
            hash = hash * PrimeTwo + arg2.GetHashCode();
            hash = hash * PrimeTwo + arg3.GetHashCode();
            hash = hash * PrimeTwo + arg4.GetHashCode();
            hash = hash * PrimeTwo + arg5.GetHashCode();

            return hash;
        }
    }

    public static int GetHashCode<T1, T2, T3, T4>(T1 arg1, T2 arg2, T3 arg3, T4 arg4)
    {
        unchecked
        {
            int hash = PrimeOne;
            hash = hash * PrimeTwo + arg1.GetHashCode();
            hash = hash * PrimeTwo + arg2.GetHashCode();
            hash = hash * PrimeTwo + arg3.GetHashCode();
            hash = hash * PrimeTwo + arg4.GetHashCode();

            return hash;
        }
    }

    public static int GetHashCode<T1, T2, T3>(T1 arg1, T2 arg2, T3 arg3)
    {
        unchecked
        {
            int hash = PrimeOne;
            hash = hash * PrimeTwo + arg1.GetHashCode();
            hash = hash * PrimeTwo + arg2.GetHashCode();
            hash = hash * PrimeTwo + arg3.GetHashCode();

            return hash;
        }
    }

    public static int GetHashCode<T1, T2>(T1 arg1, T2 arg2)
    {
        unchecked
        {
            int hash = PrimeOne;
            hash = hash * PrimeTwo + arg1.GetHashCode();
            hash = hash * PrimeTwo + arg2.GetHashCode();

            return hash;
        }
    }
}
Dbl
  • 5,634
  • 3
  • 41
  • 66
2

Microsoft suggests several ways of hashing...

//for classes that contain a single int value
return this.value;

//for classes that contain multiple int value
return x ^ y;

//for classes that contain a single number bigger than int
return ((int)value ^ (int)(value >> 32)); 

//for classes that contain class instance fields which inherit from object
return obj1.GetHashCode();

//for classes that contain multiple class instance fields which inherit from object
return obj1.GetHashCode() ^ obj2.GetHashCode() ^ obj3.GetHashCode(); 

I can guess that for multiple big ints you can use this:

int a=((int)value1 ^ (int)(value1 >> 32));
int b=((int)value2 ^ (int)(value2 >> 32));
int c=((int)value3 ^ (int)(value3 >> 32));
return a ^ b ^ c;

And the same goes for multiple types: everything is converted first to an int using GetHashCode(), then the int values are XORed together and the result is your hash.
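One caveat worth illustrating with a quick example of my own (see also the comments below): plain XOR is symmetric and self-cancelling, so objects whose fields are swapped, or whose fields are equal to each other, collide systematically.

using System;

int x = 3, y = 7;
Console.WriteLine((x ^ y) == (y ^ x)); // True: swapping the two fields gives the same hash
Console.WriteLine(5 ^ 5);              // 0: any object whose two fields are equal hashes to 0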

For those who use a hash as an ID (I mean a unique value): a hash is naturally limited to a fixed number of bits (32 bits for GetHashCode; even MD5 is only 128 bits), so collisions are unavoidable.

Different inputs can map to the same hash value, so don't use it as an identifier. (Maybe some day I am going to use your component.)

Sebastian Hofmann
  • 1,440
  • 6
  • 15
  • 21
Hassan Faghihi
  • 1,888
  • 1
  • 37
  • 55
  • 8
    Xoring integers to make a hashcode is a well-known antipattern that tends to result in a particularly high number of collisions with real-world values. – Jon Hanna Jan 14 '14 at 09:36
  • Everyone here uses integers, and there has never been any guarantee that hashes will be unique; a hash just tries to vary enough that few collisions happen. – Hassan Faghihi Sep 16 '15 at 05:59
  • Yes, but your second and fifth don't try to avoid collisions. – Jon Hanna Sep 16 '15 at 08:44
  • Not sure which thread it was... but it did the same, https://msdn.microsoft.com/en-us/library/system.object.gethashcode(v=vs.110).aspx – Hassan Faghihi Sep 19 '15 at 12:37
  • 1
    Yes, that antipattern is quite common. – Jon Hanna Sep 19 '15 at 14:06
  • That's why I rely on that, but thanks for clearing things up... the other thing is: does the other pattern take less calculation time? You know, the collision vs. calculation time trade-off also matters. – Hassan Faghihi Sep 20 '15 at 13:34
  • 2
    There's a balance to reach. Use a really good hash code like Spookyhash and you'll get much, much better collision avoidance but it will have much more calculation time than any of these (but when it comes to hashing very large amounts of data, Spookyhash is extremely quick). A simple shift on one of the values before xoring is only marginal extra cost for a good reduction in collision. Prime-number multiplication increasing both time and quality again. Which is better between shift or mult is hence debatable. Plain xor though very often has a lot of collisions on real data and is best avoided – Jon Hanna Sep 20 '15 at 16:57
1

This is a static helper class that implements Josh Bloch's approach, providing explicit overloads to "prevent" boxing and also to implement the hash specifically for the long primitives.

You can pass a string comparison that matches your equals implementation.

Because the Hash output is always an int, you can just chain Hash calls.

using System;
using System.Collections;
using System.Collections.Generic;
using System.Reflection;
using System.Runtime.CompilerServices;


namespace Sc.Util.System
{
    /// <summary>
    /// Static methods that allow easy implementation of hashCode. Example usage:
    /// <code>
    /// public override int GetHashCode()
    ///     => HashCodeHelper.Seed
    ///         .Hash(primitiveField)
    ///         .Hash(objectField)
    ///         .Hash(iEnumerableField);
    /// </code>
    /// </summary>
    public static class HashCodeHelper
    {
        /// <summary>
        /// An initial value for a hashCode, to which is added contributions from fields.
        /// Using a non-zero value decreases collisions of hashCode values.
        /// </summary>
        public const int Seed = 23;

        private const int oddPrimeNumber = 37;


        /// <summary>
        /// Rotates the seed against a prime number.
        /// </summary>
        /// <param name="aSeed">The hash's first term.</param>
        /// <returns>The new hash code.</returns>
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        private static int rotateFirstTerm(int aSeed)
        {
            unchecked {
                return HashCodeHelper.oddPrimeNumber * aSeed;
            }
        }


        /// <summary>
        /// Contributes a boolean to the developing HashCode seed.
        /// </summary>
        /// <param name="aSeed">The developing HashCode value or seed.</param>
        /// <param name="aBoolean">The value to contribute.</param>
        /// <returns>The new hash code.</returns>
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static int Hash(this int aSeed, bool aBoolean)
        {
            unchecked {
                return HashCodeHelper.rotateFirstTerm(aSeed)
                        + (aBoolean
                                ? 1
                                : 0);
            }
        }

        /// <summary>
        /// Contributes a char to the developing HashCode seed.
        /// </summary>
        /// <param name="aSeed">The developing HashCode value or seed.</param>
        /// <param name="aChar">The value to contribute.</param>
        /// <returns>The new hash code.</returns>
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static int Hash(this int aSeed, char aChar)
        {
            unchecked {
                return HashCodeHelper.rotateFirstTerm(aSeed)
                        + aChar;
            }
        }

        /// <summary>
        /// Contributes an int to the developing HashCode seed.
        /// Note that byte and short are handled by this method, through implicit conversion.
        /// </summary>
        /// <param name="aSeed">The developing HashCode value or seed.</param>
        /// <param name="aInt">The value to contribute.</param>
        /// <returns>The new hash code.</returns>
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static int Hash(this int aSeed, int aInt)
        {
            unchecked {
                return HashCodeHelper.rotateFirstTerm(aSeed)
                        + aInt;
            }
        }

        /// <summary>
        /// Contributes a long to the developing HashCode seed.
        /// </summary>
        /// <param name="aSeed">The developing HashCode value or seed.</param>
        /// <param name="aLong">The value to contribute.</param>
        /// <returns>The new hash code.</returns>
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static int Hash(this int aSeed, long aLong)
        {
            unchecked {
                return HashCodeHelper.rotateFirstTerm(aSeed)
                        + (int)(aLong ^ (aLong >> 32));
            }
        }

        /// <summary>
        /// Contributes a float to the developing HashCode seed.
        /// </summary>
        /// <param name="aSeed">The developing HashCode value or seed.</param>
        /// <param name="aFloat">The value to contribute.</param>
        /// <returns>The new hash code.</returns>
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static int Hash(this int aSeed, float aFloat)
        {
            unchecked {
                return HashCodeHelper.rotateFirstTerm(aSeed)
                        + Convert.ToInt32(aFloat); // note: Convert.ToInt32 rounds, and throws for values outside the Int32 range
            }
        }

        /// <summary>
        /// Contributes a double to the developing HashCode seed.
        /// </summary>
        /// <param name="aSeed">The developing HashCode value or seed.</param>
        /// <param name="aDouble">The value to contribute.</param>
        /// <returns>The new hash code.</returns>
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static int Hash(this int aSeed, double aDouble)
            => aSeed.Hash(Convert.ToInt64(aDouble));

        /// <summary>
        /// Contributes a string to the developing HashCode seed.
        /// </summary>
        /// <param name="aSeed">The developing HashCode value or seed.</param>
        /// <param name="aString">The value to contribute.</param>
        /// <param name="stringComparison">Optional comparison that creates the hash.</param>
        /// <returns>The new hash code.</returns>
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static int Hash(
                this int aSeed,
                string aString,
                StringComparison stringComparison = StringComparison.Ordinal)
        {
            if (aString == null)
                return aSeed.Hash(0);
            switch (stringComparison) {
                case StringComparison.CurrentCulture :
                    return StringComparer.CurrentCulture.GetHashCode(aString);
                case StringComparison.CurrentCultureIgnoreCase :
                    return StringComparer.CurrentCultureIgnoreCase.GetHashCode(aString);
                case StringComparison.InvariantCulture :
                    return StringComparer.InvariantCulture.GetHashCode(aString);
                case StringComparison.InvariantCultureIgnoreCase :
                    return StringComparer.InvariantCultureIgnoreCase.GetHashCode(aString);
                case StringComparison.OrdinalIgnoreCase :
                    return StringComparer.OrdinalIgnoreCase.GetHashCode(aString);
                default :
                    return StringComparer.Ordinal.GetHashCode(aString);
            }
        }

        /// <summary>
        /// Contributes a possibly-null array to the developing HashCode seed.
        /// Each element may be a primitive, a reference, or a possibly-null array.
        /// </summary>
        /// <param name="aSeed">The developing HashCode value or seed.</param>
        /// <param name="aArray">CAN be null.</param>
        /// <returns>The new hash code.</returns>
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static int Hash(this int aSeed, IEnumerable aArray)
        {
            if (aArray == null)
                return aSeed.Hash(0);
            int countPlusOne = 1; // So it differs from null
            foreach (object item in aArray) {
                ++countPlusOne;
                if (item is IEnumerable arrayItem) {
                    if (!object.ReferenceEquals(aArray, arrayItem))
                        aSeed = aSeed.Hash(arrayItem); // recursive call!
                } else
                    aSeed = aSeed.Hash(item);
            }
            return aSeed.Hash(countPlusOne);
        }

        /// <summary>
        /// Contributes a possibly-null array to the developing HashCode seed.
        /// You must provide the hash function for each element.
        /// </summary>
        /// <param name="aSeed">The developing HashCode value or seed.</param>
        /// <param name="aArray">CAN be null.</param>
        /// <param name="hashElement">Required: yields the hash for each element
        /// in <paramref name="aArray"/>.</param>
        /// <returns>The new hash code.</returns>
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static int Hash<T>(this int aSeed, IEnumerable<T> aArray, Func<T, int> hashElement)
        {
            if (aArray == null)
                return aSeed.Hash(0);
            int countPlusOne = 1; // So it differs from null
            foreach (T item in aArray) {
                ++countPlusOne;
                aSeed = aSeed.Hash(hashElement(item));
            }
            return aSeed.Hash(countPlusOne);
        }

        /// <summary>
        /// Contributes a possibly-null object to the developing HashCode seed.
        /// </summary>
        /// <param name="aSeed">The developing HashCode value or seed.</param>
        /// <param name="aObject">CAN be null.</param>
        /// <returns>The new hash code.</returns>
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static int Hash(this int aSeed, object aObject)
        {
            switch (aObject) {
                case null :
                    return aSeed.Hash(0);
                case bool b :
                    return aSeed.Hash(b);
                case char c :
                    return aSeed.Hash(c);
                case int i :
                    return aSeed.Hash(i);
                case long l :
                    return aSeed.Hash(l);
                case float f :
                    return aSeed.Hash(f);
                case double d :
                    return aSeed.Hash(d);
                case string s :
                    return aSeed.Hash(s);
                case IEnumerable iEnumerable :
                    return aSeed.Hash(iEnumerable);
            }
            return aSeed.Hash(aObject.GetHashCode());
        }


        /// <summary>
        /// This utility method uses reflection to iterate all specified properties that are readable
        /// on the given object, excluding any property names given in the params arguments, and
        /// generates a hashcode.
        /// </summary>
        /// <param name="aSeed">The developing hash code, or the seed: if you have no seed, use
        /// the <see cref="Seed"/>.</param>
        /// <param name="aObject">CAN be null.</param>
        /// <param name="propertySelector"><see cref="BindingFlags"/> to select the properties to hash.</param>
        /// <param name="ignorePropertyNames">Optional.</param>
        /// <returns>A hash from the properties contributed to <c>aSeed</c>.</returns>
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static int HashAllProperties(
                this int aSeed,
                object aObject,
                BindingFlags propertySelector
                        = BindingFlags.Instance
                        | BindingFlags.Public
                        | BindingFlags.GetProperty,
                params string[] ignorePropertyNames)
        {
            if (aObject == null)
                return aSeed.Hash(0);
            if ((ignorePropertyNames != null)
                    && (ignorePropertyNames.Length != 0)) {
                foreach (PropertyInfo propertyInfo in aObject.GetType()
                        .GetProperties(propertySelector)) {
                    if (!propertyInfo.CanRead
                            || (Array.IndexOf(ignorePropertyNames, propertyInfo.Name) >= 0))
                        continue;
                    aSeed = aSeed.Hash(propertyInfo.GetValue(aObject));
                }
            } else {
                foreach (PropertyInfo propertyInfo in aObject.GetType()
                        .GetProperties(propertySelector)) {
                    if (propertyInfo.CanRead)
                        aSeed = aSeed.Hash(propertyInfo.GetValue(aObject));
                }
            }
            return aSeed;
        }


        /// <summary>
        /// NOTICE: this method is provided to contribute a <see cref="KeyValuePair{TKey,TValue}"/> to
        /// the developing HashCode seed; by hashing the key and the value independently. HOWEVER,
        /// this method has a different name since it will not be automatically invoked by
        /// <see cref="Hash(int,object)"/>, <see cref="Hash(int,IEnumerable)"/>,
        /// or <see cref="HashAllProperties"/> --- you MUST NOT mix this method with those unless
        /// you are sure that no KeyValuePair instances will be passed to those methods; or otherwise
        /// the generated hash code will not be consistent. This method itself ALSO will not invoke
        /// this method on the Key or Value here if that itself is a KeyValuePair.
        /// </summary>
        /// <param name="aSeed">The developing HashCode value or seed.</param>
        /// <param name="keyValuePair">The value to contribute.</param>
        /// <returns>The new hash code.</returns>
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static int HashKeyAndValue<TKey, TValue>(this int aSeed, KeyValuePair<TKey, TValue> keyValuePair)
            => aSeed.Hash(keyValuePair.Key)
                    .Hash(keyValuePair.Value);

        /// <summary>
        /// NOTICE: this method is provided to contribute a collection of <see cref="KeyValuePair{TKey,TValue}"/>
        /// to the developing HashCode seed; by hashing the key and the value independently. HOWEVER,
        /// this method has a different name since it will not be automatically invoked by
        /// <see cref="Hash(int,object)"/>, <see cref="Hash(int,IEnumerable)"/>,
        /// or <see cref="HashAllProperties"/> --- you MUST NOT mix this method with those unless
        /// you are sure that no KeyValuePair instances will be passed to those methods; or otherwise
        /// the generated hash code will not be consistent. This method itself ALSO will not invoke
        /// this method on a Key or Value here if that itself is a KeyValuePair or an Enumerable of
        /// KeyValuePair.
        /// </summary>
        /// <param name="aSeed">The developing HashCode value or seed.</param>
        /// <param name="keyValuePairs">The values to contribute.</param>
        /// <returns>The new hash code.</returns>
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static int HashKeysAndValues<TKey, TValue>(
                this int aSeed,
                IEnumerable<KeyValuePair<TKey, TValue>> keyValuePairs)
        {
            if (keyValuePairs == null)
                return aSeed.Hash(null);
            foreach (KeyValuePair<TKey, TValue> keyValuePair in keyValuePairs) {
                aSeed = aSeed.HashKeyAndValue(keyValuePair);
            }
            return aSeed;
        }
    }
}
Steven Coco
  • 542
  • 6
  • 16
0

I ran into an issue with floats and decimals using the implementation selected as the answer above.

This test fails (decimals; the hash is the same even though I switched 2 values to be negative):

        var obj1 = new { A = 100m, B = 100m, C = 100m, D = 100m};
        var obj2 = new { A = 100m, B = 100m, C = -100m, D = -100m};
        var hash1 = ComputeHash(obj1.A, obj1.B, obj1.C, obj1.D);
        var hash2 = ComputeHash(obj2.A, obj2.B, obj2.C, obj2.D);
        Assert.IsFalse(hash1 == hash2, string.Format("Hashcode values should be different   hash1:{0}  hash2:{1}",hash1,hash2));

But this test passes (with ints):

        var obj1 = new { A = 100m, B = 100m, C = 100, D = 100};
        var obj2 = new { A = 100m, B = 100m, C = -100, D = -100};
        var hash1 = ComputeHash(obj1.A, obj1.B, obj1.C, obj1.D);
        var hash2 = ComputeHash(obj2.A, obj2.B, obj2.C, obj2.D);
        Assert.IsFalse(hash1 == hash2, string.Format("Hashcode values should be different   hash1:{0}  hash2:{1}",hash1,hash2));

I changed my implementation to not use GetHashCode for the primitive types, and it seems to work better:

    private static int InternalComputeHash(params object[] obj)
    {
        unchecked
        {
            var result = (int)SEED_VALUE_PRIME;
            for (uint i = 0; i < obj.Length; i++)
            {
                var currval = result;
                var nextval = DetermineNextValue(obj[i]);
                result = (result * MULTIPLIER_VALUE_PRIME) + nextval;

            }
            return result;
        }
    }



    private static int DetermineNextValue(object value)
    {
        unchecked
        {

                int hashCode;
                if (value is short
                    || value is int
                    || value is byte
                    || value is sbyte
                    || value is uint
                    || value is ushort
                    || value is ulong
                    || value is long
                    || value is float
                    || value is double
                    || value is decimal)
                {
                    return Convert.ToInt32(value);
                }
                else
                {
                    return value != null ? value.GetHashCode() : 0;
                }
        }
    }
HokieMike
  • 655
  • 1
  • 7
  • 17
  • 2
    In case you intended otherwise `unchecked` does NOT affect `Convert.ToInt32`: `uint`, `long`, `float`, `double` and `decimal` can all overflow here. – Mark Hurd Sep 30 '14 at 04:28
0

In case you want to polyfill HashCode from netstandard2.1

public static class HashCode
{
    public static int Combine(params object[] instances)
    {
        int hash = 17;

        foreach (var i in instances)
        {
            hash = unchecked((hash * 31) + (i?.GetHashCode() ?? 0));
        }

        return hash;
    }
}

Note: if used with structs, it will allocate memory due to boxing.
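Usage is then the same as with the built-in type. A minimal sketch, assuming a hypothetical Person class with Name and Age properties:

public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }

    // Age is boxed by the params object[] overload (see the note above).
    public override int GetHashCode() => HashCode.Combine(Name, Age);
}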

Ivan Sanz Carasa
  • 1,141
  • 8
  • 16
0

You can try to adopt the approach from the C++ Boost libraries. Something like this:

class HashUtil
{
  public static int HashCombine(int seed, int other)
  {
    unchecked
    {
      // 0x9e3779b9 is the golden-ratio constant used by boost::hash_combine;
      // the (int) cast keeps the whole expression an int (the hex literal is a uint,
      // which would otherwise promote the sum to long and break the return type).
      return other + (int)0x9e3779b9 + (seed << 6) + (seed >> 2);
    }
  }
}

and then:

class MyClass
{
  private string _field1;
  private int _field2;
  private AnotherClass _field3;
  private YetAnotherClass _field4;

  public override int GetHashCode()
  {
    int result = HashUtil.HashCombine(_field1.GetHashCode(), _field2);
    result = HashUtil.HashCombine(result, _field3.GetHashCode());
    return HashUtil.HashCombine(result, _field4.GetHashCode());
  }
}
ivan.ukr
  • 2,853
  • 1
  • 23
  • 41
-1

I want to add my newest findings to this thread, which I have come back to so often.

My current Visual Studio / project setup provides the functionality to automatically refactor tuples into structs. This will generate a GetHashCode function like so:

        public override int GetHashCode()
        {
            int hashCode = -2088324004;
            hashCode = hashCode * -1521134295 + AuftragGesperrt.GetHashCode();
            hashCode = hashCode * -1521134295 + Auftrag_gesperrt_von.GetHashCode();
            hashCode = hashCode * -1521134295 + Auftrag_gesperrt_am.GetHashCode();
            return hashCode;
        }

EDIT: to clarify, AuftragGesperrt, Auftrag_gesperrt_von and Auftrag_gesperrt_am are properties. If the Microsoft devs use this function, it's probably not too bad of a solution.

t0b4cc0
  • 319
  • 4
  • 19