19

I want to use a date range (from one date to another date) as a key for a dictionary, so I wrote my own struct:

   struct DateRange
   {
      public DateTime Start;
      public DateTime End;

      public DateRange(DateTime start, DateTime end)
      {
         Start = start.Date;
         End = end.Date;
      }

      public override int GetHashCode()
      {
         // ???
      }
   }

What's the best way to implement GetHashCode so no two objects of a differing range will generate the same hash? I want hash collisions to be as unlikely as possible, though I understand Dictionary<> will still check the equality operator which I will also implement, but didn't want to pollute the example code too much. Thanks!

Jesse C. Slicer
  • 19,901
  • 3
  • 68
  • 87
Mike Christensen
  • 88,082
  • 50
  • 208
  • 326
  • 4
    Aside from GetHashCode: mutable structs and public fields are both generally a bad idea. – Jon Skeet Oct 18 '11 at 21:11
  • You think I should switch to a class with `public DateTime Start { get; set; }` instead? – Mike Christensen Oct 18 '11 at 21:14
  • 3
    No, you should not allow the fields to mutate **at all**. Make them `private` and `readonly`, set them via the constructor (as you do already), and provide read-only "get" properties to those fields. – Jesse C. Slicer Oct 18 '11 at 21:16
  • Ah gotcha - I'll do that instead. – Mike Christensen Oct 18 '11 at 21:18
  • Why, though? --- struct Vector3Int { public int x, y, z; GetHashCode() { return combination of x, y, and z } } --- How is this any worse than: --- struct Vector1Int { public int x; GetHashCode() { return x; } } And following this further, how is this any worse than just using an int (with respect to hash codes)? The outcome is the same: different value(s), different hash. – Hatchling Apr 26 '16 at 00:51
  • Following up on my previous comment, I can't imagine why any of the things you noted would be unacceptable on a struct that contains nothing but value types for all depth levels. (IE all members are structs, whose members are structs, whose members are structs... etc., recursively, until you reach basic value types (like int, float, etc.)) – Hatchling Apr 26 '16 at 01:15
  • @JonSkeet why are public fields in a struct bad? – johnny 5 Aug 11 '17 at 16:30
  • @MikeChristensen Why would you need to override get hashcode, the default struct get hashcode should work fine – johnny 5 Aug 11 '17 at 16:31
  • 1
    @johnny5: Public fields in general expose the *implementation* of a type rather than an API surface to use. Very occasionally it's appropriate - e.g. ValueTuple - but not usually. – Jon Skeet Aug 11 '17 at 16:33
  • Possible duplicate of [What is the best algorithm for an overridden System.Object.GetHashCode?](https://stackoverflow.com/questions/263400/what-is-the-best-algorithm-for-an-overridden-system-object-gethashcode) – J... Sep 18 '18 at 15:31

8 Answers8

25

You can use the method from Effective Java as Jon Skeet shows here. For your specific type:

public override int GetHashCode()
{
    unchecked // Overflow is fine, just wrap
    {
        int hash = 17;
        hash = hash * 23 + Start.GetHashCode();
        hash = hash * 23 + End.GetHashCode();
        return hash;
    }
}
Community
  • 1
  • 1
Mark Byers
  • 811,555
  • 193
  • 1,581
  • 1,452
  • 5
    Could you please explain, why do you use 17 and 23? – Max Kilovatiy Jun 15 '16 at 06:08
  • 1
    @MaxKvt because they're prime numbers – Martin Jul 26 '16 at 17:26
  • 2
    @Martin And where is the benefit from multiplication with prime numbers? – The incredible Jan Apr 04 '17 at 08:39
  • @TheincredibleJan My guess is that it improves the distribution of values after the modulo is applied (in this case modulo is implicit when int overflows). I'm not sure about this but I think `(x * prime) mod y` will not generate a collision for values of x between 0 and y. – Herman Apr 09 '18 at 09:46
  • If every hash has 17 and a factor of 23 then surely they can be factored out between all hashes wouldn't it be the same as just adding the hash codes of start and end together and be done with it ? – WDUK Jul 05 '21 at 21:43
13

In C# 7 you can do this:

public override int GetHashCode() => (Start, End).GetHashCode();

The ValueTuple is available in .NET Framework 4.7 and .NET Core, or via NuGet.

Not sure how well it performs, but I would be surprised if any custom code would beat it.

l33t
  • 18,692
  • 16
  • 103
  • 180
7

Not to reanimate the dead, but I came here looking for something, and for newer C# versions you can do

public override int GetHashCode()
{
    return HashCode.Combine(Start, End);
}

The source can currently be found here: https://github.com/dotnet/corert/blob/master/src/System.Private.CoreLib/shared/System/HashCode.cs

In my preliminary tests (using Jon Skeets micro benchmarking framework) it appears to be very similar if not the same as the accepted answer, in terms of performance.

VisualBean
  • 4,908
  • 2
  • 28
  • 57
5

I would trust Microsoft's implementation of GetHashCode() at the tuples and use something like this without any stupid magic:

public override int GetHashCode()
{
    Tuple.Create(x, y).GetHashCode();
}
The incredible Jan
  • 741
  • 10
  • 17
  • 14
    No!! You must absolutely NOT create heap objects in GetHashCode(). This code will eventually kill application performance. – l33t Sep 20 '19 at 08:52
  • The Tuple class will add to the heap. Instead (with C# 8 up), use just (x, y).GetHashCode(), which goes on the stack it's a value type. – dynamichael Feb 03 '22 at 21:09
3

Since DateTime.GetHashCode is internally based on Ticks, what about this:

    public override int GetHashCode()
    {
        return unchecked((int)(Start.Ticks ^ End.Ticks));
    }

Or, since you seem to be interested by the date parts (year, month, day), not the whole thing, this implementation uses the number of days between the two dates and should give almost no collision:

        public override int GetHashCode()
        {
            return unchecked((int)Start.Date.Year * 366 + Start.Date.DayOfYear + (End.Date - Start.Date).Days);
        }
Simon Mourier
  • 132,049
  • 21
  • 248
  • 298
2

Just as it might help someone in the future that uses visual studio pro (not sure if this also exist in community edition)

  • Select the desired properties (in your case all)
  • Press refacoring (CTRL + . or right click "Quick actions an refactorings")
  • Now you can select to implement Equals or GetHashcode (and probably it always takes the best known MS way to do it)
1

Something like this:) with a different prime number:)

public override int GetHashCode()
{
    unchecked  
    {
        int hash = 23;
        // Suitable nullity checks etc, of course :)
        hash = hash * 31 + Start.GetHashCode();
        hash = hash * 31 + End.GetHashCode();
        return hash;
    }
}

This is not the fastest implementation but it produces a good hash code. Joshua bloch indicates that as well and you also calculate the performance, ^ is usually faster. correct me if i m wrong.

See Jon Skeets impl for c# :

Community
  • 1
  • 1
DarthVader
  • 52,984
  • 76
  • 209
  • 300
  • 1
    -1: while this is good, it's a pretty close to direct lift from here: http://stackoverflow.com/questions/263400/what-is-the-best-algorithm-for-an-overridden-system-object-gethashcode/263416#263416 without attribution. – Jesse C. Slicer Oct 18 '11 at 21:14
  • very funny. attribution was to Joshua bloch which that links get this from.. thanks for the downvote. – DarthVader Oct 18 '11 at 21:15
  • 5
    Your four retribution downvotes are **also** totally unnecessary. – Jesse C. Slicer Oct 18 '11 at 21:18
  • i dont want to make it personal.but your comment was pointless, and i did give the credits to joshua bloch, who has pointed this hashcode impl. in his book. – DarthVader Oct 18 '11 at 21:21
  • 1
    Why are people downvoting this and upvoting Mark Byers' answer, when they're essentially the same answer? – Joe White Oct 18 '11 at 21:21
  • Because I m on the dark side. i guess that s why. – DarthVader Oct 18 '11 at 21:22
  • 1
    I downvoted because, while perhaps originally discussed on Joshua Bloch's book, the C# code presented is clearly Jon Skeet's implementation (including the comment), which was *not* credited. Mark Byers' answer *DID* credit the code source. – Jesse C. Slicer Oct 18 '11 at 21:27
  • 1
    Now there you go. Removing my downvote. Next time, don't make Obi-Wan more powerful than we could possibly imagine. – Jesse C. Slicer Oct 18 '11 at 21:32
0

Combining Jon Skeet's answer and comment on the question (so please, no voting on this, just consolidating):

struct DateRange
{
    private readonly DateTime start;

    private readonly DateTime end;

    public DateRange(DateTime start, DateTime end)
    {
        this.start = start.Date;
        this.end = end.Date;
    }

    public DateTime Start
    {
        get
        {
            return this.start;
        }
    }

    public DateTime End
    {
        get
        {
            return this.end;
        }
    }

    public static bool operator ==(DateRange dateRange1, DateRange dateRange2)
    {
        return dateRange1.Equals(dateRange2);
    }

    public static bool operator !=(DateRange dateRange1, DateRange dateRange2)
    {
        return !dateRange1.Equals(dateRange2);
    }

    public override int GetHashCode()
    {
        // Overflow is fine, just wrap
        unchecked
        {
            var hash = 17;

            // Suitable nullity checks etc, of course :)
            hash = (23 * hash) + this.start.GetHashCode();
            hash = (23 * hash) + this.end.GetHashCode();
            return hash;
        }
    }

    public override bool Equals(object obj)
    {
        return (obj is DateRange)
            && this.start.Equals(((DateRange)obj).Start)
            && this.end.Equals(((DateRange)obj).End);
    }
}
Community
  • 1
  • 1
Jesse C. Slicer
  • 19,901
  • 3
  • 68
  • 87