0

I have the following custom class deriving from Tuple:

public class CustomTuple : Tuple<List<string>, DateTime?>
{
  public CustomTuple(IEnumerable<string> strings, DateTime? time)
      : base(strings.OrderBy(x => x).ToList(), time)
  {
  }
}

and a HashSet<CustomTuple>. The problem is that when I add items to the set, they are not recognised as duplicates. i.e. this outputs 2, but it should output 1:

void Main()
{
    HashSet<CustomTuple> set = new HashSet<CustomTuple>();

    var a = new CustomTuple(new List<string>(), new DateTime?());
    var b = new CustomTuple(new List<string>(), new DateTime?());

    set.Add(a);
    set.Add(b);

    Console.Write(set.Count); // Outputs 2
}

How can I override the Equals and GetHashCode methods to cause this code to output a set count of 1?

rleelr
  • 1,854
  • 17
  • 26
  • 2
    You override them so that when they are supposed to be equals, the method returns true. – Euphoric Aug 03 '15 at 15:11
  • I know that - but what I have attempted doesn't work. – rleelr Aug 03 '15 at 15:11
  • What do you consider equal? All the strings in the list match and the date matches? So do that in your `equals` method. (I'd check the date first to be more efficient). – Matt Burland Aug 03 '15 at 15:12
  • Yes that's what I consider equal. – rleelr Aug 03 '15 at 15:12
  • 1
    So where's your code? Note that the result of `GetHashCode` has to match before `Equals` will even be called. – Matt Burland Aug 03 '15 at 15:14
  • So what have you tried? First, for tuple to be equal, all components must be equal. So this breaks down to equality of List and DateTime?. And from simple search, it should be obvious List only has reference equality and not value equality. I'm sure there is question like that around here. – Euphoric Aug 03 '15 at 15:14
  • Thanks, it's probably the reference not value equality of list that is stopping it. I'll try that out now. – rleelr Aug 03 '15 at 15:22

3 Answers3

1

You should override GetHashCode and Equals virtual methods defined in System.Object class.

Please remember that:

  • If two objects are logically "equal" then they MUST have the same hash code!

  • If two objects have the same hashcode, then it is not mandatory to have your objects equal.

Also, i've noticed an architectural problem in your code: List is a mutable type but overriding Equals and GetHashCode usually makes your class logically to behave like a value type. So having "Item1" a mutable type and behaving like a value type is very dangerous. I suggest replacing your List with a ReadOnlyCollection . Then you would have to make a method that checks whether two ReadOnlyCollections are Equal.

For the GetHashCode () method, just compose a string from all string items found in Item1 then append a string that represents the Hash code for datetime then finally call on the concatenated result the "GetHashCode ()" overrided on string method. So normally you would have:

override int GetHashCode () {

  return (GetHashCodeForList (Item1) + (Item2 ?? DateTime.MinValue).GetHashCode ()).GetHashCode ();
}

And the GetHashCodeForList method would be something like this:

private string GetHashCodeForList (IEnumerable <string> lst) {
      if (lst == null) return string.Empty;
      StringBuilder sb = new StringBuilder ();

      foreach (var item in lst) {
         sb.Append (item);
      }
      return sb.ToString ();
}

Final note: You could cache the GetHashCode result since it is relative expensive to get and your entire class would became immutable (if you replace List with a readonly collection).

George Lica
  • 1,798
  • 1
  • 12
  • 23
  • +1 for the point of mutability. They copy the list in the constructor, but since it's a `List` you can always cast it back and add more items. For example `((List)a.Item1).Add("foo");` which is something you shouldn't be able to do. – Matt Burland Aug 03 '15 at 15:40
1

A HashSet<T> will first call GetHashCode, so you need to work on that first. For an implementation, see this answer: https://stackoverflow.com/a/263416/1250301

So a simple, naive, implementation might look like this:

public override int GetHashCode()
{
    unchecked
    {
        int hash = 17;
        hash = hash * 23 + this.Item2.GetHashCode();
        foreach (var s in this.Item1)
        {
            hash = hash * 23 + s.GetHashCode();
        }
        return hash;
    }
}

However, if your lists are long, then this might not be efficient enough. So you'll have to decide where to compromise depending on how tolerant you are of collisions.

If the result of GetHashCode for two items are the same, then, and only then, will it call Equals. An implementation of Equals is going to need to compare the items in the list. Something like this:

public override bool Equals(object o1)
{
    var o = o1 as CustomTuple;
    if (o == null)
    {
        return false;
    }
    if (Item2 != o.Item2) 
    {
        return false;
    }
    if (Item1.Count() != o.Item1.Count())
    {
        return false;
    }
    for (int i=0; i < Item1.Count(); i++)
    {
        if (Item1[i] != o.Item1[i])
        {
            return false;
        }
    }
    return true;
}

Note that we check the date (Item2) first, because that's cheap. If the date isn't the same, we don't bother with anything else. Next we check the Count on both collections (Item1). If they don't match, there's no point iterating the collections. Then we loop through both collections and compare each item. Once we find one that doesn't match, we return false because there is no point continuing to look.

As pointed out in George's answer, you also have the problem that your list is mutable, which will cause problems with your HashSet, for example:

var a = new CustomTuple(new List<string>() {"foo"} , new DateTime?());
var b = new CustomTuple(new List<string>(), new DateTime?());

set.Add(a);
set.Add(b);

// Hashset now has two entries

((List<string>)a.Item1).Add("foo");

// Hashset still has two entries, but they are now identical.

To solve that, you need to force your IEnumerable<string> to be readonly. You could do something like:

public class CustomTuple : Tuple<IReadOnlyList<string>, DateTime?>
{
    public CustomTuple(IEnumerable<string> strings, DateTime? time)
      : base(strings.OrderBy(x => x).ToList().AsReadOnly(), time)
    {
    }

    public override bool Equals(object o1)
    {
        // as above
    }

    public override int GetHashCode()
    {
        // as above
    }

}
Community
  • 1
  • 1
Matt Burland
  • 44,552
  • 18
  • 99
  • 171
0

This is is what I went for, which outputs 1 as desired:

private class CustomTuple : Tuple<List<string>, DateTime?>
{
  public CustomTuple(IEnumerable<string> strings, DateTime? time)
        : base(strings.OrderBy(x => x).ToList(), time)
    {
    }

  public override bool Equals(object obj)
  {
      if (obj == null || GetType() != obj.GetType())
      {
          return false;
      }

      var that = (CustomTuple) obj;

      if (Item1 == null && that.Item1 != null || Item1 != null && that.Item1 == null) return false;
      if (Item2 == null && that.Item2 != null || Item2 != null && that.Item2 == null) return false;

      if (!Item2.Equals(that.Item2)) return false;
      if (that.Item1.Count != Item1.Count) return false;
      for (int i = 0; i < Item1.Count; i++)
      {
          if (!Item1[i].Equals(that.Item1[i])) return false;
      }

      return true;
  }

  public override int GetHashCode()
  {
      int hash = 17;
      hash = hash*23 + Item2.GetHashCode();
      return Item1.Aggregate(hash, (current, s) => current*23 + s.GetHashCode());
  }
}
rleelr
  • 1,854
  • 17
  • 26