1

I want to use the HashSet.Contains method because it's super fast.

var hashset = new HashSet<Customer>(customers, new CustomerComparer());
var found = hashset.Contains(new Customer{ Id = "1234", Name = "mit" }); // "mit" is only part of "smith"; the comparer uses Contains instead of an exact match.

I am searching on multiple properties of the Customer object.

I have to implement the IEqualityComparer interface like this:

public class CustomerComparer : IEqualityComparer<Customer>
{
    public bool Equals(Customer x, Customer y)
    {
        return x.Id == y.Id && x.Name.Contains(y.Name);
    }      

    public int GetHashCode(Customer obj)
    {
        return obj.Id.GetHashCode() ^ obj.Name.GetHashCode();
    }
}

Why is the comparer's Equals method never hit when I use something other than an exact equality check, such as .Contains, inside it?

Elisabeth

2 Answers

5

The way you have implemented the equality comparer cannot work properly. The reason is how the hash set and the equality comparer work together internally. When a Dictionary or a HashSet compares items, it first calls GetHashCode on both items. Only if these hash codes match will it confirm the match with a subsequent call to Equals, which rules out false matches caused by a hash code collision. In your example (x.Name = "smith" and y.Name = "mit"), GetHashCode will almost certainly return different hash codes for the two items, so Equals is never called.
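The gating described above can be modelled in a few lines. This is only a simplified sketch of the per-item check, not the actual HashSet implementation; `HashGateDemo` and `WouldMatch` are hypothetical names introduced for illustration, and the comparer is the one from the question.

```csharp
using System;
using System.Collections.Generic;

public static class HashGateDemo
{
    public class Customer
    {
        public string Id { get; set; }
        public string Name { get; set; }
    }

    // The comparer from the question: the hash code mixes Id and Name.
    public class CustomerComparer : IEqualityComparer<Customer>
    {
        public bool Equals(Customer x, Customer y)
        {
            return x.Id == y.Id && x.Name.Contains(y.Name);
        }

        public int GetHashCode(Customer obj)
        {
            return obj.Id.GetHashCode() ^ obj.Name.GetHashCode();
        }
    }

    // Hypothetical helper: a simplified model of the check, not the real HashSet code.
    public static bool WouldMatch<T>(IEqualityComparer<T> comparer, T stored, T probe)
    {
        // Equals is only consulted when the hash codes agree (&& short-circuits).
        return comparer.GetHashCode(stored) == comparer.GetHashCode(probe)
            && comparer.Equals(stored, probe);
    }

    public static void Main()
    {
        var comparer = new CustomerComparer();
        var stored = new Customer { Id = "1234", Name = "smith" };
        var probe = new Customer { Id = "1234", Name = "mit" };

        // Equals on its own accepts the pair, because "smith" contains "mit".
        Console.WriteLine(comparer.Equals(stored, probe));

        // The hash comparison rejects the pair first, barring an unlikely collision.
        Console.WriteLine(WouldMatch(comparer, stored, probe));
    }
}
```

The point of the sketch is that a permissive Equals is of no use if GetHashCode is stricter than Equals: items that Equals would accept must also produce the same hash code.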

The solution in this case is to use only the Id for hash code creation. This degrades performance a bit, because customers that share an Id now collide and Equals has to be called more often to resolve those collisions, but that's the price you have to pay:

public int GetHashCode(Customer obj)
{
    return obj.Id.GetHashCode();
}

You should also consider that you have no guarantee whether your existing item will be passed as x or as y. So you have to use Contains in both directions:

public bool Equals(Customer x, Customer y)
{
    return x.Id == y.Id && (x.Name.Contains(y.Name) || y.Name.Contains(x.Name));
}      
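Put together, the two changes could look like the following minimal sketch. It assumes the `Customer` class from the question with non-null `Id` and `Name`; the wrapper class `PartialNameDemo` is a hypothetical name used only to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;

public static class PartialNameDemo
{
    public class Customer
    {
        public string Id { get; set; }
        public string Name { get; set; }
    }

    public class CustomerComparer : IEqualityComparer<Customer>
    {
        public bool Equals(Customer x, Customer y)
        {
            // Partial name match in both directions, so argument order does not matter.
            return x.Id == y.Id && (x.Name.Contains(y.Name) || y.Name.Contains(x.Name));
        }

        public int GetHashCode(Customer obj)
        {
            // Only the Id feeds the hash, so partial names land in the same bucket.
            return obj.Id.GetHashCode();
        }
    }

    public static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer { Id = "1234", Name = "smith" }
        };
        var hashset = new HashSet<Customer>(customers, new CustomerComparer());

        Console.WriteLine(hashset.Contains(new Customer { Id = "1234", Name = "mit" }));   // True
        Console.WriteLine(hashset.Contains(new Customer { Id = "1234", Name = "jones" })); // False
        Console.WriteLine(hashset.Contains(new Customer { Id = "9999", Name = "mit" }));   // False
    }
}
```

One consequence worth knowing: with this comparer the set itself treats "smith" and "mit" under the same Id as equal, so adding both would keep only the first one.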
Sefe
  • Sorry Sefe, there is a misunderstanding. I am calling hashset.Contains; that code is just not posted. I'll update my question! – Elisabeth Dec 09 '16 at 10:21
  • I know my sample question might look stupid, but with your solution, how would you call hashset.Contains(new Customer { Id = item.Id, Name = "mit" }); if the Name property must have the value "mit" or "test"? Would you call hashset.Contains() twice, once with "mit" and once with "test"? And if one of the calls returns true, I can then do further business logic... – Elisabeth Dec 09 '16 at 11:04
  • @Elisabeth: That would be the easiest way to go. If you want to handle this with one call, the solution I can think of is to create a new class that lets you specify an ID and a list of names (e.g. named `CustomerKey`). You would then implement the `IEquatable` interface on your `Customer` class. There you can implement your comparison that allows a list of names. – Sefe Dec 09 '16 at 11:09
  • I did that and updated my question with your code suggestion, but the Equals method from IEquatable is never hit, and therefore far too many customers are returned ;-) But I already know the game and overrode GetHashCode with the Names array/HashSet as Marc Gravell suggests here: http://stackoverflow.com/questions/638761/gethashcode-override-of-object-containing-generic-array BUT do you really advise me to use the IEqualityComparer on the HashSet and the IEquatable on the Customer to fulfill my requirements? – Elisabeth Dec 09 '16 at 11:48
  • @Elisabeth: You won't be able to call `Contains` for `IEquatable<CustomerKey>` on a `HashSet<Customer>`. You would need a `HashSet<IEquatable<CustomerKey>>`. I would suggest you just return the hash code of `Id` (as done in my answer) if you want to match partial names. Whether I can advise it is a different question. If you don't need the performance, multiple searches in the hash set with `Customer` objects are simpler and will do the job. If you need to maximise performance, go for `IEquatable`. – Sefe Dec 09 '16 at 12:55
0

Why is the comparer's Equals method never hit when I use something other than an exact equality check, such as .Contains, inside it?

The Equals method will be hit only if there is at least one item in your "customers" collection that has the same hash code as the Customer object that you pass to the Contains method of the HashSet. If you run the following sample program you will see that the Equals method does get hit:

using System;
using System.Collections.Generic;

public static class Program
{
    public class Customer
    {
        public string Id { get; set; }
        public string Name { get; set; }
    }

    public class CustomerComparer : IEqualityComparer<Customer>
    {
        public bool Equals(Customer x, Customer y)
        {
            Console.WriteLine("hit!");
            return x.Id == y.Id && x.Name.Contains(y.Name);
        }

        public int GetHashCode(Customer obj)
        {
            return obj.Id.GetHashCode() ^ obj.Name.GetHashCode();
        }
    }

    public static void Main()
    {
        List<Customer> customers = new List<Customer>()
        {
            new Customer() { Id = "1234", Name = "smith" },
            new Customer() { Id = "1234", Name = "mit" }
        };
        var hashset = new HashSet<Customer>(customers, new CustomerComparer());
        var found = hashset.Contains(new Customer { Id = "1234", Name = "mit" }); // "mit" is only part of "smith"; the comparer uses Contains instead of an exact match.
        Console.WriteLine(found); // = true
    }
}

But if you remove the second item from the "customers" list, the Equals method won't be hit, since the "smith" customer in the list has a different hash code than the "mit" customer that you pass to the Contains method:

List<Customer> customers = new List<Customer>()
{
    new Customer() { Id = "1234", Name = "smith" }
};
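To check this without relying on console output from inside Equals, the comparer can count its own calls. The following is a sketch only: `CallCountDemo` and `CountingComparer` are hypothetical names, and the comparison logic is copied from the question.

```csharp
using System;
using System.Collections.Generic;

public static class CallCountDemo
{
    public class Customer
    {
        public string Id { get; set; }
        public string Name { get; set; }
    }

    // Same logic as the question's comparer, plus call counters.
    public class CountingComparer : IEqualityComparer<Customer>
    {
        public int EqualsCalls { get; private set; }
        public int HashCalls { get; private set; }

        public bool Equals(Customer x, Customer y)
        {
            EqualsCalls++;
            return x.Id == y.Id && x.Name.Contains(y.Name);
        }

        public int GetHashCode(Customer obj)
        {
            HashCalls++;
            return obj.Id.GetHashCode() ^ obj.Name.GetHashCode();
        }
    }

    public static void Main()
    {
        var comparer = new CountingComparer();
        var customers = new List<Customer>
        {
            new Customer { Id = "1234", Name = "smith" }
        };
        var hashset = new HashSet<Customer>(customers, comparer);

        bool found = hashset.Contains(new Customer { Id = "1234", Name = "mit" });

        Console.WriteLine("found: " + found);
        Console.WriteLine("GetHashCode calls: " + comparer.HashCalls);
        // Expected to stay at 0 unless the two hash codes happen to collide.
        Console.WriteLine("Equals calls: " + comparer.EqualsCalls);
    }
}
```

With only the "smith" customer in the list, the lookup for "mit" should report that GetHashCode was called while Equals was not, which is the behaviour the question asks about.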
mm8