53

Here's what I'm trying to do. I'm querying an XML file using LINQ to XML, which gives me an IEnumerable<T> object, where T is my "Village" class, filled with the results of this query. Some results are duplicated, so I would like to perform a Distinct() on the IEnumerable object, like so:

public IEnumerable<Village> GetAllAlliances()
{
    try
    {
        IEnumerable<Village> alliances =
             from alliance in xmlDoc.Elements("Village")
             where alliance.Element("AllianceName").Value != String.Empty
             orderby alliance.Element("AllianceName").Value
             select new Village
             {
                 AllianceName = alliance.Element("AllianceName").Value
             };

        // TODO: make it work...
        return alliances.Distinct(new AllianceComparer());
    }
    catch (Exception ex)
    {
        throw new Exception("GetAllAlliances", ex);
    }
}

As the default comparer would not work for the Village object, I implemented a custom one, as seen here in the AllianceComparer class:

public class AllianceComparer : IEqualityComparer<Village>
{
    #region IEqualityComparer<Village> Members
    bool IEqualityComparer<Village>.Equals(Village x, Village y)
    {
        // Check whether the compared objects reference the same data.
        if (Object.ReferenceEquals(x, y)) 
            return true;

        // Check whether any of the compared objects is null.
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;

        return x.AllianceName == y.AllianceName;
    }

    int IEqualityComparer<Village>.GetHashCode(Village obj)
    {
        return obj.GetHashCode();
    }
    #endregion
}

The Distinct() method doesn't work, as I have exactly the same number of results with or without it. Another thing, and I don't know if it's usually possible, but I cannot step into AllianceComparer.Equals() to see what could be the problem.
I've found examples of this on the Internet, but I can't seem to make my implementation work.

Hopefully, someone here might see what could be wrong here! Thanks in advance!

Fueled
  • 8,776
  • 9
  • 29
  • 31
  • Your catch/throw construct makes it where the calling function can no longer choose to catch(ArgumentException) or catch(IOException) (examples). For this case, you're probably better off removing the try/catch all together - besides, the name of the method will be part of the exception StackTrace property. – Sam Harwell Dec 16 '09 at 20:22

2 Answers2

73

The problem is with your GetHashCode. You should alter it to return the hash code of AllianceName instead.

int IEqualityComparer<Village>.GetHashCode(Village obj)
{
    return obj.AllianceName.GetHashCode();
}

The thing is, if Equals returns true, the objects should have the same hash code which is not the case for different Village objects with same AllianceName. Since Distinct works by building a hash table internally, you'll end up with equal objects that won't be matched at all due to different hash codes.

Similarly, to compare two files, if the hash of two files are not the same, you don't need to check the files themselves at all. They will be different. Otherwise, you'll continue to check to see if they are really the same or not. That's exactly what the hash table that Distinct uses behaves.

Mehrdad Afshari
  • 414,610
  • 91
  • 852
  • 789
  • Why cant we just override the typical equality methods for Distinct?? – Boog Mar 24 '13 at 01:01
  • @Boog Of course, you could do this in the object itself, or a separate equality comparer class, as the OP wants. The use case for a custom equality comparer is when you have different ways to consider objects equal or not (e.g. case sensitive and case insensitive comparison for strings) or when you cannot change the class itself for any reason. In any case, you should override both `Equals` and `GetHashCode` and write a proper `GetHashCode` with respect to the `Equals` method. – Mehrdad Afshari Mar 24 '13 at 05:54
7

Or change the line

return alliances.Distinct(new AllianceComparer());

to

return alliances.Select(v => v.AllianceName).Distinct();
Jacob Seleznev
  • 8,013
  • 3
  • 24
  • 34
  • 1
    This is an interesting alternative approach which would cause the asker of the question to rethink the code which is using the IEqualityComparer. This solves a problem for me, which is why I like it. It doesn't answer the question, though. In my case, I wanted to extract unique keys from a collection of key-value data, and store the unique keys separately in a List. As @superrcat mentioned, the generic type of the returned IEnumerable is changed by doing this. – Greg Apr 29 '15 at 14:10
  • `return alliances.Select(v => v.AllianceName).Distinct();` That would return an `IEnumerable` instead of `IEnumerable`. – superrcat Dec 16 '09 at 20:16