2

I have a very simple test method that returns a List containing a number of duplicates. I thought I'd try a HashSet, as that should remove duplicates, but it did not. It appears I need to override Equals and GetHashCode, but I am really struggling to understand what I need to do. I would appreciate some pointers, please.

HashSet<object> test = XmlManager.PeriodHashSet(Server.MapPath("../Xml/XmlFile.xml"));
foreach (Object period in test)
{
    PeriodData pd = period as PeriodData;
    Response.Write(pd.PeriodName + "<br>");
}

I also tried it with the following

List<object> test = XmlManager.PeriodList(Server.MapPath("../Xml/XmlFile.xml"));
List<object> test2 = test.Distinct().ToList();
foreach (Object period in test2)
{
    PeriodData pd = period as PeriodData;
    Response.Write(pd.PeriodName + "<br>");
}

The PeriodData object is declared as follows:

public class PeriodData
{
    private int m_StartYear = -9999999;
    private int m_EndYear = -9999999;
    private string m_PeriodName = String.Empty;

    public int StartYear
    {
        get { return m_StartYear; }
        set { m_StartYear = value; }
    }
    public int EndYear
    {
        get { return m_EndYear; }
        set { m_EndYear = value; }
    }
    public string PeriodName
    {
        get { return m_PeriodName; }
        set { m_PeriodName = value; }
    }
}

It is the returned PeriodName that I want to remove duplicates of.

cuongle
Peter C
  • This Stack Overflow question gives you a couple of options for a distinct on a list of objects: http://stackoverflow.com/questions/1300088/distinct-with-lambda – cgotberg Apr 16 '13 at 17:30

3 Answers

4

For the HashSet<T> to work, you need to, at a minimum, override Object.Equals and Object.GetHashCode. This is what allows the hashing algorithm to know what makes two objects "distinct" or the same by values.
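To illustrate why the overrides matter, here is a minimal sketch of the default behaviour. `PlainPeriod` and `DefaultEqualityDemo` are hypothetical names used only for this example, and the class stands in for PeriodData without any overrides. Because the default Equals compares references, two separate instances with the same name both stay in the set.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for PeriodData with no Equals/GetHashCode overrides.
public class PlainPeriod
{
    public string PeriodName { get; set; }
}

public static class DefaultEqualityDemo
{
    // Adds two separate instances with the same PeriodName and returns the set size.
    public static int CountAfterAddingTwoEqualNames()
    {
        var set = new HashSet<PlainPeriod>();
        set.Add(new PlainPeriod { PeriodName = "Jurassic" });
        set.Add(new PlainPeriod { PeriodName = "Jurassic" });
        // Default equality is reference equality, so neither is seen as a duplicate.
        return set.Count;
    }
}
```

With the default equality the method returns 2, which matches what the question describes: the HashSet keeps every object even though the names repeat.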

In terms of simplifying and improving the code, there are two major changes I'd recommend to make this work:

First, you should use HashSet<PeriodData> (or List<PeriodData>), not HashSet<object>.

Second, your PeriodData class should implement IEquatable<PeriodData> in order to provide proper hashing and equality.
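A rough sketch of both suggestions together might look like the following. It assumes equality should be based on PeriodName only (as the question asks) and uses an ordinal, case-sensitive comparison. The trimmed-down class, the sample names and the `EquatableDemo` helper are illustrative assumptions, not the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down PeriodData; equality is by PeriodName only (an assumption
// taken from the question), compared ordinally and case-sensitively.
public class PeriodData : IEquatable<PeriodData>
{
    public int StartYear { get; set; }
    public int EndYear { get; set; }
    public string PeriodName { get; set; }

    // Strongly typed comparison used by HashSet<PeriodData> and Distinct().
    public bool Equals(PeriodData other)
    {
        if (ReferenceEquals(other, null)) return false;
        return string.Equals(PeriodName, other.PeriodName, StringComparison.Ordinal);
    }

    // Keep the object-based overload consistent with the typed one.
    public override bool Equals(object obj)
    {
        return Equals(obj as PeriodData);
    }

    // Objects that compare equal must return the same hash code.
    public override int GetHashCode()
    {
        return PeriodName != null ? PeriodName.GetHashCode() : 0;
    }
}

public static class EquatableDemo
{
    // Returns the distinct period names from a list that contains a duplicate name.
    public static List<string> DistinctNames()
    {
        var periods = new List<PeriodData>
        {
            new PeriodData { PeriodName = "Jurassic", StartYear = 1 },
            new PeriodData { PeriodName = "Jurassic", StartYear = 2 },
            new PeriodData { PeriodName = "Triassic", StartYear = 3 }
        };
        return periods.Distinct().Select(p => p.PeriodName).ToList();
    }
}
```

Because the list is typed as List<PeriodData>, no `as` cast is needed when reading PeriodName, and a HashSet<PeriodData> built from the same items would drop the duplicate in the same way.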

Reed Copsey
  • But actually, while the suggestions you make would improve the code, the problem could be solved simply by overriding Equals and GetHashCode, without changing the type argument, and without implementing `IEquatable<>`. – phoog Apr 16 '13 at 17:38
  • I take the point on improvements, but it is the overrides and how to write them I do not understand. – Peter C Apr 16 '13 at 18:16
  • Thanks @ReedCopsey, I took your points and others and got it working. – Peter C Apr 16 '13 at 19:45
1

You have to decide what makes two periods equal. If all three properties have to be the same for two periods to be equal, then you can implement Equals thus:

public override bool Equals(object obj)
{
    if (ReferenceEquals(null, obj)) return false;
    if (ReferenceEquals(this, obj)) return true;
    if (obj.GetType() != this.GetType()) return false;
    PeriodData other = (PeriodData)obj;
    return m_StartYear == other.m_StartYear && m_EndYear == other.m_EndYear && string.Equals(m_PeriodName, other.m_PeriodName);
}

For GetHashCode, you could do something like this:

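public override int GetHashCode()
{
    unchecked // let the multiplications wrap instead of throwing in a checked context
    {
        // Guard against a null name so this stays consistent with string.Equals in Equals.
        int nameHash = m_PeriodName != null ? m_PeriodName.GetHashCode() : 0;
        return (((m_StartYear * 397) ^ m_EndYear) * 397) ^ nameHash;
    }
}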

(Credit where it is due: these are adapted from the code generated by ReSharper's code generation tool.)

As others have noted, it would be better to implement IEquatable<T> as well.

If you cannot modify the class, or you do not want to modify it, you can put the equality comparison logic in another class that implements IEqualityComparer<PeriodData>, which you can pass to the appropriate constructor of HashSet<PeriodData> and to Enumerable.Distinct().
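As a sketch of that last option, a separate comparer could look something like this. `PeriodNameComparer` and `ComparerDemo` are hypothetical names, PeriodData is trimmed down for brevity, and comparing by PeriodName only (ordinal, case-sensitive) is an assumption based on the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down PeriodData, deliberately left without any equality members.
public class PeriodData
{
    public string PeriodName { get; set; }
}

// Hypothetical comparer: two periods are equal when their names match.
public class PeriodNameComparer : IEqualityComparer<PeriodData>
{
    public bool Equals(PeriodData x, PeriodData y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (ReferenceEquals(x, null) || ReferenceEquals(y, null)) return false;
        return string.Equals(x.PeriodName, y.PeriodName, StringComparison.Ordinal);
    }

    public int GetHashCode(PeriodData obj)
    {
        // Equal names must produce equal hash codes; null is mapped to 0.
        if (ReferenceEquals(obj, null) || obj.PeriodName == null) return 0;
        return obj.PeriodName.GetHashCode();
    }
}

public static class ComparerDemo
{
    // Counts unique periods by passing the comparer to the HashSet constructor.
    public static int HashSetCount(IEnumerable<PeriodData> periods)
    {
        return new HashSet<PeriodData>(periods, new PeriodNameComparer()).Count;
    }

    // Counts unique periods by passing the comparer to Distinct().
    public static int DistinctCount(IEnumerable<PeriodData> periods)
    {
        return periods.Distinct(new PeriodNameComparer()).Count();
    }
}
```

The class itself stays untouched, and different comparers can be swapped in if another notion of equality is needed elsewhere.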

phoog
  • That's great, it worked and does the job. I only needed to check that PeriodName was equal, so I modified it to suit. I have not implemented IEquatable, as that is even harder for me as a relative dummie programmer to understand. I've not found a simple explanation for it, so I'll leave it for now. – Peter C Apr 16 '13 at 19:38
0

You have to implement IEquatable<T> to make Distinct() work.

How would the framework know how to say "those two objects are identical" if you don't? You have to provide the framework a way to compare your objects, that's the purpose of the IEquatable<T> implementation.

ken2k
  • Overriding Equals and GetHashCode would solve the problem. It is a good idea to implement `IEquatable<>`, but it is not necessary. – phoog Apr 16 '13 at 17:39