11

I would like to get distinct objects from a list. I tried to implement IEqualityComparer but wasn't successful. Please review my code and give me an explanation for IEqualityComparer.

public class Message
{
    public int x { get; set; }
    public string y { get; set; }
    public string z { get; set; }
    public string w { get; set; }
}

public class MessageComparer : IEqualityComparer<Message>
{
    public bool Equals(Message x, Message y)
    {
        if (Object.ReferenceEquals(x, y)) return true;

        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;

        if (x.x == y.x && x.y == y.y && x.z == y.z && x.w == y.w)
        {
            return true;
        }

        return false;
    }

    public int GetHashCode(Message number)
    {
        // if (Object.ReferenceEquals(number, null)) return 0;
        int hashX = number.x.GetHashCode();
        int hashY = number.y == null ? 0 : number.y.GetHashCode();
        int hashZ = number.z == null ? 0 : number.z.GetHashCode();
        int hashW = number.w == null ? 0 : number.w.GetHashCode();

        return hashX ^ hashY ^ hashZ ^ hashW;           
    }
}

This is my List with Message objects:

Message m1 = new Message();
m1.x = 1;
m1.y = "A";
m1.z = "B";
m1.w = "C";

Message m2 = new Message();
m2.x = 1;
m2.y = "A";
m2.z = "B";
m2.w = "C";

Message m3 = new Message();
m3.x = 1;
m3.y = "A";
m3.z = "B";
m3.w = "C";

Message m4 = new Message();
m4.x = 2;
m4.y = "A";
m4.z = "B";
m4.w = "C";

Message m5 = new Message();
m5.x = 3;
m5.y = "W";
m5.z = "D";
m5.w = "C";

Message m6 = new Message();
m6.x = 4;
m6.y = "S";
m6.z = "F";
m6.w = "R";

List<Message> collection = new List<Message>();
collection.Add(m1);
collection.Add(m2);
collection.Add(m3);
collection.Add(m4);
collection.Add(m5);

collection.Distinct(new MessageComparer());

When I call the Distinct() method, the number of elements in collection stays the same.

Lauren Rutledge
kat1330
  • 2
    Are you assigning the result of the distinct call to a variable or just checking the original list? If you're not assigning to anything, you'll need to. `Distinct()` returns an `IEnumerable` rather than doing an in place update – eouw0o83hf Jul 24 '14 at 02:20

3 Answers

13

Try this:

var distinct = collection.Distinct(new MessageComparer());

Then use distinct for anything after that.

It looks like you're forgetting the immutable nature of IEnumerable<>. None of the LINQ methods actually change the original variable. Rather, they return IEnumerable<T>s which contain the result of the expression. For example, let's consider a simple List<string> original with the contents { "a", "a", "b", "c" }.

Now, let's call original.Add("d");. That method has no return value (it's void). But if we then print out the contents of original, we will see { "a", "a", "b", "c", "d" }.

On the other hand, let's now call original.Skip(1). This method does have a return value, one of type IEnumerable<string>. It is a LINQ expression, and performs no side-effecting actions on the original collection. Thus, if we call that and look at original, we will see { "a", "a", "b", "c", "d" }. However, the result from the method will be { "a", "b", "c", "d" }. As you can see, the result skips one element.

This is because LINQ methods accept IEnumerable<T> as a parameter. Consequently, they have no concept of the implementation of the original list. You could be passing, via extension method, a ReadOnlyCollection and they would still be able to evaluate through it. They cannot, then, alter the original collection, because the original collection could be written in any number of ways.

All that, but in table form. Each line starts with the original { "a", "a", "b", "c" }:

Context     Example function    Immutable?    Returned value     Collection after calling
Collection  Add("d")            No            (void)             { "a", "a", "b", "c", "d" }
LINQ        Skip(1)             Yes           { "a", "b", "c" }  { "a", "a", "b", "c" }
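
The contrast between a mutating List<T> method and a LINQ method can be checked with a short program. This is only a minimal sketch: the class name LinqImmutabilityDemo and the local variable names are illustrative and do not come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqImmutabilityDemo
{
    public static void Main()
    {
        var original = new List<string> { "a", "a", "b", "c" };

        // List<T>.Add changes the list in place and returns nothing.
        original.Add("d");

        // Skip and Distinct do not touch the list; they return new sequences.
        List<string> skipped = original.Skip(1).ToList();
        List<string> distinct = original.Distinct().ToList();

        Console.WriteLine(string.Join(", ", original)); // a, a, b, c, d
        Console.WriteLine(string.Join(", ", skipped));  // a, b, c, d
        Console.WriteLine(distinct.Count);              // 4
    }
}
```

Calling ToList() here just materializes each result so it can be inspected; the key point is that the return value has to be captured in a variable.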
Mafii
Matthew Haugen
10

IEqualityComparer<T> is an interface used to decide whether two objects are equal. The sample below uses it to find the distinct objects in a collection. The interface declares two methods that a comparer must implement: Equals(T x, T y) and GetHashCode(T obj).

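public abstract class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Address { get; set; }
}

// Employee is not shown in the original answer. This minimal definition
// (with an int Id) is an assumption so that the comparer below compiles.
public class Employee : Person
{
    public int Id { get; set; }
}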

public enum SortType
{
    ByID,
    BySalary
}

public class EmployeeDistinctEquality : IEqualityComparer<Employee>
{
    public EmployeeDistinctEquality()
    {

    }

    public bool Equals(Employee x, Employee y)
    {
        if (x == null && y == null)
            return true;
        else if (x == null || y == null)
            return false;
        else if (x.Id == y.Id)
            return true;
        else
            return false;
    }

    public int GetHashCode(Employee obj)
    {
        return obj.Id.GetHashCode();
    }
}
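
For context, a comparer like this is typically passed to Distinct as shown in the sketch below. The Employee class here is an assumed minimal definition (the answer does not show one), and the sample names and the DistinctById helper are made up purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed minimal Employee; the answer does not show its definition.
public class Employee
{
    public int Id { get; set; }
    public string FirstName { get; set; }
}

public class EmployeeDistinctEquality : IEqualityComparer<Employee>
{
    public bool Equals(Employee x, Employee y)
    {
        if (x == null && y == null) return true;
        if (x == null || y == null) return false;
        return x.Id == y.Id;
    }

    // Objects that compare equal must return the same hash code.
    public int GetHashCode(Employee obj)
    {
        return obj.Id.GetHashCode();
    }
}

public static class EmployeeDemo
{
    // Hypothetical helper: returns a new list and leaves the source untouched.
    public static List<Employee> DistinctById(IEnumerable<Employee> employees)
    {
        return employees.Distinct(new EmployeeDistinctEquality()).ToList();
    }

    public static void Main()
    {
        var employees = new List<Employee>
        {
            new Employee { Id = 1, FirstName = "Ann" },
            new Employee { Id = 1, FirstName = "Ann (duplicate)" },
            new Employee { Id = 2, FirstName = "Bob" },
        };

        List<Employee> unique = DistinctById(employees);

        Console.WriteLine(employees.Count); // 3
        Console.WriteLine(unique.Count);    // 2
    }
}
```

Distinct consults GetHashCode first and only calls Equals for items whose hash codes match, which is why both methods have to agree on what "equal" means.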

Refer to this link for more detailed information:

http://dotnetvisio.blogspot.in/2015/12/usage-of-icomparer-icomparable-and.html

Andre Kampling
Rajesh G
3

You don't need to implement IEqualityComparer:

public class Message
{
    protected bool Equals(Message other)
    {
        // x is an int, so compare it directly instead of boxing it through Equals(object, object).
        return x == other.x && string.Equals(y, other.y) && string.Equals(z, other.z) && string.Equals(w, other.w);
    }

    public override bool Equals(object obj)
    {
        if (ReferenceEquals(null, obj)) return false;
        if (ReferenceEquals(this, obj)) return true;
        if (obj.GetType() != this.GetType()) return false;
        return Equals((Message) obj);
    }

    public override int GetHashCode()
    {
        unchecked //Ignores overflows that can (should) occur
        {
            var hashCode = x;
            hashCode = (hashCode*397) ^ (y != null ? y.GetHashCode() : 0);
            hashCode = (hashCode*397) ^ (z != null ? z.GetHashCode() : 0);
            hashCode = (hashCode*397) ^ (w != null ? w.GetHashCode() : 0);
            return hashCode;
        }
    }

    public int x { get; set; }
    public string y { get; set; }
    public string z { get; set; }
    public string w { get; set; }
}
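
With Equals and GetHashCode overridden on the class itself, the parameterless Distinct() overload picks them up through the default equality comparer. The sketch below is a condensed, self-contained variant of the class above (the Equals body is shortened with an `as` cast, which is my simplification) and reuses the values of m1 to m4 from the question. The MessageDemo class and its Unique helper are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Message
{
    public int x { get; set; }
    public string y { get; set; }
    public string z { get; set; }
    public string w { get; set; }

    // Simplified value equality; the answer's version also checks GetType().
    public override bool Equals(object obj)
    {
        var other = obj as Message;
        if (ReferenceEquals(other, null)) return false;
        return x == other.x && y == other.y && z == other.z && w == other.w;
    }

    public override int GetHashCode()
    {
        unchecked // integer overflow is expected and harmless here
        {
            var hashCode = x;
            hashCode = (hashCode * 397) ^ (y != null ? y.GetHashCode() : 0);
            hashCode = (hashCode * 397) ^ (z != null ? z.GetHashCode() : 0);
            hashCode = (hashCode * 397) ^ (w != null ? w.GetHashCode() : 0);
            return hashCode;
        }
    }
}

public static class MessageDemo
{
    // Hypothetical helper: no comparer argument is needed any more.
    public static List<Message> Unique(IEnumerable<Message> source)
    {
        return source.Distinct().ToList();
    }

    public static void Main()
    {
        var collection = new List<Message>
        {
            new Message { x = 1, y = "A", z = "B", w = "C" },
            new Message { x = 1, y = "A", z = "B", w = "C" },
            new Message { x = 1, y = "A", z = "B", w = "C" },
            new Message { x = 2, y = "A", z = "B", w = "C" },
        };

        Console.WriteLine(Unique(collection).Count); // 2
        Console.WriteLine(collection.Count);         // 4
    }
}
```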
Lauren Rutledge
Selali Adobor
  • 3
    Depends on the case. If this kind of comparison is just needed in a specific Distinct case, it's better to use a separate comparer. Overriding Equals method affects other places, too. – Riikka Heikniemi Nov 03 '17 at 11:13
  • 2
    It affects other places in a positive way, you generally want to implement it for a data class like this – Selali Adobor Nov 04 '17 at 14:03
  • I think that's the best answer to the question – Vic Jun 08 '21 at 07:16