Get distinct List<T> by properties of U of the list T.X?

Question

Given the following code:

class T { 
    public List<U> X { get; set; } /*.....*/ 
}
class U { 
    public int A { get; set; }
    public int B { get; set; } 
    // other properties omitted for easier testing
}

var l = new List<T> {
  new T { X = new List<U> { new U { A = 0, B = 9 }, new U { A = 1, B = 8 } } },
  new T { X = new List<U> { new U { A = 0, B = 9 }, new U { A = 1, B = 8 } } },
  new T { X = new List<U> { new U { A = 2, B = 4 }, new U { A = 3, B = 5 } } },
  new T { X = new List<U> { new U { A = 2, B = 4 }, new U { A = 3, B = 5 } } },
  // ......
};

What's the most concise way to get the distinct List<T> from l? The result is expected to contain two items whose X lists are [{0,9}, {1,8}] and [{2,4}, {3,5}].

Updated code based on comments to Enigmativity's answer:

interface IBase<I> { I A { get; set; } I B { get; set; } }
class T<I> { 
    public List<U<I>> X { get; set; } /*.....*/ 
}
class U<I> : IBase<I> { 
    public I A { get; set; }
    public I B { get; set; } 
    // other properties omitted for easier testing
}

var l = new List<T<int>> {
  new T<int> { X = new List<U<int>> { new U<int> { A=0, B=9 }, new U<int> { A=1, B=8 } } },
  new T<int> { X = new List<U<int>> { new U<int> { A=0, B=9 }, new U<int> { A=1, B=8 } } },
  new T<int> { X = new List<U<int>> { new U<int> { A=2, B=4 }, new U<int> { A=3, B=5 } } },
  new T<int> { X = new List<U<int>> { new U<int> { A=2, B=4 }, new U<int> { A=3, B=5 } } },
  // ......
};

Updated sample data as per comments:

var l = new List<T> {
  new T { X = new List<U> { new U { A = 0, B = 9 }, new U { A = 1, B = 8 } } },
  new T { X = new List<U> { new U { A = 0, B = 9 }, new U { A = 1, B = 8 } } },
  new T { X = new List<U> { new U { A = 2, B = 4 }, new U { A = 3, B = 5 } } },
  new T { X = new List<U> { new U { A = 2, B = 4 }, new U { A = 3, B = 5 } } },
  new T { X = new List<U> {} }
  // ......
};

I don't know if it is the most concise way, but I would just make a custom class that implements `IEqualityComparer<T>` and pass it to a LINQ `.Distinct(IEqualityComparer<T>)` call. – Scott Chamberlain, Feb 22 '16 at 22:54
Do you consider making a special `DistinctBy` extension method too verbose? – Scott Chamberlain, Feb 22 '16 at 23:00
It would be nice if there were a `Distinct` overload accepting a lambda `(x, y) => .....`. – ca9163d9, Feb 22 '16 at 23:02
Possible duplicate of [Distinct() with lambda?](http://stackoverflow.com/questions/1300088/distinct-with-lambda) – David Ferenczy Rogožan, Feb 22 '16 at 23:04
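
The lambda-accepting `Distinct` wished for in the comments above can be approximated with a small adapter that wraps two delegates in an `IEqualityComparer<T>`. The following is only a sketch under stated assumptions: `LambdaComparer<TItem>` is a hypothetical helper (it is not part of the framework), the hash formula is one arbitrary choice, and every `X` is assumed to be non-null. It is written as a script-style program (top-level statements, value tuples), so it needs a reasonably recent C# compiler.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample data from the question.
var l = new List<T> {
    new T { X = new List<U> { new U { A = 0, B = 9 }, new U { A = 1, B = 8 } } },
    new T { X = new List<U> { new U { A = 0, B = 9 }, new U { A = 1, B = 8 } } },
    new T { X = new List<U> { new U { A = 2, B = 4 }, new U { A = 3, B = 5 } } },
    new T { X = new List<U> { new U { A = 2, B = 4 }, new U { A = 3, B = 5 } } },
};

// Two T instances count as equal when their X lists match pairwise on (A, B).
var comparer = new LambdaComparer<T>(
    (t1, t2) => t1.X.Select(u => (u.A, u.B)).SequenceEqual(t2.X.Select(u => (u.A, u.B))),
    t => t.X.Aggregate(17, (h, u) => unchecked((h * 31 + u.A) * 31 + u.B)));

var distinct = l.Distinct(comparer).ToList();
Console.WriteLine(distinct.Count); // 2 for the sample data above

class T { public List<U> X { get; set; } }
class U { public int A { get; set; } public int B { get; set; } }

// Hypothetical adapter: turns an equality lambda and a hash lambda into a comparer.
sealed class LambdaComparer<TItem> : IEqualityComparer<TItem>
{
    private readonly Func<TItem, TItem, bool> _equals;
    private readonly Func<TItem, int> _hash;

    public LambdaComparer(Func<TItem, TItem, bool> equals, Func<TItem, int> hash)
    {
        _equals = equals;
        _hash = hash;
    }

    public bool Equals(TItem a, TItem b) => _equals(a, b);
    public int GetHashCode(TItem item) => _hash(item);
}
```

A hash lambda is still required: `Distinct` groups candidates by hash code before calling `Equals`, so an equality lambda alone is not enough unless every item is forced into the same bucket (for example by always returning 0), which costs performance.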

Enigmativity · Accepted Answer · 2016-02-24T07:22:44.280

For your given code the quickest way is to implement an IEqualityComparer<T> and use that in the standard LINQ .Distinct operator.

public class TEqualityComparer : IEqualityComparer<T>
{
    public bool Equals(T t1, T t2)
    {
        if (t2 == null && t1 == null)
            return true;
        else if (t1 == null || t2 == null)
            return false;
        else
        {
            return
                t1.X.Select(x => x.A).SequenceEqual(t2.X.Select(x => x.A))
                && t1.X.Select(x => x.B).SequenceEqual(t2.X.Select(x => x.B));
        }
    }

    public int GetHashCode(T t)
    {
        return t.X.Select(x => x.A.GetHashCode())
            .Concat(t.X.Select(x => x.B.GetHashCode()))
            .Aggregate((x1, x2) => (x1 * 17 + 13) ^ x2);
    }
}

Then you can do this:

IEnumerable<T> result = l.Distinct(new TEqualityComparer());

For the sample data, that gives you two T instances, whose X lists are [{0,9}, {1,8}] and [{2,4}, {3,5}].

But you want the result as a List<List<U>> so then you'd do this:

List<List<U>> result =
    l.Distinct(new TEqualityComparer())
        .Select(t => t.X.ToList())
        .ToList();

Based on your updated code, this is what you need:

public class TEqualityComparer<V> : IEqualityComparer<T<V>>
{
    public bool Equals(T<V> t1, T<V> t2)
    {
        if (t2 == null && t1 == null)
            return true;
        else if (t1 == null || t2 == null)
            return false;
        else
        {
            return
                t1.X.Select(x => x.A).SequenceEqual(t2.X.Select(x => x.A))
                && t1.X.Select(x => x.B).SequenceEqual(t2.X.Select(x => x.B));
        }
    }

    public int GetHashCode(T<V> t)
    {
        return t.X.Select(x => x.A.GetHashCode())
            .Concat(t.X.Select(x => x.B.GetHashCode()))
            .Aggregate((x1, x2) => (x1 * 17 + 13) ^ x2);
    }
}

You'd call it like:

IEnumerable<T<int>> result = l.Distinct(new TEqualityComparer<int>());

...or:

List<List<U<int>>> result =
    l.Distinct(new TEqualityComparer<int>())
        .Select(t => t.X.ToList())
        .ToList();

With the updated data all you need to do to make this work now is to change GetHashCode to this:

public int GetHashCode(T<V> t)
{
    return t.X.Select(x => x.A.GetHashCode())
        .Concat(t.X.Select(x => x.B.GetHashCode()))
        .DefaultIfEmpty(0)
        .Aggregate((x1, x2) => (x1 * 17 + 13) ^ x2);
}

The data you added was for the old classes. I updated it to this:

var l = new List<T<int>> {
  new T<int> { X = new List<U<int>> { new U<int> { A=0, B=9 }, new U<int> { A=1, B=8 } } },
  new T<int> { X = new List<U<int>> { new U<int> { A=0, B=9 }, new U<int> { A=1, B=8 } } },
  new T<int> { X = new List<U<int>> { new U<int> { A=2, B=4 }, new U<int> { A=3, B=5 } } },
  new T<int> { X = new List<U<int>> { new U<int> { A=2, B=4 }, new U<int> { A=3, B=5 } } },
  new T<int> { X = new List<U<int>> { } },
  // ......
};
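
The `DefaultIfEmpty(0)` change covers empty lists, but a null `X` (also raised in the comments) would still throw in both `Equals` and `GetHashCode`. Below is a minimal sketch of a null-tolerant variant. It makes some assumptions that the original post does not: the name `NullSafeTComparer<V>` is hypothetical, a null `X` is treated as equal only to another null `X` (and so distinct from an empty list), the `T<I>`/`U<I>` types are simplified copies of the question's classes, and the elements inside `X` are assumed to be non-null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var l = new List<T<int>> {
    new T<int> { X = new List<U<int>> { new U<int> { A = 0, B = 9 }, new U<int> { A = 1, B = 8 } } },
    new T<int> { X = new List<U<int>> { new U<int> { A = 0, B = 9 }, new U<int> { A = 1, B = 8 } } },
    new T<int> { X = new List<U<int>>() },
    new T<int> { X = null },
    new T<int> { X = null },
};

var result = l.Distinct(new NullSafeTComparer<int>()).ToList();
Console.WriteLine(result.Count); // 3: the {0,9},{1,8} list, the empty list, the null list

// Simplified versions of the question's generic classes.
class T<I> { public List<U<I>> X { get; set; } }
class U<I> { public I A { get; set; } public I B { get; set; } }

// Hypothetical comparer: same idea as the answer's, but tolerant of a null X.
class NullSafeTComparer<V> : IEqualityComparer<T<V>>
{
    public bool Equals(T<V> t1, T<V> t2)
    {
        if (ReferenceEquals(t1, t2)) return true;
        if (t1 == null || t2 == null) return false;
        // Assumption: a null X only equals another null X.
        if (t1.X == null || t2.X == null) return t1.X == null && t2.X == null;

        var cmp = EqualityComparer<V>.Default;
        return t1.X.Count == t2.X.Count
            && t1.X.Zip(t2.X, (a, b) => cmp.Equals(a.A, b.A) && cmp.Equals(a.B, b.B))
                   .All(equal => equal);
    }

    public int GetHashCode(T<V> t)
    {
        if (t?.X == null) return 0;
        var cmp = EqualityComparer<V>.Default;
        // A seeded Aggregate never throws on an empty sequence.
        return t.X.Aggregate(17, (h, u) =>
            unchecked((h * 31 + cmp.GetHashCode(u.A)) * 31 + cmp.GetHashCode(u.B)));
    }
}
```

If null and empty lists should not appear in the result at all, filtering them out first with something like `l.Where(t => t.X != null && t.X.Any())` before calling `Distinct` is an alternative, as suggested in the comments.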

edited Feb 24 '16 at 07:22

answered Feb 22 '16 at 23:39

Is `GetHashCode` used? If so, how reliable is the calculation? – ca9163d9 Feb 22 '16 at 23:52
Yes, `GetHashCode` is used. It is computed first to determine if the two `T` instances _might_ be equal, and if so then `Equals` is called to confirm. – Enigmativity Feb 22 '16 at 23:58
@dc7a9163d9 - And what do you mean about the reliability? – Enigmativity Feb 22 '16 at 23:58
About the collisions. It looks good just not sure if it's the usual formula to do it. – ca9163d9 Feb 23 '16 at 00:25
@dc7a9163d9 - There is not really such a thing as the "usual" function to do it. Each class is different and creating a good hash depends on the type. In your case because of the structure of `T` you need to break down the object in a very convoluted way. – Enigmativity Feb 23 '16 at 00:30
I tried the code in the VS2015 C# Interactive window. There are two issues: (1) the class should have a type parameter `T` (`class TEqualityComparer<T> : IEqualityComparer<T>`); (2) the compiler doesn't know that `t1` and `t2` have a property `X`. – ca9163d9 Feb 23 '16 at 05:54
@dc7a9163d9 - No, it doesn't have a type parameter. Your type is called `T` so `public class TEqualityComparer : IEqualityComparer<T>` works perfectly fine. I tested my code before posting it. – Enigmativity Feb 23 '16 at 06:03
Sorry my bad. I confused it with my working code which uses `T` as a type parameter. I was thinking a generic comparer. I will update the question. – ca9163d9 Feb 23 '16 at 06:07
@dc7a9163d9 - Please don't update questions like that. If you are substantively changing your question you should either append it to the question or start a new question. Do not invalidate existing answers. – Enigmativity Feb 23 '16 at 06:51
The code has an issue: it won't work if there is a `T.X` with zero elements, or if it's null. – ca9163d9 Feb 23 '16 at 17:51
@dc7a9163d9 - Your data didn't have those conditions. You need to make sure that you provide a good sample of data that you need for your real world work. Do you need me to fix it for you? If so, can you post the data that doesn't work to the end of your question? – Enigmativity Feb 23 '16 at 21:03
Thanks. I just need to filter these elements out before calling the Distinct method. I'm curious: the result has a summary row for each returned list. Why does that happen? Is it a LINQPad thing? – ca9163d9 Feb 23 '16 at 21:22
@dc7a9163d9 - Yes, that's LINQPad's visualization of the data. – Enigmativity Feb 23 '16 at 23:12
@dc7a9163d9 Are you going to update your question with the sample data I asked for? – Enigmativity Feb 24 '16 at 04:39
Yes, the question has been updated (the first case was added: `new T { X = new List<U> {} }`). – ca9163d9 Feb 24 '16 at 04:54
@dc7a9163d9 - Please stop editing your question so that it invalidates existing answers. You need to append new information. – Enigmativity Feb 24 '16 at 07:17
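
The hash-first behaviour described in the comments above (`GetHashCode` narrows the candidates, `Equals` confirms) can be observed with a comparer that counts its own calls. This is an illustrative sketch only: `CountingTComparer` and its counters are additions for demonstration, the comparison logic is copied from the answer, and the exact call counts depend on the LINQ implementation, so none are claimed here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var l = new List<T> {
    new T { X = new List<U> { new U { A = 0, B = 9 }, new U { A = 1, B = 8 } } },
    new T { X = new List<U> { new U { A = 0, B = 9 }, new U { A = 1, B = 8 } } },
    new T { X = new List<U> { new U { A = 2, B = 4 }, new U { A = 3, B = 5 } } },
    new T { X = new List<U> { new U { A = 2, B = 4 }, new U { A = 3, B = 5 } } },
};

var counting = new CountingTComparer();
var distinct = l.Distinct(counting).ToList();

Console.WriteLine($"distinct items:    {distinct.Count}");
Console.WriteLine($"GetHashCode calls: {counting.HashCalls}");
Console.WriteLine($"Equals calls:      {counting.EqualsCalls}");

class T { public List<U> X { get; set; } }
class U { public int A { get; set; } public int B { get; set; } }

// The answer's comparer plus two call counters (the counters are hypothetical additions).
class CountingTComparer : IEqualityComparer<T>
{
    public int HashCalls { get; private set; }
    public int EqualsCalls { get; private set; }

    public bool Equals(T t1, T t2)
    {
        EqualsCalls++;
        if (t1 == null && t2 == null) return true;
        if (t1 == null || t2 == null) return false;
        return t1.X.Select(x => x.A).SequenceEqual(t2.X.Select(x => x.A))
            && t1.X.Select(x => x.B).SequenceEqual(t2.X.Select(x => x.B));
    }

    public int GetHashCode(T t)
    {
        HashCalls++;
        return t.X.Select(x => x.A.GetHashCode())
            .Concat(t.X.Select(x => x.B.GetHashCode()))
            .DefaultIfEmpty(0)
            .Aggregate((x1, x2) => (x1 * 17 + 13) ^ x2);
    }
}
```

Every element is hashed at least once, while `Equals` only runs for elements whose hash code matches one already seen. That is why a hash with few collisions keeps `Distinct` fast, and why two items that should be equal must always produce the same hash code.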
