I have a collection of objects (which map to rows in a database via LINQ to SQL). I want to de-duplicate these objects based on some of their properties. Can I do this with a LINQ to SQL query?

For instance, if I have a collection of Students with the properties name, date of birth, SSN and field of study, how do I select distinct students from that list based on name, date of birth and SSN (but not field of study)? Is there an elegant way to do this with LINQ? If not, is there another elegant method?

bernie2436
See http://stackoverflow.com/questions/2537823/distinct-by-property-of-class-by-linq. If you group on multiple columns, you'll be able to get what you need. Bear in mind, though, that this is likely to be handled in memory rather than on the database. – Yuriy Faktorovich Jun 10 '13 at 16:42

2 Answers

You can use Distinct with a custom IEqualityComparer. Note that a custom comparer runs in memory (LINQ to Objects); it is not translated to SQL. For example, here's one I'm quite fond of:

using System;
using System.Collections.Generic;

public class PropertyEqualityComparer<TObject, TProperty>
    : IEqualityComparer<TObject>
{
    private readonly Func<TObject, TProperty> _selector;
    private readonly IEqualityComparer<TProperty> _internalComparer;

    public PropertyEqualityComparer(Func<TObject, TProperty> propertySelector,
        IEqualityComparer<TProperty> innerEqualityComparer = null)
    {
        _selector = propertySelector;
        // Fall back to the default comparer so Equals and GetHashCode always agree.
        _internalComparer = innerEqualityComparer ?? EqualityComparer<TProperty>.Default;
    }

    public int GetHashCode(TObject obj)
    {
        // Hash through the same comparer that Equals uses; a null property value hashes to 0.
        TProperty value = _selector(obj);
        return value == null ? 0 : _internalComparer.GetHashCode(value);
    }

    public bool Equals(TObject x, TObject y)
    {
        return _internalComparer.Equals(_selector(x), _selector(y));
    }
}
// A helper class that lets the compiler infer the type arguments,
// which is what makes it usable with anonymous types.
public static class PropertyEqualityComparer
{
    public static PropertyEqualityComparer<TObject, TProperty>
        GetNew<TObject, TProperty>(Func<TObject, TProperty> propertySelector)
    { 
        return new PropertyEqualityComparer<TObject, TProperty>
            (propertySelector);
    }
    public static PropertyEqualityComparer<TObject, TProperty>
        GetNew<TObject, TProperty>
        (Func<TObject, TProperty> propertySelector, 
        IEqualityComparer<TProperty> comparer)
    { 
        return new PropertyEqualityComparer<TObject, TProperty>
            (propertySelector, comparer);
    }
}

Here's how you would use it with your example:

var result = students.Distinct(
    PropertyEqualityComparer.GetNew(s => new { s.Name, s.DOB, s.SSN }));
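
The anonymous-type key works here because C# anonymous types compare by the values of their properties rather than by reference. A minimal sketch illustrating that behaviour (the sample values and the class name AnonymousKeyDemo are made up for illustration):

```csharp
using System;

public static class AnonymousKeyDemo
{
    public static void Main()
    {
        // Two separate instances with identical property values (illustrative data).
        var a = new { Name = "Ann", DOB = new DateTime(1990, 1, 1), SSN = "111" };
        var b = new { Name = "Ann", DOB = new DateTime(1990, 1, 1), SSN = "111" };

        // Anonymous types override Equals and GetHashCode using their property values.
        Console.WriteLine(a.Equals(b));                         // True
        Console.WriteLine(a.GetHashCode() == b.GetHashCode());  // True

        // They are still distinct objects.
        Console.WriteLine(ReferenceEquals(a, b));               // False
    }
}
```

This is why the default comparer for the selected key is enough, and no custom inner comparer has to be passed.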
It'sNotALie.

You can group by an anonymous object that contains all the fields you want to de-duplicate on:

from s in students
group s by new { s.Name, s.DateOfBirth, s.SSN } into g
select g

To get one student per group, select the first item from each group:

...
select g.First()
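
Put together, the group-and-take-first approach might look like the sketch below. It runs against an in-memory list; the Student class, its property types and the sample data are assumptions made for illustration, not taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Student type; property names follow the query above.
public class Student
{
    public string Name { get; set; }
    public DateTime DateOfBirth { get; set; }
    public string SSN { get; set; }
    public string FieldOfStudy { get; set; }
}

public static class DistinctStudentsDemo
{
    public static void Main()
    {
        // Illustrative data: the first two entries differ only in FieldOfStudy.
        var students = new List<Student>
        {
            new Student { Name = "Ann", DateOfBirth = new DateTime(1990, 1, 1), SSN = "111", FieldOfStudy = "Math" },
            new Student { Name = "Ann", DateOfBirth = new DateTime(1990, 1, 1), SSN = "111", FieldOfStudy = "Physics" },
            new Student { Name = "Bob", DateOfBirth = new DateTime(1991, 2, 2), SSN = "222", FieldOfStudy = "History" },
        };

        // Group on the identifying properties and keep the first student of each group.
        var distinct =
            (from s in students
             group s by new { s.Name, s.DateOfBirth, s.SSN } into g
             select g.First()).ToList();

        foreach (var s in distinct)
            Console.WriteLine($"{s.Name} {s.SSN} {s.FieldOfStudy}");
        // Ann 111 Math
        // Bob 222 History
    }
}
```

With LINQ to Objects, groups come out in the order their keys first appear, so the student kept for each key is the first one in the source list.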

Or use some other logic, like aggregation:

...
select new {
   g.Key.Name,
   g.Key.DateOfBirth,
   g.Key.SSN,
   Fields = g.Select(x => x.FieldOfStudy).ToList()
}
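
As a rough end-to-end sketch of the aggregation variant, the example below merges duplicate students and collects their fields of study. It uses an in-memory array of anonymous objects as stand-in data; the values and the class name MergeStudentsDemo are invented for illustration.

```csharp
using System;
using System.Linq;

public static class MergeStudentsDemo
{
    public static void Main()
    {
        // Illustrative stand-in data; the first two entries describe the same person.
        var students = new[]
        {
            new { Name = "Ann", DateOfBirth = new DateTime(1990, 1, 1), SSN = "111", FieldOfStudy = "Math" },
            new { Name = "Ann", DateOfBirth = new DateTime(1990, 1, 1), SSN = "111", FieldOfStudy = "Physics" },
            new { Name = "Bob", DateOfBirth = new DateTime(1991, 2, 2), SSN = "222", FieldOfStudy = "History" },
        };

        // One result per distinct (Name, DateOfBirth, SSN), with every field of study kept.
        var merged =
            (from s in students
             group s by new { s.Name, s.DateOfBirth, s.SSN } into g
             select new
             {
                 g.Key.Name,
                 g.Key.DateOfBirth,
                 g.Key.SSN,
                 Fields = g.Select(x => x.FieldOfStudy).ToList()
             }).ToList();

        foreach (var m in merged)
            Console.WriteLine($"{m.Name}: {string.Join(", ", m.Fields)}");
        // Ann: Math, Physics
        // Bob: History
    }
}
```

Unlike taking the first item of each group, this keeps the information from every duplicate row.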
Sergey Berezovskiy