3

I have the following datatable - Let's say called MyTable:

Col1:    Col2:    Col3:    Col4:
1        abc      def      <null>
2        abc      def      ghi
1        abc      def      <null>
3        abc      def      <null>
1        abc      def      <null>

And I'm trying to get the distinct rows:

Col1:    Col2:    Col3:    Col4:
1        abc      def      <null>
2        abc      def      ghi
3        abc      def      <null>

I tried the following LINQ statement:

MyTable = (From dr As DataRow In MyTable Select dr).Distinct.CopyToDataTable

But it's returning the original datatable with the repeated rows back to me.

What am I doing wrong AND how can I get my desired output??

Soner Gönül
  • 97,193
  • 102
  • 206
  • 364
John Bustos
  • 19,036
  • 17
  • 89
  • 151
  • check this - http://stackoverflow.com/questions/3242892/select-distinct-rows-from-datatable-in-linq – rs. Jan 04 '13 at 21:38

4 Answers4

6

Distinct relies on the objects all implementing IEquatable and/or having sensible implementations of GetHashCode and Equals. The DataRow class...doesn't. It doesn't check that the value of each column is equal in the Equals method, it just uses the default implementation, which is to say that it checks that the references are equal, and that's not what you want.

You can provide your own IEqualityComparer, since you can't change the methods in DataRow.

The following should work, if you provide an instance of it to to Distinct:

public class DataRowComparer : IEqualityComparer<DataRow>
{
    public bool Equals(DataRow x, DataRow y)
    {
        for (int i = 0; i < x.Table.Columns.Count; i++)
        {
            if (!object.Equals(x[i], y[i]))
                return false;
        }
        return true;
    }

    public int GetHashCode(DataRow obj)
    {
        unchecked
        {
            int output = 23;
            for (int i = 0; i < obj.Table.Columns.Count; i++)
            {
                output += 19 * obj[i].GetHashCode();
            }
            return output;
        }
    }
}
Servy
  • 202,030
  • 26
  • 332
  • 449
  • Thanks, @Servy - Makes sense... How could I get my desired result? I'm guessing i'm going to have to look at `ItemArray`, but can't figure out how to do it... – John Bustos Jan 04 '13 at 21:38
  • Adding to @Servy Answer above, you can use Distinct(DataRowComparer.Default) to get unique rows – Ninja Aug 21 '23 at 02:37
2

Try like this;

var distinctRows = (from DataRow dRow in MyTable.Rows
                    select new {col1=dRow["Col1"],col2=dRow["Col2"], col3=dRow["Col3"], col4=dRow["Col4"]}).Distinct();

foreach (var row in distinctRows) 
{
 //
}
Soner Gönül
  • 97,193
  • 102
  • 206
  • 364
  • Thanks @Soner - I found this solution online in a few places, but my example really trivialized the problem - My real datatable has about 20 columns... Is there a way to do it without having to mention each column explicitly?? – John Bustos Jan 04 '13 at 21:41
  • @JohnBustos Hmm.. I'm not sure but check http://stackoverflow.com/questions/1681715/how-can-i-access-the-loop-index-inside-a-linq-select and http://stackoverflow.com/questions/1044236/nested-foreach-vs-lambda-linq-query-performancelinq-to-objects – Soner Gönül Jan 04 '13 at 21:44
  • @JohnBustos Yes, there is a way to do it without mentioning each option. See my answer. It should work for any datatable. – Servy Jan 04 '13 at 21:46
1

Although @Servy's question truly was the correct way to do this - Thank you so much Servy! I also found another solution that at least seemed cleaner:

    MyTable = MyTable.DefaultView.ToTable(True)

This allows you to ask for only distinct records, but negates by original request of doing it via LINQ - Figured I'd add it in for anyone else looking at this question in the future.

John Bustos
  • 19,036
  • 17
  • 89
  • 151
0

The problem is that every DataRow in your DataTable is a different object. Each is an instance.

Soner Gönül
  • 97,193
  • 102
  • 206
  • 364
Arturo Martinez
  • 3,737
  • 1
  • 22
  • 35