2

I am using DataRelation to compare two DataTable and return a third table containing the rows that are not share by both original tables by using this tutorial. Below is the code.

public static DataTable Difference(DataTable First, DataTable Second)
{
      //Create Empty Table

      DataTable table = new DataTable("Difference");

      //Must use a Dataset to make use of a DataRelation object

      using(DataSet ds = new DataSet())   
      {
            //Add tables
            ds.Tables.AddRange(new DataTable[]{First.Copy(),Second.Copy()});
            //Get Columns for DataRelation

            DataColumn[] firstcolumns  = new DataColumn[ds.Tables[0].Columns.Count];   
            for(int i = 0; i < firstcolumns.Length; i++)   
            {    
                  firstcolumns[i] = ds.Tables[0].Columns[i];    
            }

            DataColumn[] secondcolumns = new DataColumn[ds.Tables[1].Columns.Count];

            for(int i = 0; i < secondcolumns.Length; i++)    
            {
                  secondcolumns[i] = ds.Tables[1].Columns[i];    
            }    
            //Create DataRelation    
            DataRelation r = new DataRelation(string.Empty,firstcolumns,secondcolumns,false);

            ds.Relations.Add(r);

            //Create columns for return table   
            for(int i = 0; i < First.Columns.Count; i++)    
            {    
                  table.Columns.Add(First.Columns[i].ColumnName, First.Columns[i].DataType);    
            }

            //If First Row not in Second, Add to return table.    
            table.BeginLoadData();

            foreach(DataRow parentrow in ds.Tables[0].Rows)    
            {    
                  DataRow[] childrows = parentrow.GetChildRows(r);

                  if(childrows == null || childrows.Length == 0)    
                        table.LoadDataRow(parentrow.ItemArray,true);        
            }

            table.EndLoadData();    
      }

      return table;    
}

Surprisingly enough when I exercised this code, it only worked on few cases and failed in others.

There was one case, I was comparing two identical table (but has different source) that both has:

  • Same Column Name & DataType
  • Same Number of Rows & Columns
  • Same Value in Each Cells

Only to have a brand new table returned that actually has the exact same composition as the two original tables!

What did I possibly miss?

Is it possible that two identical tables (sharing features I mentioned above) can have other different properties (which doesn't seem visible to users' eyes)?

Or is it possible that this is actually a bad method? What are the possible alternatives?

EDITED

  1. Both tables have same primitive datatype such as: System.String, System.Int32, System.DateTime
  2. These code doesn't work on all samples I tested
  3. Here is a print screen of 1 sample (using DataSet Visualizer)

enter image description here

Cœur
  • 37,241
  • 25
  • 195
  • 267
rofans91
  • 2,970
  • 11
  • 45
  • 60
  • 1
    What are the datatypes of these columns? Is it just primitives? Perhaps they contain some objects which are not equal by reference? `(...) but has different source (...)` could be the key here. You should be aware that after something like (for example) `var bitmapA = Bitmap.FromFile(@"C:\picture.jpg"); var bitmapB = Bitmap.FromFile(@"C:\picture.jpg");` - `bitmapA == bitmapB` is `false` – Konrad Morawski Mar 29 '12 at 09:38
  • In theory the method should work. In the case where an expected match didn't work, were the columns in the same order? Can you provide a small example that doesn't work? – kaj Mar 29 '12 at 10:31
  • Tomorrow I shall, after I got back at the office =) – rofans91 Mar 29 '12 at 13:43
  • Question edited. Pls kindly check. – rofans91 Mar 30 '12 at 02:09
  • 1
    I created identical DataTables in memory with primitive data types and sample data and used your method, and the result was an empty diff table. There must be something specific about your columns or data. I would try limiting the columns that you are selecting into your tables and see if you can isolate a particular column which is causing the problem. – David Nelson Apr 01 '12 at 03:04
  • Nice suggestion.. Will give it a try.. – rofans91 Apr 02 '12 at 02:28
  • Not sure of the problem here, but you can find two (IMO better) approaches to compare two tables http://stackoverflow.com/questions/164144/c-how-to-compare-two-datatables-a-b-how-to-show-rows-which-are-in-b-but-not and http://stackoverflow.com/questions/236555/compare-datatables – Om Deshmane Apr 02 '12 at 02:53
  • I use the method `DataTablesAreSame` method from http://stackoverflow.com/questions/236555/compare-datatables and it returns true! However weirdly the method I post in the question still return a third table which is intact with the two tables being compared! – rofans91 Apr 02 '12 at 03:00
  • Do all the column names use identical casing? None of them are called ShortUrl vs. ShortURL or shortURL, correct? – Joshua Drake Apr 04 '12 at 18:34

1 Answers1

1

I had to write something similar once and this is the approach I used:

First this approach only works if you don't have duplicate rows in each table.

Using primary keys..

First.PrimaryKey = firstcolumns;
Second.PrimaryKey = secondcolumns; //These throw exceptions when you have duplicate rows

Then..

foreach (DataRow dr in Second.Rows)
{
    List<Object> l = new List<Object>();

    foreach (DataColumn dc in secondcolumns) l.Add(dr[dc]);

    if (First.Rows.Find(l.ToArray()) == null) //NOT FOUND
    {
        table.Rows.Add(l.ToArray());
    }
}

foreach (DataRow dr in First.Rows)
{
    List<Object> l = new List<Object>();

    foreach (DataColumn dc in firstcolumns) l.Add(dr[dc]);

    if (Second.Rows.Find(l.ToArray()) == null) //NOT FOUND
    {
        table.Rows.Add(l.ToArray());
    }
}

Cheers,

banging
  • 2,540
  • 1
  • 22
  • 26