3

I have a DataTable dt_Candidates

      Candidate      |   First Name   |   Last Name   
 --------------------|----------------|--------------- 
  John, Kennedy      | John           | Kennedy       
  Richard, Nixon     | Richard        | Nixon         
  Eleanor, Roosevelt | Eleanor        | Roosevelt     
  Jack, Black        | Jack           | Black         
  Richard, Nixon     | Richard        | Nixon         

Without nested loops, and preferably using LINQ, I want to create a DataTable called dt_Candidates2 containing ONLY the candidates that appear exactly once, like this:

      Candidate      |   First Name   |   Last Name   
 --------------------|----------------|--------------- 
  John, Kennedy      | John           | Kennedy       
  Eleanor, Roosevelt | Eleanor        | Roosevelt     
  Jack, Black        | Jack           | Black         

I also want a list or an array called RejectedCandidates containing each duplicated candidate once:

RejectedCandidates = {"Richard, Nixon"}
ASh
  • Do you want to enforce uniqueness based on the `Candidate` column? – Peter Csala Nov 25 '20 at 10:13
  • I don't think LINQ is a good tool for this; add a primary key to the table and adjust the logic that creates the table to use it (or catch the error thrown when adding a repeated value) – Caius Jard Nov 25 '20 at 10:21
  • @PeterCsala Yes, I do! As far as the whole logic goes, I just changed my mind another time... Another workaround could be creating the RejectedCandidates and when facing the ForEach loop to work on the single value checking `RejectedCandidates.Any(row("Candidate").ToString.Contains)` so it won't work `RejectedCandidates` and it can provide a viable feedback... – Fabio Craig Wimmer Florey Nov 25 '20 at 10:40
  • @CaiusJard Thank you for your useful tips! I'm still very new to C# and I found LINQ to be very handy; I think my logic was skewed by the mere-exposure effect! :) – Fabio Craig Wimmer Florey Nov 25 '20 at 10:44
  • One of my favorite sayings is "LINQ is a hammer.. but not every problem is a nail" ;) – Caius Jard Nov 25 '20 at 10:47

2 Answers

1

As noted, I don't think it really needs LINQ here. It can go something like this:

DataTable dt = new DataTable();
dt.Columns.Add("Candidate");
dt.Columns.Add("First");
dt.Columns.Add("Last");
dt.PrimaryKey = new []{ dt.Columns["Candidate"] }; //means that dt.Rows.Find() will work

while(...){
  string candidate = ...

  if(dt.Rows.Find(candidate) != null)
    RejectList.Add(...);
  else
    dt.Rows.Add(...);
}

Avoid using LINQ's .Any on a DataTable for this. Not only is it a pain to get going, because it needs casting steps or extension libraries (see here), it will also loop over the rows to find the info you seek; the built-in PrimaryKey mechanism uses an internal index for much faster lookups.
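For illustration, here is a minimal runnable sketch of the primary-key approach. The `source` array is a hypothetical stand-in for whatever actually fills the table, and `rejected` is my own name for the reject list. Note that the loop as written above keeps the first occurrence of a duplicate; the extra pass at the end is an assumption on my part, added so the result matches the dt_Candidates2 table in the question, where duplicated candidates are dropped entirely.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical source data; replace with whatever fills the table in your code.
var source = new[]
{
    ("John, Kennedy", "John", "Kennedy"),
    ("Richard, Nixon", "Richard", "Nixon"),
    ("Eleanor, Roosevelt", "Eleanor", "Roosevelt"),
    ("Jack, Black", "Jack", "Black"),
    ("Richard, Nixon", "Richard", "Nixon"),
};

var dt = new DataTable();
dt.Columns.Add("Candidate");
dt.Columns.Add("First Name");
dt.Columns.Add("Last Name");
// The primary key is what makes dt.Rows.Find() available.
dt.PrimaryKey = new[] { dt.Columns["Candidate"] };

// A set, so each duplicated candidate is recorded only once.
var rejected = new HashSet<string>();

foreach (var (candidate, first, last) in source)
{
    if (dt.Rows.Find(candidate) != null)
        rejected.Add(candidate);          // key already present: reject it
    else
        dt.Rows.Add(candidate, first, last);
}

// Assumption: the question wants duplicated candidates removed entirely,
// so also remove the first occurrence that was added before the clash.
foreach (var candidate in rejected)
    dt.Rows.Remove(dt.Rows.Find(candidate));

Console.WriteLine(dt.Rows.Count);               // 3
Console.WriteLine(string.Join("; ", rejected)); // Richard, Nixon
```

If keeping the first occurrence is acceptable, the final `foreach` can simply be dropped.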

Caius Jard
  • adding PrimaryKey to a table with duplicates throws `System.ArgumentException: These columns don't currently have unique values` or `System.Data.ConstraintException: Column 'Candidate' is constrained to be unique. Value 'Richard, Nixon' is already present` depending on when rows are added (before or after PK). am I missing smth in your solution? – ASh Nov 30 '20 at 10:54
  • You're supposed to add the PK to a table that doesn't have duplicates. The code in the answer makes a new table, adds columns, adds a key and then fills the table. As it is filling it is checking `if` the value is present and if it is, it is putting the value into the reject list instead. In other words; whatever code you have that fills a table with duplicates, replace it with this notion – Caius Jard Nov 30 '20 at 11:39
0
var dt = new DataTable
{
    Columns = {"Candidate", "First Name", "Last Name"},
    Rows = 
    {
        new object [] { "John, Kennedy", "John", "Kennedy"},
        new object [] { "Richard, Nixon", "Richard", "Nixon"},
        new object [] { "Eleanor, Roosevelt", "Eleanor", "Roosevelt"},
        new object [] { "Jack, Black", "Jack", "Black"},
        new object [] { "Richard, Nixon", "Richard", "Nixon"},
    }
};

You can use grouping (GroupBy) to find the duplicates, filter them out, and then create a new DataTable from the remaining rows with the DataTableExtensions.CopyToDataTable extension method:

var dt2 = dt.AsEnumerable()
            .GroupBy(r => r["Candidate"])
            .Where(g => g.Count() == 1)
            .Select(g => g.First())
            .CopyToDataTable();
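The query above produces only the table of unique rows. The RejectedCandidates list from the question can probably be derived from the same grouping. The following is a sketch rather than a definitive implementation: the names `groups`, `uniqueRows` and `rejectedCandidates` are my own, and the fallback to `dt.Clone()` is an added guard, since `CopyToDataTable` throws an `InvalidOperationException` when the sequence contains no rows.

```csharp
using System;
using System.Data;
using System.Linq;

var dt = new DataTable
{
    Columns = { "Candidate", "First Name", "Last Name" },
    Rows =
    {
        new object[] { "John, Kennedy", "John", "Kennedy" },
        new object[] { "Richard, Nixon", "Richard", "Nixon" },
        new object[] { "Eleanor, Roosevelt", "Eleanor", "Roosevelt" },
        new object[] { "Jack, Black", "Jack", "Black" },
        new object[] { "Richard, Nixon", "Richard", "Nixon" },
    }
};

// Group once and materialize, so both results reuse the same grouping.
var groups = dt.AsEnumerable()
               .GroupBy(r => r.Field<string>("Candidate"))
               .ToList();

// Keys that occur more than once, each listed a single time.
string[] rejectedCandidates = groups.Where(g => g.Count() > 1)
                                    .Select(g => g.Key)
                                    .ToArray();

// Rows whose key occurs exactly once.
var uniqueRows = groups.Where(g => g.Count() == 1)
                       .Select(g => g.Single())
                       .ToList();

// Guard: CopyToDataTable throws on an empty sequence, so fall back
// to an empty table with the same schema.
DataTable dt2 = uniqueRows.Count > 0 ? uniqueRows.CopyToDataTable() : dt.Clone();

Console.WriteLine(dt2.Rows.Count);                        // 3
Console.WriteLine(string.Join("; ", rejectedCandidates)); // Richard, Nixon
```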
ASh