2

I have an Employee class that has the following:

public class Employee
{
  //EmployeeNumber cannot be the same as the Id
  public int EmployeeNumber {get; set; }
  public string EmployeeName {get; set }
}

Ultimately I am going to be updating the database with new employees. I have a list of the new employees, and I have a list of the current employees that exist in the database. Employee name can be the same, but EmployeeNumber has to be unique. I want to ultimately have a list of duplicate employees that has been created from comparing the list I will be adding to the database, with the list of employees that represents what is inside the database.

What is the best way to get a list of the duplicate employees using LINQ?

rocat
  • 109
  • 9
  • 2
    possible duplicate of [C# LINQ find duplicates in List](http://stackoverflow.com/questions/18547354/c-sharp-linq-find-duplicates-in-list) – bkribbs Aug 27 '15 at 17:04
  • 1
    declare the EmployeeNumber as the table key, then there is not need to check for duplicates – laskdjf Aug 27 '15 at 17:06
  • 1
    @bkribbs I think he's asking how can he confirm that the list of "new" employees has no records in the existing database that are already saved. – Michael Blackburn Aug 27 '15 at 17:08
  • @FahadJameel, you are correct, but please submit that comment as an "answer," not a comment. Bonus points if you provide the code to implement what you suggest. – Michael Blackburn Aug 27 '15 at 17:08
  • 1
    @MichaelBlackburn I did not show it here, but Employee has a unique key as well, but for the purpose of the application the key cannot be the same as the employee number. The key is constant, but the employee number may be changed by the user - I will add this to the code snippet – rocat Aug 27 '15 at 17:12

4 Answers4

2

The correct way to do it would be to declare the EmployeeNumber as the table key, then there is not need to check for duplicates.

public class Employee
{
  [Key]
  public int EmployeeNumber {get; set; }
  public string EmployeeName {get; set }
}

Also in your database you would declare the EmployeeNumber as the primary key. Assuming you are using SQL Server, you can add Identity(1,1) to make it auto-increment.

Here is a sample of how your table definition might look:

CREATE TABLE Persons
(
  EmployeeNumber int IDENTITY(1,1) PRIMARY KEY,
  EmployeeName varchar(255) NOT NULL,
)
laskdjf
  • 1,166
  • 1
  • 11
  • 28
1

I'm not sure if it's the most efficient (That award goes to @Fahad's comment).

Assuming you really mean "How do I get a list of records that appear in two collections," I like to use the Join or GroupJoin methods, as you can select a new collection, or an anonymous type that contains the record from both collections.

The Syntax is Join (this collection1, collection2, FuncCollection1Key, Funccollection2Key, FuncOutputSelection).

So if your "new" collection is IEnumerable NewEmployees and your existing collection is IEnumerable<Employee> DbEmployees your collection of duplicate employees is derived by:

var DupeEmployees = NewEmployees.Join(DbEmployees, n=>n.EmployeeNumber, d=>d.EmployeeNumber, (nEmp,dbEmp)=>nEmp);

The two "middle" lambda expressions must be functions that result in the same type of value (that implements IEquatable), but there's no other restriction. You have have two collections of different types, and you can output anything you like.

Now, the best way to do this is Farhad's suggestion, using Join in this case is a bit like shooting a bee with an Elephant gun, but understanding Join will return you many benefits down the road.

Michael Blackburn
  • 3,161
  • 1
  • 25
  • 18
  • This is what I would like to do, I basically need the list of EmployeeNames that appear in what a user is trying to upload, and what the database already has. If the list is empty, then I am good to go, otherwise report to the user which employees are already existing. Sorry if I did not make this clear – rocat Aug 27 '15 at 17:21
  • If you just need the EmployeeName, make the selection lambda `(nEmp,dbEmp)=>nEmp.EmployeeName` If you need to see both objects (they may need to be merged, for example) you can create a new anonymous type: `(nEmp,dbEmp)=>new {IncomingEmployee = nEmp, ExistingEmployee=dbEmp}` – Michael Blackburn Aug 27 '15 at 17:29
1

You can implement IEqualityComparer and use the LinQ method "Except"

public class MyComparer : IEqualityComparer<Employee>
{
    public bool Equals(Employee x, Employee y)
    {
        return x.EmployeeNumber.Equals(y.EmployeeNumber);
    }

    public int GetHashCode(Employee x)
    {
        return x.EmployeeNumber.GetHashCode()
    }
}
  • This is a very good technique when the comparison is between the same type of object, and the the comparison is complicated, or frequently performed. – Michael Blackburn Aug 27 '15 at 17:25
0

You can just check whether EmployeeNumber of current employee is available in the list of new employees.

 List<Employee> currentEmployees = ...
 List<Employee> newEmployees = ...

 List<Employee> duplicateEmployees = currentEmployees.Where(currentEmployee => (newEmployees.Select(f => f.EmployeeNumber)).Contains(currentEmployee.EmployeeNumber)).ToList();
Bhushan Firake
  • 9,338
  • 5
  • 44
  • 79