0

Hi I have a table that looks like below:

First Name | Last Name | Employee number
James        Smith       1234
James Tim    Smith       1234
Jim          Baker       1234
Jen          King        5678
Jenny        King        5678

I want to return the employees where the same employee number is assigned to different people based on first name and last name combinations. I basically want to return the records that map to employee number 1234, as Jim Barker and James Smith are definetely not the same people however we can assume that Jen King and Jenny King are the same people.

Thanks

Ilyes
  • 14,640
  • 4
  • 29
  • 55
elmify
  • 105
  • 1
  • 5
  • 16
  • edit the question with desired result. – Yogesh Sharma Oct 26 '17 at 14:09
  • `Fuzzy string comparison` might help. Google and check https://stackoverflow.com/questions/921978/fuzzy-matching-using-t-sql – Prabhat G Oct 26 '17 at 14:10
  • 2
    Seems like a schema problem. You should have a table that stores the names and the employee number, where the employee number is a primary key that _cannot be duplicated at all_. Everywhere else should only use the employee number, and not duplicate storing the names. – Joel Coehoorn Oct 26 '17 at 14:12
  • "we can assume that Jen King and Jenny King are the same people". And so would "Anna Smith" and "Annabel Smith"? And "Anna Smith" and "Annamaria Smith"? So "Annabel Smith" would be "Annamaria Smith"? Obviously you need more than just character comparision. Some table containing name matches maybe or better doing comparisions manually for all employee numbers with more then one distinct name. – Thorsten Kettner Oct 26 '17 at 14:40
  • you should group the result by employe number and filter the results twice by grouping again by last name or first name – Robert P Oct 26 '17 at 16:08

1 Answers1

0

I would use window functions:

select e.*
from (select e.*, count(*) over (partition by employeenumber) as cnt
      from employees e
     ) e
where cnt > 1
order by employeenumber;

You can then evaluate whether the employee number duplicates make sense or not.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786