I have a situation in SSIS where I want to find duplicate values (by ID) when not all of the column data is the same.
I am aware of the Sort and Aggregate approach, but I believe that only works when all of the column data is identical.
ID          | Start Date | End Time | Queue Time | Talk Time |
============|============|==========|============|===========|
33000017670 | 9/4/2017   | 9/4/2017 | 0:00:10    | 0:03:30   |
33000017672 | 9/4/2017   | 9/4/2017 | 0:00:10    | 0:03:30   |
33000017672 | 9/4/2017   | 9/4/2017 | 0:00:12    | 0:00:00   |
33000017673 | 9/4/2017   | 9/4/2017 | 0:00:12    | 0:05:00   |
33000017674 | 9/4/2017   | 9/4/2017 | 0:00:12    | 0:12:00   |
33000017675 | 9/5/2017   | 9/5/2017 | 0:01:12    | 0:00:00   |
33000017675 | 9/5/2017   | 9/5/2017 | 0:01:12    | 0:00:00   |
Here are a couple of cases that I want to handle in SSIS:
Case 1
As you can see, ID 33000017672 appears twice, and ID is the primary key of the table I am loading this data into. The source is an Excel file. I know I could remove this record before loading, but I want to eliminate that separate step.
Here the column data is not the same across the two records. I want to find such records and remove the one whose Talk Time is 0.
Case 2
For ID 33000017675, all of the fields are the same, so in this case I want to keep one record. Note: there could be several records with the same data, and I want to keep just one of them.
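To make the expected result concrete, here is a rough Python sketch (not SSIS) of the logic I am after, run against the sample rows above. The names `dedupe` and `ZERO_TALK` are made up for illustration, and the fallback for an ID whose rows all have zero Talk Time is my own assumption, since the data above does not cover it.

```python
ZERO_TALK = "0:00:00"  # hypothetical constant: Talk Time value treated as "0"

def dedupe(rows):
    """rows: list of (id, start_date, end_time, queue_time, talk_time) tuples."""
    # Case 2: collapse rows that are identical in every column, keeping order.
    distinct = list(dict.fromkeys(rows))

    # Group the remaining rows by ID (the primary key of the target table).
    by_id = {}
    for row in distinct:
        by_id.setdefault(row[0], []).append(row)

    result = []
    for group in by_id.values():
        if len(group) > 1:
            # Case 1: same ID, different data -> drop rows with zero Talk Time.
            non_zero = [r for r in group if r[4] != ZERO_TALK]
            # Assumption: if every row has zero Talk Time, keep the first one.
            group = non_zero or group[:1]
        result.extend(group)
    return result

rows = [
    ("33000017670", "9/4/2017", "9/4/2017", "0:00:10", "0:03:30"),
    ("33000017672", "9/4/2017", "9/4/2017", "0:00:10", "0:03:30"),
    ("33000017672", "9/4/2017", "9/4/2017", "0:00:12", "0:00:00"),
    ("33000017673", "9/4/2017", "9/4/2017", "0:00:12", "0:05:00"),
    ("33000017674", "9/4/2017", "9/4/2017", "0:00:12", "0:12:00"),
    ("33000017675", "9/5/2017", "9/5/2017", "0:01:12", "0:00:00"),
    ("33000017675", "9/5/2017", "9/5/2017", "0:01:12", "0:00:00"),
]

cleaned = dedupe(rows)
for row in cleaned:
    print(" | ".join(row))
```

For the sample data this should leave one row per ID: the 0:03:30 row for 33000017672 and a single row for 33000017675. If an ID still has several different rows with non-zero Talk Time, the sketch keeps all of them, because I have not defined a rule for that case.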
Can someone help me with how to do this in SSIS?