0

I'd like to extract data from excel file to OLE DB by using SSIS. The source files are some excel files. The identification for each entry is the date.There are some duplicated data between these files.

For example

In file1, the date of entries are from 01/01/13 to 01/15/13.

In file2, the date of entries are from 01/01/13 to 01/31/13. file2 has all the entries which file1 has.

If I run the SSIS package with file1, then all entries in file1 will be imported to OLE DB. If I run the package with file2, then only the different parts in file2 will be imported to OLE DB, the parts which file1 has will not be imported to OLE DB so that it avoids the duplication. If I run file2 one more time, then nothing happens, because all data is in the OLE DB and the new imported data is the duplicated one.

What function should I use in SSIS to do this?

Samual
  • 1
  • 1
  • Which part are you having trouble with? Parsing the files? Trimming out duplicate entries? Putting the non-duplicates in the database? – Pete Baughman Apr 08 '14 at 21:51
  • If you can use the staging table. Stage the data from file1, file 2.....file n. Then data is yours, just use the distinct query to load to your respective table. Hope it helps – Gowdhaman008 Apr 09 '14 at 07:19

1 Answers1

0

I would add a Lookup transformation to the Data Flow Task. This would match on the Date column from the target tables. On the Lookup / General page I would choose: Redirect rows to no match ouput.

Then I would connect the Lookup / No Match output to the Destination.

Mike Honey
  • 14,523
  • 1
  • 24
  • 40