2

Thanks in advance for any help anyone can provide,

(Preface: I'm already familiar with TypeGuessRows registry tweak (I have it set to 0; XL scans the whole column to determine data type) and IMEX=1 extended properties (I use this by default))

I am starting an ongoing project for a client:

  • Client sent 10 xlsx files, 1 per year.
  • Most, if not all, files have 12 sheets...1 sheet per month.
  • All sheets in all files have the exact same number of column with the exact same column headers in the exact same order.
  • Client most likely will periodically send new data (hopefully in the same format) over the next 3 years.

Looping through multiple XL files, then looping through multiple XL sheets is not a problem. I have done that many times in the past. My SSIS template for XL files is setup that way by default.

The issue I am having is when the data types for the columns can change from sheet to sheet. For example, on most sheets a date column:

  • No NULL/blank dates
  • All dates formatted as m/d/yyyy
  • XL/SSIS assigns date [DT_DATE] data type

...but, on some sheets within the same file, the same date column...

  • No NULL/blank dates
  • Most dates formatted as m/d/yyyy
  • Some dates formatted as general/number (Nov 15, 2002 = 37575)
  • XL/SSIS assigns Unicode string [DT_WSTR] data type

If I am not mistaken, when I run the SSIS package, it will throw an error when the data types change.

Is it possible force the data type of incoming columns (Advanced Editor for OLE DB Source > Input and Output Properties sheet > Inputs and outputs pane > OLE DB Source Output > External Columns) to unicode so the package won't error when XL/SSIS wants to change the data type? This would accommodate all current files and any future ones in case the same inconsistent formatting shows up.

Or am I forced to either:

  • Change all general/number formatted dates to a date format so I can import with one SSIS package
  • Separate all consistently formatted and inconsistently formatted sheets into 2 separate groups to be imported with 2 different SSIS packages

Once again, Thanks for any help anyone can provide,

CTB

1 Answers1

1

It appears switching from one data type to another won't throw an error, just a warning...at least from [DT_DATE] to [DT_WSTR] and back.

I was not able to force the data type of the incoming column of the OLE DB Source, but I was able to set the outgoing column data type to [DT_WSTR] (Advanced Editor for OLE DB Source > Input and Output Properties sheet > Inputs and outputs pane > OLE DB Source Output > Output Columns). That way, all dates in that column were seen as unicode text in the data flow, regardless of its source.

That seemed to do the trick. I needed only one import package to import both types of sheets/files.

I hope this helps someone else in the future...