1

I'm in the process of programatically building a fairly complex SSIS data flow using the EzAPI library and have hit a snag. I've attached an image of what I'm trying to achieve.

enter image description here

I've got the start (up to the merge join) working correctly but have come unstuck trying to map the input of the merge join component to the output.

Here's the code I have so far (just a snippet obviously)

            int sortPosition;

            var df = new EzDataFlow(p);

            var cur = new EzOleDbSource(df);
            cur.Connection = dstConn;
            cur.Table = "Table1";

            var hst = new EzOleDbSource(df);
            hst.Connection = hstConn;
            hst.Table = "Table2";

            // Add all the columns to the sort transformation for the Current database table
            var sortCurr = new EzSortTransform(df);
            sortCurr.AttachTo(cur);

            sortPosition = 1;
            foreach (Column c in table.Columns)
            {
                sortCurr.SortOrder[c.ColumnName] = sortPosition++;
            }


            // Same for history
            var sortHst = new EzSortTransform(df);
            sortHst.AttachTo(hst);

            sortPosition = 1;
            foreach (Column c in table.Columns)
            {
                sortHst.SortOrder[c.ColumnName] = sortPosition++;
            }

            var mrg = new EzMergeJoin(df);

            mrg.AttachTo(sortCurr, 0, 0);
            mrg.AttachTo(sortHst, 0, 1);

            mrg.JoinType = MergeJoinType.Full;

            ... now what?

i've searched high and low for an example or documentation for the EzMergeJoin transformation but with no avail. Can anyone point me in the right direction?

Steve Homer
  • 3,852
  • 2
  • 22
  • 41
  • What's the result of running the existing code? A merge join added to the canvas, both inputs touching it but no columns mapped? – billinkc Feb 01 '12 at 18:25

4 Answers4

1

Unfortunately, we haven't found a way to fix the issue of join keys not being passed through the data flow. The only answer we have come up with is to select the join keys twice with different names in the upstream components. You then sort on only the duplicates, and this makes the merge join use the duplicates as the join keys. The original columns then pass through to the next component in the data flow.

Here's our code sample, nothing special once the input columns are set up correctly:

            MergeJoin = new EzMergeJoin(TransformTranFact);
            MergeJoin.Name = "Merge Join";
            MergeJoin.AttachTo(SourceChecksum, 0, 0);
            MergeJoin.AttachTo(FactTranFact, 0, 1);
            MergeJoin.LinkAllInputsToOutputs();
            MergeJoin.JoinType = MergeJoinType.Full;
            MergeJoin.NumKeyColumns = mchs.Count();

mchs is an XElement object that contains information from a Data Warehouse design captured in Dimodelo Architect. We use EzAPI to automate the generation of complex ETL for a Data Warehouse directly from the Data Warehouse design captured in Dimodelo Architect.

agilmore
  • 56
  • 4
0

I figured out a non-optimal but pragmatic solution for now. If you select the join key column twice in one of your queries but only set the sort option on one of them, it will link to the output of the merge join.

Alex
  • 136
  • 1
  • 2
  • 9
0

I was able to solve this by making a small modification to the LinkInputToOutput method in EzComponent. Here is the before/after:

Before:

if (InputColumnExists(inputIndex, colName))
    return; // return as this column is already linked
IDTSInput100 input = m_meta.InputCollection[inputIndex];
IDTSVirtualInput100 virtualInput = input.GetVirtualInput();
IDTSVirtualInputColumn100 virtualInputColumn = virtualInput.VirtualInputColumnCollection[colName];
m_comp.SetUsageType(input.ID, virtualInput, virtualInputColumn.LineageID, DTSUsageType.UT_READONLY);

After:

IDTSInput100 input = m_meta.InputCollection[inputIndex];
IDTSVirtualInput100 virtualInput = input.GetVirtualInput();
IDTSVirtualInputColumn100 virtualInputColumn = virtualInput.VirtualInputColumnCollection[colName];
if (InputColumnExists(inputIndex, colName))
    m_comp.SetUsageType(input.ID, virtualInput, virtualInputColumn.LineageID, DTSUsageType.UT_IGNORED);

m_comp.SetUsageType(input.ID, virtualInput, virtualInputColumn.LineageID, DTSUsageType.UT_READONLY);

You still need to manually link the column(s) that exist in both source components like this:

mergeJoin.LinkInputToOutput(0, "KeyField");
Ian Perez
  • 1
  • 1
0

As an initial answer try adding

        // Not sure how to handle multiple key columns
        mrg.NumKeyColumns = 1;
        mrg.LinkAllInputsToOutputs();

My dataflow was setup as Current branch looks like

  • source_id (sorted)
  • col1

Historical branch looks like

  • source_id (sorted)
  • col2

I only sorted on the initial column as my desire was to see an output like source_id, col1, col2.

enter image description here

I'm still looking at the code to see how to add the join key in there but haven't found it yet.

billinkc
  • 59,250
  • 9
  • 102
  • 159
  • Interestingly when I add this I get all of the columns joined between the two tables (which is good) and nothing in the output table at the bottom - not so good. I think I'm going to have to start reading the source for EzAPI at some point - it's a very useful library but the documentation is shockingly bad. – Steve Homer Feb 02 '12 at 08:46
  • Looking at the generated package from BIDS I'm almost there, looks like I just need the equivalent of ticking the "Select All Mappings" checkbox in the Merge Join properties. – Steve Homer Feb 02 '12 at 09:56
  • I am having the same problem, have you found a way to select all mappings? – Nick N. Oct 25 '13 at 09:40