3

I have an OLE DB source that has some nulls, it has 50 columns and I'm trying to extract the rows that have NULLs to Bad data destination, the expression that I had for the conditional split is

ISNULL([StudentName]) == TRUE

But that means I have to repeat it for all the columns that I have and I'm wondering if there's another way to handle this. especially since I have multiple tables to process. Thanks

Hadi
  • 36,233
  • 13
  • 65
  • 124

5 Answers5

2

Believe it or not this has been puzzling me since you posted it.

I have not figured a way to do it SSIS but here is a possible solution using a script component source. This will find all the keys associated with a row that has at least one null column.

    DataTable dt = new DataTable();

    string sql = "Enter your extract SQL here";
    using(OleDbConnection conn = new OleDbConnection("Data Source=XXXXXXXXX;Initial Catalog=YYYYY;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;"))
    {
        using (OleDbCommand cmd = new OleDbCommand(sql,conn))
        {
            cmd.CommandType = CommandType.Text;
            conn.Open();
            dt.Load(cmd.ExecuteReader());
        }
    }

    foreach (DataRow row in dt.Rows)
    {
        foreach (DataColumn col in dt.Columns)
        {
            if (row[col] == DBNull.Value)
            {
                Output0Buffer.AddRow();
                Output0Buffer.NullRowKey = row.Field<int>(0); //This is the 0 based index of the key in the data
                break;
            }
        }
    }

The problem I was running into was trying to use Columns in an Output0Buffer data type which is what you have to use when using Script Component of a transformation. I need to actually use script component source and load the data into a data table in order to have access to a Columns property.

KeithL
  • 5,348
  • 3
  • 19
  • 25
1

The easiest solution (I am not sure if it is recommended from performance perspective) is to use an SQL command in the OLE DB Source where you should add a new column which is a concatenation of all columns. Then, you should use this column in the conditional split instead of mentioning all columns. As example:

SELECT *, [Column1] + [Column2] + [Column3] as [CheckColumn]
FROM [Table]
Hadi
  • 36,233
  • 13
  • 65
  • 124
1

Here is a completely different solution using your already formed conditional split of:

ISNULL([StudentName])

You can build an OR test in Excel by pasting a list of columns (I get this from the metadata on the data flow path). I usually use this technique for building the destination table.

Paste into excel and get the column names into Col A.

Enter this formula into ColB:

="ISNULL("&A1&")||"

Enter this formula directly in B2:

=B1&"ISNULL("&A2&")||"

This should yield a result in B2 like this:

ISNULL(Col1)||ISNULL(Col2)||

Now drag and drop B2 to the bottom of your list (in your case B50).

B50 will now hold your final formula for the conditional split:

ISNULL(Col1)||ISNULL(Col2)||ISNULL(Col3)||ISNULL(Col4)||ISNULL(Col5)||ISNULL(Col6)||ISNULL(Col7)||ISNULL(Col8)||ISNULL(Col9)||ISNULL(Col10)||ISNULL(Col11)||ISNULL(Col12)||ISNULL(Col13)||ISNULL(Col14)||ISNULL(Col15)||ISNULL(Col16)||ISNULL(Col17)||ISNULL(Col18)||ISNULL(Col19)||ISNULL(Col20)||ISNULL(Col21)||ISNULL(Col22)||ISNULL(Col23)||ISNULL(Col24)||ISNULL(Col25)||ISNULL(Col26)||ISNULL(Col27)||ISNULL(Col28)||ISNULL(Col29)||ISNULL(Col30)||ISNULL(Col31)||ISNULL(Col32)||ISNULL(Col33)||ISNULL(Col34)||ISNULL(Col35)||ISNULL(Col36)||ISNULL(Col37)||ISNULL(Col38)||ISNULL(Col39)||ISNULL(Col40)||ISNULL(Col41)||ISNULL(Col42)||ISNULL(Col43)||ISNULL(Col44)||ISNULL(Col45)||ISNULL(Col46)||ISNULL(Col47)||ISNULL(Col48)||ISNULL(Col49)||ISNULL(Col50)||

Just delete the final ||.

KeithL
  • 5,348
  • 3
  • 19
  • 25
1

Sure there is another way, why not create a script that will generate the condition for you?

if the 50 columns is not in a particular table, just create a tmp table with all the column you need ( select ..... into #tmp from .... )

Then, generate the condition.

declare @schema sysname = 'Purchasing'
declare @table sysname = 'PurchaseOrders'
        
;with cols as (
        select  convert(varchar(max), 'ISNULL(' + QUOTENAME(column_name) + ') == TRUE') as col, TABLE_SCHEMA, TABLE_NAME, ordinal_position 
        from INFORMATION_SCHEMA.COLUMNS 
        where TABLE_SCHEMA = @schema and TABLE_NAME = @table and ORDINAL_POSITION = 1
        union all
        select CONVERT (varchar(max) , cl.col + ' || ISNULL(' + QUOTENAME(column_name) + ') == TRUE') as col, c.TABLE_SCHEMA, c.TABLE_NAME, c.ordinal_position 
        from INFORMATION_SCHEMA.COLUMNS c
        inner join cols cl on cl.TABLE_SCHEMA = c.TABLE_SCHEMA and cl.TABLE_NAME = c.TABLE_NAME and c.ORDINAL_POSITION = cl.ORDINAL_POSITION + 1
    )
    select '= ' + cols.col 
    from cols
    where ORDINAL_POSITION = (select MAX(ordinal_position) from cols)

using the WideWorldImporters databases, this will result with condition, remove what not needed.

= ISNULL([PurchaseOrderID]) == TRUE || ISNULL([SupplierID]) == TRUE || ISNULL([OrderDate]) == TRUE || ISNULL([DeliveryMethodID]) == TRUE || ISNULL([ContactPersonID]) == TRUE || ISNULL([ExpectedDeliveryDate]) == TRUE || ISNULL([SupplierReference]) == TRUE || ISNULL([IsOrderFinalized]) == TRUE || ISNULL([Comments]) == TRUE || ISNULL([InternalComments]) == TRUE || ISNULL([LastEditedBy]) == TRUE || ISNULL([LastEditedWhen]) == TRUE

But beware, no matters which method you choose (mine or generating condition with Excel), testing 50 columns in SSIS might add an enormous overheap, especially if you have a lot of rows.

A better alternative would be to put a flag on each row in the dataset and filter/redirect whether "HasNullValues" is set or not.

MLeblanc
  • 1,816
  • 12
  • 21
  • That is very creative use of recursion. However, I often find that the OleDB source is very unlikely to be SQL Server when using SSIS. – KeithL Aug 11 '20 at 16:39
0

It's not clear if you meant the data bad when ALL columns are NULL or just ANY column is NULL.

If it is ALL, then you can do this:

COALESCE(col1, col2, col3, .....) IS NULL

If it is 'ANY`, then you can do the following:

(col1 + col2 + col3 + .....) IS NULL

In BOTH cases:

you will need to convert non-string columns (numeric, date, etc) to string via CONVERT(varchar(8000), col)

K4M
  • 1,030
  • 3
  • 11