1

I need to exclude a column from a DataSet Query where that column only contains 0's (Nulls).

All solutions I have found (for example: Filtering Null values in Select) only process the condition per row and won't take into to account all the values in the column as I need.

The query needs to be generic to be used across multiple different data tables so I cannot explicitly state column names to exclude.

The latest code I've tried is:

theTable = result.Tables[0];
var query = theTable.AsEnumerable().Select(r => r.ItemArray.Where(c => long.Parse(c.ToString()) != 0));

This excludes all 0's in the row, but it does not maintain the column structure and I end up with rows with different Lengths (column sizes).

This is my example SQL for reference:

SELECT t1.TableIndex
, CASE WHEN t1.EntityName <> t2.EntityName THEN 1 ELSE 0 END AS EntName
, CASE WHEN t1.EntityNumber <> t2.EntityNumber THEN 1 ELSE 0 END AS EntNumber

FROM DbEnv.dbo.tblOne t1 (NOLOCK)
INNER JOIN DbEnv.dbo.tblTwo t2 (nolock) ON t1.TableIndex = t2.TableIndex

WHERE t1.EntityName <> t2.EntityName
OR t1.EntityNumber <> t2.EntityNumber

Example Data Set (In this scenario, only Col2 should be excluded):

     Col1 | Col2 | Col3
Row1: 0      0      1
Row2: 1      0      1
Row3: 0      0      0

Example Data Set 2 (In this scenario, Col1 and Col4 should be excluded):

     Col1 | Col2 | Col3 | Col4 | Col5
Row1: 0      0      1      0      1
Row2: 0      0      1      0      1
Row3: 0      1      0      0      1
Row3: 0      1      0      0      1

(Solution can be in SQL or LINQ, but I would think it would be cleaner to have the solution in LINQ)

Community
  • 1
  • 1
MikeDub
  • 5,143
  • 3
  • 27
  • 44

1 Answers1

1

This can be done in three queries - one running against your RDBMS, and two in-memory queries:

  • First query would read all columns, and bring them into memory
  • Second query would determine which columns to keep
  • Third query would project out the columns that you do not need

First query would be a "plain" query for all columns:

var allRows = theTable.ToList();

The second query could go like this:

var columnsToKeep = Enumerable
    .Range(0, columnCount)
    .Where(i => allRows.Any(r => r.ItemArray[i] != null && long.Parse(r.ItemArray[i].ToString()) != 0 ))
    .ToList();

The third query would be like this:

var query = allRows.Select(r =>
    columnsToKeep.Select(i => r.ItemArray[i]).ToArray()
);
Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523
  • Your last line of code: 'r.ItemArray[i]' produces an error -> "Cannot implicitly convert type 'bool' to 'int" ... were you trying to do this: r.ItemArray[columnsToKeep.IndexOf(i)] ? – MikeDub Feb 26 '16 at 04:13
  • @MikeDub Ah, That's because I used `Select` instead of `Where` for the second query. – Sergey Kalinichenko Feb 26 '16 at 04:16
  • when using the 2nd query: var columnsToKeep = Enumerable.Range(0, theTable.Columns.Count) .Select(i => allRows.All(r => r.ItemArray[i] != null && long.Parse(r.ItemArray[i].ToString()) != 0)) .ToList(); It is not correctly returning which columns to keep. – MikeDub Feb 26 '16 at 04:19
  • If I change Q2 to: var columnsToExclude = Enumerable.Range(0, theTable.Columns.Count) .Select(i => allRows.All(r => r.ItemArray[i] != null && long.Parse(r.ItemArray[i].ToString()) == 0)) .ToList(); ... It seems to work correctly. Could you extroplate on how to fix Query 3? – MikeDub Feb 26 '16 at 04:24
  • Yep, but it only works if I change it to columnsToExclude, just trying to figure the best way to have it exclude those columns in query 3 – MikeDub Feb 26 '16 at 04:39