0

I am dealing with a legacy application which is written in VB.Net 2.0 against a SQL 2000 database.

There is a single table which has ~125,000 rows and 2 pairs of fields with similar data.

i.e. FieldA1, FieldB1, FieldA2, FieldB2

I need to process a combined, distinct list of FieldA, FieldB.

Using SQL I have confirmed that there are ~140,000 distinct rows.

Due to a very restrictive framework in the application I can only retrieve the data as either 2 XML objects, 2 DataTable objects or 2 DataTableReader objects. I am unable to execute custom SQL using the framework.

Due to a very restrictive DB access policy I am unable to add a View or Stored Proc to retrieve as a single list.

What is the most efficient way to combine the 2 XML / DataTable / DataTableReader objects into a single, distinct, IEnumerable object for later processing?

Shevek
  • 3,869
  • 5
  • 43
  • 63

2 Answers2

1

I may have missed something here but could you not combine both DataTables using Merge?

DataTableA.Merge(DataTableB)

You can then use DataTableA.AsEnumerable()

Then see this answer on how to remove duplicates or

You can do this with a DataView as follows: dt.DefaultView.ToTable(True,[Column names])

Community
  • 1
  • 1
Matt Wilko
  • 26,994
  • 10
  • 93
  • 143
  • That does combine the 2 but there are duplicates – Shevek Jul 04 '12 at 11:18
  • The DataView method works perfectly with 5000 records on our DEV db but is taking a very long time to process on our TST db with 140,000 records – Shevek Jul 04 '12 at 12:44
0

This is the solution I came up with.

Combine the 2 DataTables using .Merge (thanks to Matt's answer)

Using this as a base I came up with the following code to get distinct rows from the DataTable based on 2 columns:

Private Shared Function GetDistinctRows(sourceTable As DataTable, ParamArray columnNames As String()) As DataTable

    Dim dt As New DataTable
    Dim sort = String.Empty

    For Each columnName As String In columnNames

        dt.Columns.Add(columnName, sourceTable.Columns(columnName).DataType)

        If sort.Length > 0 Then
            sort = sort & ","
        End If

        sort = sort & columnName

    Next

    Dim lastValue As DataRow = Nothing

    For Each dr As DataRow In sourceTable.Select("", sort)

        Dim add As Boolean = False

        If IsNothing(lastValue) Then
            add = True
        Else
            For Each columnName As String In columnNames
                If Not (lastValue(columnName).Equals(dr(columnName))) Then
                    add = True
                    Exit For
                End If
            Next
        End If

        If add Then
            lastValue = dr
            dt.ImportRow(dr)
        End If

    Next

    Return dt

End Function
Shevek
  • 3,869
  • 5
  • 43
  • 63