-2

Alright! I have a two sets of Subscribers information each set having below information.

SubType(string),SubSchool(number),SubEffDate(date),SubTermDate(date),SubMethod(string),Verifiedby(string) etc.

I am planning to fetch and keep the data either in data table, List.

How do I compare two sets(each set row and each field) and find the unique rows and copy those unique rows into another table / List?

Example: In the below Subscriber 1 table first row all the column value are not equal to Subscriber 2 first row. So this row will copy to another temporary table.

Subscriber 1 second row (each column value) is equal to Subscriber 2 of first row ( each column value ), so we are not copying Subscriber 1 second row into temporary table.

enter image description here

MaddiSS
  • 63
  • 13
  • There are a ton of examples on compareing lists around already. https://stackoverflow.com/questions/12795882/quickest-way-to-compare-two-generic-lists-for-differences – Hursey Jun 30 '21 at 20:55

2 Answers2

0

Create a class Subscriber that implement IEquatable(Of Subscriber), then you can start comparing the sets manually or like described in the article Hursey linked.

Ideally you would have an ID that can be compared, otherwise - if each field needs to be compared - use a clever hash comparison to exclude all records that are unequal.

Here a rather sophisticated implementation that takes into account different string comparisons and attempts to be fast.

Public NotInheritable Class Subscriber
    Implements IEquatable(Of Subscriber)

    'Private Fields
    <DebuggerBrowsable(DebuggerBrowsableState.Never)>
    Private _HashCode As Int32?
    <DebuggerBrowsable(DebuggerBrowsableState.Never)>
    Private _Type As String
    <DebuggerBrowsable(DebuggerBrowsableState.Never)>
    Private _School As Int32
    <DebuggerBrowsable(DebuggerBrowsableState.Never)>
    Private _EffDate As DateTime
    <DebuggerBrowsable(DebuggerBrowsableState.Never)>
    Private _TermDate As DateTime
    <DebuggerBrowsable(DebuggerBrowsableState.Never)>
    Private _Method As String
    <DebuggerBrowsable(DebuggerBrowsableState.Never)>
    Private _VerifiedBy As String
    Private Shared ReadOnly _TypeComparer As IEqualityComparer(Of String) = StringComparer.Ordinal
    Private Shared ReadOnly _MethodComparer As IEqualityComparer(Of String) = StringComparer.OrdinalIgnoreCase
    Private Shared ReadOnly _VerifiedByComparer As IEqualityComparer(Of String) = StringComparer.OrdinalIgnoreCase

    'Constructors

    Public Sub New(type As String, school As Int32, effDate As DateTime, termDate As DateTime, method As String, verifiedBy As String)
        _Type = type
        _School = school
        _EffDate = effDate
        _TermDate = termDate
        _Method = method
        _VerifiedBy = verifiedBy
    End Sub

    'Public Properties

    Public Property Type As String
        Get
            Return _Type
        End Get
        Set(value As String)
            _Type = value?.Normalize()
            InvalidateHashCode()
        End Set
    End Property

    Public Property School As Int32
        Get
            Return _School
        End Get
        Set(value As Int32)
            _School = value
            InvalidateHashCode()
        End Set
    End Property

    Public Property EffDate As DateTime
        Get
            Return _EffDate
        End Get
        Set(value As DateTime)
            _EffDate = value
            InvalidateHashCode()
        End Set
    End Property

    Public Property TermDate As DateTime
        Get
            Return _TermDate
        End Get
        Set(value As DateTime)
            _TermDate = value
            InvalidateHashCode()
        End Set
    End Property

    Public Property Method As String
        Get
            Return _Method
        End Get
        Set(value As String)
            _Type = value?.Normalize()
            InvalidateHashCode()
        End Set
    End Property

    Public Property VerifiedBy As String
        Get
            Return _VerifiedBy
        End Get
        Set(value As String)
            _VerifiedBy = value?.Normalize()
            InvalidateHashCode()
        End Set
    End Property

    'Public Methods

    Public Overrides Function Equals(obj As Object) As Boolean
        If (TypeOf obj Is Subscriber) Then
            Return Equals(DirectCast(obj, Subscriber))
        End If
        Return False
    End Function

    Public Overloads Function Equals(other As Subscriber) As Boolean Implements IEquatable(Of Subscriber).Equals
        If (other Is Nothing) Then Return False
        If (Object.ReferenceEquals(other, Me)) Then Return True
        If (other.HashCode <> Me.HashCode) Then Return False
        If (other._School <> Me._School) Then Return False
        If (other._EffDate <> Me._EffDate) Then Return False
        If (other._TermDate <> Me._TermDate) Then Return False
        If (Not _TypeComparer.Equals(other._Type, Me._Type)) Then Return False
        If (Not _MethodComparer.Equals(other._Method, Me._Method)) Then Return False
        If (Not _VerifiedByComparer.Equals(other._VerifiedBy, Me._VerifiedBy)) Then Return False
        Return True
    End Function

    Public Overrides Function GetHashCode() As Integer
        Return HashCode
    End Function

    'Private Properties

    Private ReadOnly Property HashCode As Int32
        Get
            Dim result As Int32? = _HashCode
            If (result Is Nothing) Then
                result = 8123498 Xor _TypeComparer.GetHashCode(_Type) Xor _School Xor _EffDate.GetHashCode() Xor _TermDate.GetHashCode() Xor _MethodComparer.GetHashCode(_Method) Xor _VerifiedByComparer.GetHashCode(_VerifiedBy)
                _HashCode = result
            End If
            Return result
        End Get
    End Property

    'Private Methods

    Private Sub InvalidateHashCode()
        _HashCode = Nothing
    End Sub

End Class

You would then need LINQ or the following class to compare the sets:

Public Class SetComparison(Of TElement)

    'Constructors

    Public Sub New(set1 As IEnumerable(Of TElement), set2 As IEnumerable(Of TElement))
        'Check lists
        If (set1 Is Nothing) Then set1 = Array.Empty(Of Subscriber)()
        If (set2 Is Nothing) Then set2 = Array.Empty(Of Subscriber)()
        'Convert first set into a hash set
        Dim nullCount1 As Int32 = 0
        Dim hashSet1 As New HashSet(Of TElement)(set1.Count())
        Dim duplicates1 As New Dictionary(Of TElement, Int32)()
        For Each e As TElement In set1
            If (e Is Nothing) Then
                nullCount1 += 1
                Continue For
            End If
            If (hashSet1.Contains(e)) Then
                Dim count As Int32
                If (duplicates1.TryGetValue(e, count)) Then
                    duplicates1.Item(e) = count + 1
                Else
                    duplicates1.Item(e) = 1
                End If
            Else
                hashSet1.Add(e)
            End If
        Next
        'Convert second set into a hash set
        Dim nullCount2 As Int32 = 0
        Dim hashSet2 As New HashSet(Of TElement)(set2.Count())
        Dim duplicates2 As New Dictionary(Of TElement, Int32)()
        For Each e As TElement In set2
            If (e Is Nothing) Then
                nullCount2 += 1
                Continue For
            End If
            If (hashSet2.Contains(e)) Then
                Dim count As Int32
                If (duplicates2.TryGetValue(e, count)) Then
                    duplicates2.Item(e) = count + 1
                Else
                    duplicates2.Item(e) = 1
                End If
            Else
                hashSet2.Add(e)
            End If
        Next
        'Compare with each other
        Dim hashSetInBoth As New HashSet(Of TElement)()
        If (hashSet1.Count < hashSet2.Count) Then
            For Each e As TElement In hashSet1
                If (hashSet2.Remove(e)) Then
                    hashSetInBoth.Add(e)
                End If
            Next
            For Each e As TElement In hashSetInBoth
                hashSet1.Remove(e)
            Next
        Else
            For Each e As TElement In hashSet2
                If (hashSet1.Remove(e)) Then
                    hashSetInBoth.Add(e)
                End If
            Next
            For Each e As TElement In hashSetInBoth
                hashSet2.Remove(e)
            Next
        End If
        'Assign results
        OnlyInSet1 = hashSet1
        OnlyInSet2 = hashSet2
        InBothSets = hashSetInBoth
        DuplicatesInSet1 = duplicates1
        DuplicatesInSet2 = duplicates2
        NullElementsInSet1 = nullCount1
        NullElementsInSet2 = nullCount2
    End Sub

    'Public Properties

    Public ReadOnly Property OnlyInSet1 As HashSet(Of TElement)
    Public ReadOnly Property OnlyInSet2 As HashSet(Of TElement)
    Public ReadOnly Property InBothSets As HashSet(Of TElement)
    ''' <summary>Dictionary with the duplicate elements of set 1 and the number of duplicates found for that element (a number of 2 means there have been 3 equal elements in the set, the original and 2 duplicates).</summary>
    Public ReadOnly Property DuplicatesInSet1 As Dictionary(Of TElement, Int32)
    ''' <summary>Dictionary with the duplicate elements of set 2 and the number of duplicates found for that element (a number of 2 means there have been 3 equal elements in the set, the original and 2 duplicates).</summary>
    Public ReadOnly Property DuplicatesInSet2 As Dictionary(Of TElement, Int32)
    Public ReadOnly Property NullElementsInSet1 As Int32
    Public ReadOnly Property NullElementsInSet2 As Int32

End Class

And you would use it like this:

Sub Main(args As String())
    'Generate some data
    Dim list1 As New List(Of Subscriber)()
    list1.Add(New Subscriber("Type1", 42, New Date(2021, 1, 1), New Date(2021, 1, 1), "Method1", "VerifiedBy"))
    list1.Add(New Subscriber("Type2", 42, New Date(2021, 1, 1), New Date(2021, 1, 1), "Method1", "VerifiedBy"))
    Dim list2 As New List(Of Subscriber)()
    list2.Add(New Subscriber("Type3", 42, New Date(2021, 1, 1), New Date(2021, 1, 1), "Method1", "VerifiedBy"))
    list2.Add(Nothing)
    list2.Add(New Subscriber("Type2", 42, New Date(2021, 1, 1), New Date(2021, 1, 1), "Method1", "VerifiedBy"))
    list2.Add(New Subscriber("Type3", 42, New Date(2021, 1, 1), New Date(2021, 1, 1), "Method1", "VerifiedBy"))
    'Compare it
    Dim comparison As New SetComparison(Of Subscriber)(list1, list2)
    'Use it
    Dim onlyInSet1 As HashSet(Of Subscriber) = comparison.OnlyInSet1
    Dim onlyInSet2 As HashSet(Of Subscriber) = comparison.OnlyInSet2
    Dim inBothSets As HashSet(Of Subscriber) = comparison.InBothSets
End Sub
Christoph
  • 3,322
  • 2
  • 19
  • 28
  • Christoph, You are such a talented programmer. The way of your coding is beyond my imagination. Thanks for such a clean and clear comprehensive solution. – MaddiSS Jul 02 '21 at 03:54
0
Here is my approach. Considering two data tables for Subscribers information.
Datatable Subscriber1 = New Datatable();
Datatable Subscriber2 = New Datatable();
int intColCount = 0;

for(int i=0; i< Subscriber1.Rows.Count;i++)
{
  for(int  j=0;j<Subscriber2.Rows.Count;j++)
  {
    intColcheck=0;
    for(int c=0;c<Subscriber1.Columns.Count;c++)
    {
      if(Subscriber1.Rows[i][c].ToString()==Subscriber2.Rows[j][c].ToString())
      {
        intColcheck= intColcheck+ 1;
      }
    }
    // If all the Columns values are not equal from each row in Sub1 and Sub2
    If(intColcheck!= Subscriber.Columns.Count)
    {
      Console.WriteLine("Found the Unique Row from Subscriber1 ");
      //Insert the logic add row into third data table.
  }
}
MaddiSS
  • 63
  • 13