I have a collection whose items are themselves collections of integers. I want to remove from the top-level list every item that is an extension (superset) of another item.

See the following list as an example:

Item 1: 42, 40, 38, 32, 50, 28, 30, 51, 1
Item 2: 42, 38, 32, 50, 28, 30, 51, 1
Item 3: 42, 50, 28, 30, 51, 1
Item 4: 42, 51, 1

When I execute the code, all items except Item 4 should be removed from the list because they are extensions of Item 4.

The code below works and does the job correctly but is taking a bit longer than I would expect. Note that I have lots of items in the collection.

Can I use LINQ or Collections.Set to achieve the same result faster?

Code currently used:

Public Sub RemoveExtended()

    If Me.Count < 1 Then Exit Sub

    Dim endTime As DateTime
    Dim start As DateTime

    Debug.Print("Processing:" & (Me.Count - 1).ToString())

    start = Now

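    ' Only later items are checked against earlier ones, so this relies on
    ' the collection being ordered shortest-first.
    For shortestIndex As Integer = 0 To Me.Count - 1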

        For index As Integer = Me.Count - 1 To shortestIndex + 1 Step -1
            If ContainsAll(Me(shortestIndex), Me(index)) Then
                Me.RemoveAt(index)
            End If
        Next

    Next

    endTime = Now
    Debug.Print("removing time:" & endTime.Subtract(start).ToString)
    Debug.Print("result :" & Me.Count)

End Sub

Private Function ContainsAll(ByVal shortest As Generic.List(Of Integer), ByVal current As Generic.List(Of Integer)) As Boolean

    'slower
    'Return  shortest.All(Function(x) current.Contains(x))

    For Each Item As Integer In shortest
        If Not current.Contains(Item) Then
            Return False
        End If
    Next

    Return True

End Function
panais

1 Answer

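You can try replacing ContainsAll() with a LINQ check for a subset. Except() builds a set from the second sequence, so each comparison avoids the repeated linear List.Contains() scans: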

If Not Me(shortestIndex).Except(Me(index)).Any() Then
    Me.RemoveAt(index)
End If
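
Since the question also asks about sets, here is a minimal sketch of a HashSet(Of Integer) variant, not a drop-in replacement for the original class. The module SupersetFilter and its free-standing RemoveExtended function are hypothetical names introduced for illustration; the sketch assumes that the order of the items in the result does not matter and that repeated values inside one item can be ignored. Each set is built once and the items are processed shortest-first, so a candidate only has to be tested against the sets already kept:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module SupersetFilter

    ' Hypothetical helper (not from the original post): returns the items
    ' that are not a superset of any other item.
    Public Function RemoveExtended(items As IEnumerable(Of List(Of Integer))) As List(Of List(Of Integer))
        ' Pair every list with a set built once, ordered by distinct count.
        Dim pairs = items.
            Select(Function(i) New With {.Source = i, .Values = New HashSet(Of Integer)(i)}).
            OrderBy(Function(p) p.Values.Count).
            ToList()

        Dim keptSets As New List(Of HashSet(Of Integer))()
        Dim result As New List(Of List(Of Integer))()

        For Each pair In pairs
            Dim candidate = pair.Values
            ' Drop the candidate if an already-kept set is contained in it.
            If Not keptSets.Any(Function(k) k.IsSubsetOf(candidate)) Then
                keptSets.Add(candidate)
                result.Add(pair.Source)
            End If
        Next

        Return result
    End Function

    Sub Main()
        ' The four items from the question.
        Dim items As New List(Of List(Of Integer)) From {
            New List(Of Integer) From {42, 40, 38, 32, 50, 28, 30, 51, 1},
            New List(Of Integer) From {42, 38, 32, 50, 28, 30, 51, 1},
            New List(Of Integer) From {42, 50, 28, 30, 51, 1},
            New List(Of Integer) From {42, 51, 1}
        }

        For Each item In RemoveExtended(items)
            Console.WriteLine(String.Join(", ", item))
        Next
    End Sub

End Module
```

With the example data only Item 4 (42, 51, 1) remains. The number of set comparisons can still grow quadratically with the number of items in the worst case, so whether this is faster for a given collection is worth measuring.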
har07