1

I have a generic dot NET List A that I want to find in List B - how do I find List A in List B? I need the index of where List A starts in List B.

Neal Davis
  • 648
  • 6
  • 21
  • 2
    sounds like a modified version of Knuth–Morris–Pratt could be applied here http://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm – BrokenGlass Jun 18 '13 at 03:00
  • @BrokenGlass: Or just a simple naïve "substring" search. – jason Jun 18 '13 at 04:04
  • Do you want to implement this for Lists of String or also for Lists of SomeObject? – Fabian Bigler Jun 18 '13 at 07:28
  • Additional Info: Both List A and List B are declared as List; I was thinking there might be a way to do this with LINQ? Find List A in List B? – Neal Davis Jun 18 '13 at 15:24

4 Answers4

2

There is a very naïve approach that is O(|A| * |B|). Basically, this is the naïve substring searching algorithm1:

for(int i = 0; i < B.Count - A.Count; i++) {
     bool matchPossible = true;
     for(int j = 0; matchPossible && j < A.Count; j++) {
         if (!B[i + j].Equals(A[j])) { // should check for B[i + j] == null
              matchPossible = false;
              break;
         }
     }
     if(matchPossible) {
         return i;
     }
 }     
 return -1;

I left out some obvious error checking that you should be doing so you can focus on the approach. I commented one of the obvious checks. This will give you the index in B where A can be found.

I'd stick with this unless benchmarking shows that the approach is dragging you down. If it is, you need to look at something more sophisticated like Knuth-Morris-Pratt.

Both List A and List B are declared as List<string>; I was thinking there might be a way to do this with LINQ? Find List A in List B?

Sure.

for(int i = 0; i < B.Count - A.Count; i++) {
    if(B.SequenceEqual(A.Skip(i).Take(B.Count))) {
        return i;
    }
}
return -1;

Note that this is basically the same algorithm that I gave above, just expressed a little bit more cleanly. However, there is a downside to this approach. I don't know if Enumerable.Skip is smart enough to use the indexer when it's available. This version might be less performant than the original version if it does not use the indexer. This is the reason that I did not use this in my initial approach.

Also, you'll have to translate to VB.NET, sorry; I don't have a compiler handy and I speak fluent C# (so don't need a compiler to check my syntax), but not VB.

1: For some reason, I had this sudden flashback that this approach is covered in K&R's little C book? Can anyone verify; I've no idea where my copy is right now?2

2: Found it. Yes, section 4.1. Function is strindex.

jason
  • 236,483
  • 35
  • 423
  • 525
  • Thanks Jason - for your algorithm above I accepted it; I just need to convert it to VB.NET. I also will run tests on @SysDragon below. – Neal Davis Jun 18 '13 at 16:14
2

Here is a generic implementation of this answer that allows any IList and can pass in a optional IEqualityComparer. It's not the fastest way to search but it is a good starting point to do a more advanced answer.

static class GenericSearcher
{

    static readonly int[] Empty = new int[0];

    public static int[] Locate<T>(this IList<T> self, IList<T> candidate)
    {
        return Locate(self, candidate, EqualityComparer<T>.Default);
    }

    public static int[] Locate<T>(this IList<T> self, IList<T> candidate, IEqualityComparer<T> comparer)
    {
        if (IsEmptyLocate(self, candidate))
            return Empty;

        var list = new List<int>();

        for (int i = 0; i < self.Count; i++)
        {
            if (!IsMatch(self, i, candidate, comparer))
                continue;

            list.Add(i);
        }

        return list.Count == 0 ? Empty : list.ToArray();
    }

    static bool IsMatch<T>(IList<T> array, int position, IList<T> candidate, IEqualityComparer<T> comparer)
    {
        if (candidate.Count > (array.Count - position))
            return false;

        for (int i = 0; i < candidate.Count; i++)
            if (comparer.Equals(array[position + i],candidate[i]) == false)
                return false;

        return true;
    }

    static bool IsEmptyLocate<T>(ICollection<T> array, ICollection<T> candidate)
    {
        return array == null
            || candidate == null
            || array.Count == 0
            || candidate.Count == 0
            || candidate.Count > array.Count;
    }

}

UPDATED WITH VB CODE (thanks ajakblackgoat):

Module GenericSearcher

    Private ReadOnly Empty As Integer() = New Integer(-1) {}

    <System.Runtime.CompilerServices.Extension> _
    Public Function Locate(Of T)(self As IList(Of T), candidate As IList(Of T)) As Integer()
        Return Locate(self, candidate, EqualityComparer(Of T).[Default])
    End Function

    <System.Runtime.CompilerServices.Extension> _
    Public Function Locate(Of T)(self As IList(Of T), candidate As IList(Of T), comparer As IEqualityComparer(Of T)) As Integer()
        If IsEmptyLocate(self, candidate) Then
            Return Empty
        End If

        Dim list = New List(Of Integer)()

        For i As Integer = 0 To self.Count - 1
            If Not IsMatch(self, i, candidate, comparer) Then
                Continue For
            End If

            list.Add(i)
        Next

        Return If(list.Count = 0, Empty, list.ToArray())
    End Function

    Private Function IsMatch(Of T)(array As IList(Of T), position As Integer, candidate As IList(Of T), comparer As IEqualityComparer(Of T)) As Boolean
        If candidate.Count > (array.Count - position) Then
            Return False
        End If

        For i As Integer = 0 To candidate.Count - 1
            If Not comparer.Equals(array(position + i), candidate(i)) Then
                Return False
            End If
        Next

        Return True
    End Function

    Private Function IsEmptyLocate(Of T)(array As ICollection(Of T), candidate As ICollection(Of T)) As Boolean
        Return array Is Nothing OrElse candidate Is Nothing OrElse array.Count = 0 OrElse candidate.Count = 0 OrElse candidate.Count > array.Count
    End Function

End Module
Community
  • 1
  • 1
Scott Chamberlain
  • 124,994
  • 33
  • 282
  • 431
1

Try this function that receives ListA and ListB:

Dim index As Integer = listB.IndexOf(listA(0))
Dim iCont As Integer
Dim bMatch As Boolean

While index >= 0
    bMatch = True
    iCont = 1

    'Check if lists match on all the items in list A
    While bMatch
        index += 1
        bMatch = (index < listB.Count) AndAlso (listA(iCont) = listB(index))

        iCont += 1
        If iCont >= ListA.Count Then Exit While
    End While         

    If bMatch Then Exit While
    index = listB.IndexOf(listA(0), index)
End While

return bMatch
SysDragon
  • 9,692
  • 15
  • 60
  • 89
  • Does your VB code shown here essentially match Jason's C# code above? It appears to me they are approximately the same algorithm except yours is in VB? – Neal Davis Jun 18 '13 at 15:48
  • What are you triying to say? They are not the same algorithm. There are very different, but yes, they have some things in common. I didn't even see his answer (is not vb.net). Basically, both answers share the idea of go over both lists checking the similitudes. Mine is jumping to the elements and Jason answer is going through all the elements. – SysDragon Jun 18 '13 at 15:55
  • Thanks: I accepted both your answer and Jason's answer; I plan to test both. The reason for my question is I do almost all my work in C#, but this project is in VB.NET - I'm just not as familiar with VB as C#. I will examine both yours and Jason's above and probably run tests to see how they perform. Thanks again for your response! – Neal Davis Jun 18 '13 at 16:10
0

Here's a fairly simple approach. It should work for any type that has an equality comparer. I tested it with both string and integer lists. This function only iterates by the first element of ListB. therefore it will only step as many times as there are elements in ListA that equal the first element in ListB up to the one that starts the equal sequence.

Private Function FindList(Of T)(ListA As IList(Of T), ListB As List(Of T)) As Integer
    Dim FoundIndex As Integer = ListA.IndexOf(ListB(0))
    Dim FoundlList As Boolean = False
    While FoundIndex <> -1
        If ListA.Skip(FoundIndex).Take(ListB.Count).SequenceEqual(ListB) Then
            Return FoundIndex
        End If
        FoundIndex = FindIndex(ListA, FoundIndex, ListB(0))
    End While
    Return FoundIndex
End Function
Private Function FindIndex(Of T)(List As IList(Of T), Start As Integer, FindItem As T) As Integer
    Dim TempList = List.Skip(Start + 1).ToList
    Return TempList.IndexOf(FindItem) + (List.Count - TempList.Count)
End Function
tinstaafl
  • 6,908
  • 2
  • 15
  • 22