1

I have 2 byte arrays:

    Dim A() As Byte = {1, 2, 3, 4, 5, 6, 7, 8, 9}
    Dim B() As Byte = {5, 6, 7}

Now I want to find the occurance of the full B in A. I tried Array.IndexOf(A, B) with no luck. Is there a simple way to search an array by array without the need to use any loops?

It should find the index (position) of 5,6,7 in the same order as in B(). If A() contains {1,2,3,4,7,6,5,9} it should return false or -1 because they are not in the same order.

MilMike
  • 12,571
  • 15
  • 65
  • 82
  • 3
    almost the same: http://stackoverflow.com/questions/1020438/c-array-subset-fetching – Stefan Sep 08 '11 at 08:30
  • 2
    "without any loops" is impossible. You can't get more than one value out of an array without a loop, since the array can be any size. Do you mean without an explicit loop in your code? – Merlyn Morgan-Graham Sep 08 '11 at 08:38

5 Answers5

4

The following Linq statement will give an IEnumerable<int> containing the positions of b in a (or an empty set if none occur):

Enumerable
    .Range( 0, 1 + a.Length - b.Length )
    .Where( i => a.Skip(i).Take(b.Length).SequenceEqual(b) );

I have no idea how to translate to VB.NET.

spender
  • 117,338
  • 33
  • 229
  • 351
  • this method looks nice but is very slow. 16 Seconds for 100000 elements in byte() but it is good for smaller sizes. the length should be also +1 (at least in VB.NET) because if the elements are on the end of the list they wont be found, here is the corrected version: Enumerable.Range(0, A.Length - B.Length + 1).Where(Function(i) A.Skip(i).Take(B.Length).SequenceEqual(B)) – MilMike Sep 09 '11 at 08:21
  • @qxxx: You're right about the bug. Fixed. There are definitely faster ways to do this i.e. Knuth-Morris-Pratt, but I got hung up on the (explicit) loops requirement. – spender Sep 09 '11 at 10:31
2

This might work, but it's C# and uses a loop:

private static int[] GetIndicesOf(byte[] needle, byte[] haystack)
{
    int[] foundIndices = new int[needle.Length];

    int found = 0;

    for (int i = 0; i < haystack.Length; i++)
    {

        if (needle[found] == haystack[i])
        {
            foundIndices[found++] = i;

            if (found == needle.Length)
                return foundIndices;
        }
        else
        {
            i -= found;   // Re-evaluate from the start of the found sentence + 1
            found = 0;    // Gap found, reset, maybe later in the haystack another occurrance of needle[0] is found
            continue;
        }
    }

    return null;            
}

Tested with input:

Byte[] haystack = { 5, 6, 7, 8, 9, 0, 5, 6, 7 };
Byte[] needle = { 5, 6, 7 };
// Returns {0, 1, 2}

Byte[] haystack = { 5, 6, 0, 8, 9, 0, 5, 6, 7 };
Byte[] needle = { 5, 6, 7 };
// Returns {6, 7, 8}

Byte[] haystack = { 5, 6, 0, 7, 9, 0, 5, 6, 8 };
Byte[] needle = { 5, 6, 7 };
// Returns null

Byte[] haystack = { 1, 2, 1, 2, 2 };
Byte[] needle = { 1, 2, 2 };
// Returns {2, 3, 4}

Byte[] haystack = { 1, 2, 1, 2, 1, 2, 3 };
Byte[] needle = { 1, 2, 1, 2, 3 };
// Returns {2, 3, 4, 5, 6}

Byte[] haystack = { 1, 1, 1, 1, 2 };
Byte[] needle = { 1, 2 };
// Returns {3, 4}

But the Linq implementation of @spender looks nicer. :-P

CodeCaster
  • 147,647
  • 23
  • 218
  • 272
0

How about creating a method that:

  1. Concatinates the elements of the searched list to one string
  2. Concatinates the elements of the list to search for to one string
  3. Looks in the first string for the precense of the second string

Like so:

public bool IsSubSetOf(IList<int> list1, IList<int> list2){
   var string1 = string.Join("", list1);
   var string2 = string.Join("", list2);
   return string1.Contains(string2);
}

Not tested...

Bas Slagter
  • 9,831
  • 7
  • 47
  • 78
  • 2
    This is broken for {1, 11, 111, 11, 11} and looking for {1, 1, 1}, say. – Gleno Sep 08 '11 at 08:46
  • This is a good idea (no explicit loops!), but will break in a lot of cases. You probably need to add delimiters *around* each value (also before the first, and after the last) to get this to work flawlessly, which is slightly more complicated than a `string.Join` can provide. You also need to return an index, not just a boolean value - the OP said "find" not "determine if". This answer can be extended to work, but I wouldn't leave it as an exercise for the reader :) – Merlyn Morgan-Graham Sep 08 '11 at 08:46
0

An efficient way of solving this problem in general is the KMP algorithm. Quick googling suggest that a .NET implementation may be found here. It's implementational pseudocode is availible from Wikipedia.

An inefficient, but harmlessly easy to code way is presented in one of the links above as follows:

int[] T = new[]{1, 2, 3, 4, 5};
int[] P = new[]{3, 4};

for (int i = 0; i != T.Length; i++)
{
    int j = 0
    for (;i+j != T.Length && j != P.Length && T[i+j]==P[j]; j++);
    if (j == P.Length) return i;
}
Gleno
  • 16,621
  • 12
  • 64
  • 85
0

My take would be:

public static int Search<T>(T[] space, T[] searched) {
  foreach (var e in Array.FindAll(space, e => e.Equals(searched[0]))) {
    var idx = Array.IndexOf(space, e);
    if (space.ArraySkip(idx).Take(searched.Length).SequenceEqual(searched))
      return idx;
  }
  return -1;
}

public static class Linqy {
  public static IEnumerable<T> ArraySkip<T>(this T[] array, int index) {
    for (int i = index;  i < array.Length; i++) {
      yield return array[i];
    }
  }
}

As always, it depends on your data whether this is "good enough" or you will have to resort to more complex yet efficient algorithms. I introduced an arrayskip as the Linq skip does indeed only assume the IEnumerable interface and would enumerate up to the index.

flq
  • 22,247
  • 8
  • 55
  • 77