30

I was recently asked to write 3 test programs for a job. They would be written using just core Java API's and any test framework of my choice. Unit tests should be implemented where appropriate.

Although I haven't received any feedback at all, I suppose they didn't like my solutions (otherwise I would have heard from them), so I decided to show my programs here and ask if this implementation can be considered good, and, if not, then why?

To avoid confusion, I'll ask only first one for now.

Implement a function that finds an array in another larger array. It should accept two arrays as parameters and it will return the index of the first array where the second array first occurs in full. Eg, findArray([2,3,7,1,20], [7,1]) should return 2.

I didn't try to find any existing solution, but instead wanted to do it myself.

Possible reasons: 1. Should be static. 2. Should use line comments instead of block ones. 3. Didn't check for null values first (I know, just spotted too late). 4. ?

UPDATE:
Quite a few reasons have been presented, and it's very difficult for me to choose one answer as many answers have a good solution. As @adietrich mentioned, I tend to believe they wanted me to demonstrate knowledge of core API (they even asked to write a function, not to write an algorithm).

I believe the best way to secure the job was to provide as many solutions as possible, including: 1. Implementation using Collections.indexOfSubList() method to show that I know core collections API. 2. Implement using brute-force approach, but provide a more elegant solution. 3. Implement using a search algorithm, for example Boyer-Moore. 4. Implement using combination of System.arraycopy() and Arrays.equal(). However not the best solution in terms of performance, it would show my knowledge of standard array routines.

Thank you all for your answers!
END OF UPDATE.

Here is what I wrote:

Actual program:

package com.example.common.utils;

/**
 * This class contains functions for array manipulations.
 * 
 * @author Roman
 *
 */
public class ArrayUtils {

    /**
     * Finds a sub array in a large array
     * 
     * @param largeArray
     * @param subArray
     * @return index of sub array
     */
    public int findArray(int[] largeArray, int[] subArray) {

        /* If any of the arrays is empty then not found */
        if (largeArray.length == 0 || subArray.length == 0) {
            return -1;
        }

        /* If subarray is larger than large array then not found */
        if (subArray.length > largeArray.length) {
            return -1;
        }

        for (int i = 0; i < largeArray.length; i++) {
            /* Check if the next element of large array is the same as the first element of subarray */
            if (largeArray[i] == subArray[0]) {

                boolean subArrayFound = true;
                for (int j = 0; j < subArray.length; j++) {
                    /* If outside of large array or elements not equal then leave the loop */
                    if (largeArray.length <= i+j || subArray[j] != largeArray[i+j]) {
                        subArrayFound = false;
                        break;
                    }
                }

                /* Sub array found - return its index */
                if (subArrayFound) {
                    return i;
                }

            }
        }

        /* Return default value */
        return -1;
    }

}

Test code:

package com.example.common.utils;

import com.example.common.utils.ArrayUtils;

import junit.framework.TestCase;

public class ArrayUtilsTest extends TestCase {

    private ArrayUtils arrayUtils = new ArrayUtils();

    public void testFindArrayDoesntExist() {

        int[] largeArray = {1,2,3,4,5,6,7};
        int[] subArray = {8,9,10};

        int expected = -1;
        int actual = arrayUtils.findArray(largeArray, subArray);

        assertEquals(expected, actual);
    }

    public void testFindArrayExistSimple() {

        int[] largeArray = {1,2,3,4,5,6,7};
        int[] subArray = {3,4,5};

        int expected = 2;
        int actual = arrayUtils.findArray(largeArray, subArray);

        assertEquals(expected, actual);
    }

    public void testFindArrayExistFirstPosition() {

        int[] largeArray = {1,2,3,4,5,6,7};
        int[] subArray = {1,2,3};

        int expected = 0;
        int actual = arrayUtils.findArray(largeArray, subArray);

        assertEquals(expected, actual);
    }

    public void testFindArrayExistLastPosition() {

        int[] largeArray = {1,2,3,4,5,6,7};
        int[] subArray = {5,6,7};

        int expected = 4;
        int actual = arrayUtils.findArray(largeArray, subArray);

        assertEquals(expected, actual);
    }

    public void testFindArrayDoesntExistPartiallyEqual() {

        int[] largeArray = {1,2,3,4,5,6,7};
        int[] subArray = {6,7,8};

        int expected = -1;
        int actual = arrayUtils.findArray(largeArray, subArray);

        assertEquals(expected, actual);
    }

    public void testFindArrayExistPartiallyEqual() {

        int[] largeArray = {1,2,3,1,2,3,4,5,6,7};
        int[] subArray = {1,2,3,4};

        int expected = 3;
        int actual = arrayUtils.findArray(largeArray, subArray);

        assertEquals(expected, actual);
    }

    public void testFindArraySubArrayEmpty() {

        int[] largeArray = {1,2,3,4,5,6,7};
        int[] subArray = {};

        int expected = -1;
        int actual = arrayUtils.findArray(largeArray, subArray);

        assertEquals(expected, actual);
    }

    public void testFindArraySubArrayLargerThanArray() {

        int[] largeArray = {1,2,3,4,5,6,7};
        int[] subArray = {4,5,6,7,8,9,10,11};

        int expected = -1;
        int actual = arrayUtils.findArray(largeArray, subArray);

        assertEquals(expected, actual);
    }

    public void testFindArrayExistsVeryComplex() {

        int[] largeArray = {1234, 56, -345, 789, 23456, 6745};
        int[] subArray = {56, -345, 789};

        int expected = 1;
        int actual = arrayUtils.findArray(largeArray, subArray);

        assertEquals(expected, actual);
    }

}
Roman
  • 1,278
  • 2
  • 12
  • 14
  • Why if any of the arrays is empty then not found? An empty array is always member of any array, right? Also an empty array is a member of empty array – nanda Oct 15 '10 at 07:57
  • This is a good point, but returning 0 wouldn't be correct in this case... I think returning -1 seems appropriate here... – Roman Oct 15 '10 at 08:12

15 Answers15

42

The requirement of "using just core Java API's" could also mean that they wanted to see whether you would reinvent the wheel. So in addition to your own implementation, you could give the one-line solution, just to be safe:

public static int findArray(Integer[] array, Integer[] subArray)
{
    return Collections.indexOfSubList(Arrays.asList(array), Arrays.asList(subArray));
}

It may or may not be a good idea to point out that the example given contains invalid array literals.

adietrich
  • 669
  • 5
  • 10
  • 6
    Even better if you're using Google's Guava: Bytes.indexOf(byte[], byte[]) – Tomer Jul 05 '12 at 20:09
  • What if array is extra large, let's say 1Gb and subArray is located in the very beginning? – Anthony Oct 11 '14 at 20:50
  • 1
    @adietrich what is the complexity of the `indexOfSubList` method ? – deadbug Mar 25 '17 at 08:34
  • The method from Guava uses brute force. – AbuNassar Oct 24 '17 at 19:23
  • 1
    Just a small caveat here for someone understanding this solution that it would not work if the two arrays passed to you are in the form of int[] (or any other primitive type array) which is what the OP seems to have in his job question. – UzumakiL Aug 05 '18 at 12:05
5
Clean and improved code 

public static int findArrayIndex(int[] subArray, int[] parentArray) {
    if(subArray.length==0){
        return -1;
    }
    int sL = subArray.length;
    int l = parentArray.length - subArray.length;
    int k = 0;
    for (int i = 0; i < l; i++) {
        if (parentArray[i] == subArray[k]) {
            for (int j = 0; j < subArray.length; j++) {
                if (parentArray[i + j] == subArray[j]) {
                    sL--;
                    if (sL == 0) {
                        return i;
                    }

                }

            }
        }

    }
    return -1;
}
  • This line for (int i = 0; i < l; i++) { can be like for (int i = 0; i <= l; i++) {. Otherwise it wont find last element. I tried it. – User Aug 08 '18 at 09:06
4

For finding an array of integers in a larger array of integers, you can use the same kind of algorithms as finding a substring in a larger string. For this there are many algorithms known (see Wikipedia). Especially the Boyer-Moore string search is efficient for large arrays. The algorithm that you are trying to implement is not very efficient (Wikipedia calls this the 'naive' implementation).

For your questions:

  1. Yes, such a method should be static
  2. Don't care, that's a question of taste
  3. The null check can be included, or you should state in the JavaDoc that null values are not allowed, or JavaDoc should state that when either parameter is null a NullPointerException will be thrown.
Marc
  • 3,550
  • 22
  • 28
3

Following is an approach using KMP pattern matching algorithm. This solution takes O(n+m). Where n = length of large array and m = length of sub array. For more information, check:

https://en.wikipedia.org/wiki/KMP_algorithm

Brute force takes O(n*m). I just checked that Collections.indexOfSubList method is also O(n*m).

public static int subStringIndex(int[] largeArray, int[] subArray) {
    if (largeArray.length == 0 || subArray.length == 0){
      throw new IllegalArgumentException();
}
    if (subArray.length > largeArray.length){
      throw new IllegalArgumentException();
}

    int[] prefixArr = getPrefixArr(subArray);
    int indexToReturn = -1;

    for (int m = 0, s = 0; m < largeArray.length; m++) {
      if (subArray[s] == largeArray[m]) {
        s++;
      } else {
        if (s != 0) {
          s = prefixArr[s - 1];
          m--;
        }
      }
      if (s == subArray.length) {
        indexToReturn = m - subArray.length + 1;
        break;
      }
    }

    return indexToReturn;
  }

  private static int[] getPrefixArr(int[] subArray) {
    int[] prefixArr = new int[subArray.length];
    prefixArr[0] = 0;

    for (int i = 1, j = 0; i < prefixArr.length; i++) {
      while (subArray[i] != subArray[j]) {
        if (j == 0) {
          break;
        }
        j = prefixArr[j - 1];
      }

      if (subArray[i] == subArray[j]) {
        prefixArr[i] = j + 1;
        j++;
      } else {
        prefixArr[i] = j;
      }

    }
    return prefixArr;
  }
frianH
  • 7,295
  • 6
  • 20
  • 45
Kumar Gaurav
  • 729
  • 3
  • 9
  • 21
3

Well, off the top of my head:

  1. Yes, should be static.

  2. A company complaining about that would not be worth working for.

  3. Yeah, but what would you do? Return? Or throw an exception? It'll throw an exception the way it is already.

  4. I think the main problem is that your code is not very elegant. Too many checks in the inner loop. Too many redundant checks.

Just raw, off the top of my head:

public int findArray(int[] largeArray, int[] subArray) {

    int subArrayLength = subArray.length;

    if (subArrayLength == 0) {
        return -1;
    }

    int limit = largeArray.length - subArrayLength;

    int i=0;

    for (int i = 0; i <= limit; i++) {
        boolean subArrayFound = true;

        for (int j = 0; j < subArrayLength; j++) {
            if (subArray[j] != largeArray[i+j]) {
                subArrayFound = false;
                break;
            }

        /* Sub array found - return its index */
        if (subArrayFound) {
            return i;
        }
    }

    /* Return default value */
    return -1;
}

You could keep that check for the first element so you don't have the overhead of setting up the boolean and the for loop for every single element in the array. Then you'd be looking at

public int findArray(int[] largeArray, int[] subArray) {

    int subArrayLength = subArray.length;

    if (subArrayLength == 0) {
        return -1;
    }

    int limit = largeArray.length - subArrayLength;

    for (int i = 0; i <= limit; i++) {
        if (subArray[0] == largeArray[i]) {
            boolean subArrayFound = true;

            for (int j = 1; j < subArrayLength; j++) {
                if (subArray[j] != largeArray[i+j]) {
                    subArrayFound = false;
                    break;
                }

            /* Sub array found - return its index */
            if (subArrayFound) {
                return i;
            }
        }
    }

    /* Return default value */
    return -1;
}
user987339
  • 10,519
  • 8
  • 40
  • 45
EboMike
  • 76,846
  • 14
  • 164
  • 167
1

A little bit optimized code that was posted before:

public int findArray(byte[] largeArray, byte[] subArray) {
    if (subArray.length == 0) {
        return -1;
    }
    int limit = largeArray.length - subArray.length;
    next:
    for (int i = 0; i <= limit; i++) {
        for (int j = 0; j < subArray.length; j++) {
            if (subArray[j] != largeArray[i+j]) {
                continue next;
            }
        }
        /* Sub array found - return its index */
        return i;
    }
    /* Return default value */
    return -1;
}
Anthony
  • 12,407
  • 12
  • 64
  • 88
0

Here's #indexOf from String:

/**
 * Code shared by String and StringBuffer to do searches. The
 * source is the character array being searched, and the target
 * is the string being searched for.
 *
 * @param   source       the characters being searched.
 * @param   sourceOffset offset of the source string.
 * @param   sourceCount  count of the source string.
 * @param   target       the characters being searched for.
 * @param   targetOffset offset of the target string.
 * @param   targetCount  count of the target string.
 * @param   fromIndex    the index to begin searching from.
 */
static int indexOf(char[] source, int sourceOffset, int sourceCount,
        char[] target, int targetOffset, int targetCount,
        int fromIndex) {
    if (fromIndex >= sourceCount) {
        return (targetCount == 0 ? sourceCount : -1);
    }
    if (fromIndex < 0) {
        fromIndex = 0;
    }
    if (targetCount == 0) {
        return fromIndex;
    }

    char first = target[targetOffset];
    int max = sourceOffset + (sourceCount - targetCount);

    for (int i = sourceOffset + fromIndex; i <= max; i++) {
        /* Look for first character. */
        if (source[i] != first) {
            while (++i <= max && source[i] != first);
        }

        /* Found first character, now look at the rest of v2 */
        if (i <= max) {
            int j = i + 1;
            int end = j + targetCount - 1;
            for (int k = targetOffset + 1; j < end && source[j]
                    == target[k]; j++, k++);

            if (j == end) {
                /* Found whole string. */
                return i - sourceOffset;
            }
        }
    }
    return -1;
}
Kong
  • 8,792
  • 15
  • 68
  • 98
0

First to your possible reasons:

  1. Yes. And the class final with a private constructor.
  2. Shouldn't use this kind of comments at all. The code should be self-explanatory.
  3. You're basically implicitly checking for null by accessing the length field which will throw a NullPointerException. Only in the case of a largeArray.length == 0 and a subArray == null will this slip through.

More potential reasons:

  • The class doesn't contain any function for array manipulations, opposed to what the documentation says.
  • The documentation for the method is very sparse. It should state when and which exceptions are thrown (e.g. NullPointerException) and which return value to expect if the second array isn't found or if it is empty.
  • The code is more complex than needed.
    1. Why is the equality of the first elements so important that it gets its own check?
    2. In the first loop, it is assumed that the second array will be found, which is unintentional.
    3. Unneeded variable and jump (boolean and break), further reducing legibility.
    4. largeArray.length <= i+j is not easy to grasp. Should be checked before the loop, improving the performance along the way.
    5. I'd swap the operands of subArray[j] != largeArray[i+j]. Seems more natural to me.
    6. All in all too long.
  • The test code is lacking more edge cases (null arrays, first array empty, both arrays empty, first array contained in second array, second array contained multiple times etc.).
  • Why is the last test case named testFindArrayExistsVeryComplex?

What the exercise is missing is a specification of the component type of the array parameters, respectively the signature of the method. It makes a huge difference whether the component type is a primitive type or a reference type. The solution of adietrich assumes a reference type (thus could be generified as further improvement), mine assumes a primitive type (int).

So here's my shot, concentrating on the code / disregarding documentation and tests:

public final class ArrayUtils {
    // main method

    public static int indexOf(int[] haystack, int[] needle) {
        return indexOf(haystack, needle, 0);
    }

    // helper methods

    private static int indexOf(int[] haystack, int[] needle, int fromIndex) {
        for (int i = fromIndex; i < haystack.length - needle.length; i++) {
            if (containsAt(haystack, needle, i)) {
                return i;
            }
        }
        return -1;
    }

    private static boolean containsAt(int[] haystack, int[] needle, int offset) {
        for (int i = 0; i < needle.length; i++) {
            if (haystack[i + offset] != needle[i]) {
                return false;
            }
        }
        return true;
    }

    // prevent initialization

    private ArrayUtils() {}
}
Community
  • 1
  • 1
xehpuk
  • 7,814
  • 3
  • 30
  • 54
0

I would suggest the following improvements:

  • make the function static so that you can avoid creating an instance
  • the outer loop condition could be i <= largeArray.length-subArray.length, to avoid a test inside the loop
  • remove the test (largeArray[i] == subArray[0]) that is redundant
Kick Buttowski
  • 6,709
  • 13
  • 37
  • 58
Maurice Perry
  • 32,610
  • 9
  • 70
  • 97
  • Thanks Maurice Perry. Yes, fully agreed with static, not sure why I didn't make it static, may be because was in a hurry. Second point - yes, that's my fault, may be that's what they didn't like. I added `largeArray[i] == subArray[0]` check intentionally for better optimisation and understanding... – Roman Oct 15 '10 at 08:20
0
int findSubArr(int[] arr,int[] subarr)
{
    int lim=arr.length-subarr.length;

    for(int i=0;i<=lim;i++)
    {
        int[] tmpArr=Arrays.copyOfRange(arr,i,i+subarr.length);
        if(Arrays.equals(tmpArr,subarr))
            return i;   //returns starting index of sub array
    }
    return -1;//return -1 on finding no sub-array   
}

UPDATE:

By reusing the same int array instance:

int findSubArr(int[] arr,int[] subarr)
{
    int lim=arr.length-subarr.length;
    int[] tmpArr=new int[subarr.length];
    for(int i=0;i<=lim;i++)
    {
        System.arraycopy(arr,i,tmpArr,0,subarr.length);
        if(Arrays.equals(tmpArr,subarr))
          return i; //returns starting index of sub array
    }
    return -1;//return -1 on finding no sub-array   

}
Emil
  • 13,577
  • 18
  • 69
  • 108
  • 2
    Create a new array during every single iteration?!? – EboMike Oct 15 '10 at 08:09
  • Very short solution and easy to understand, but as EboMike noticed would require creating too many arrays. – Roman Oct 15 '10 at 08:16
  • That eliminates the (relatively cheap) creation of the empty array, but does not eliminate the (expensive) array copy. – Marc Oct 15 '10 at 13:26
0
    byte[] arr1 = {1, 2, 3, 4, 5, 6, 7, 7, 8, 9, 1, 3, 4, 56, 6, 7};
    byte[] arr2 = {9, 1, 3};

    boolean i = IsContainsSubArray(arr1, arr2);

 public static boolean IsContainsSubArray(byte[] Large_Array, byte[] Sub_Array){
    try {
        int Large_Array_size, Sub_Array_size, k = 0;

        Large_Array_size = Large_Array.length;
        Sub_Array_size = Sub_Array.length;

        if (Sub_Array_size > Large_Array_size) {
            return false;
        }
        for (int i = 0; i < Large_Array_size; i++) {
            if (Large_Array[i] == Sub_Array[k]) {
                k++;
            } else {
                k = 0;
            }
            if (k == Sub_Array_size) {
                return true;
            }
        }
    } catch (Exception e) {
    }
    return false;
}
0

Code from Guava:

import javax.annotation.Nullable;

/**
 * Ensures that an object reference passed as a parameter to the calling method is not null.
 *
 * @param reference an object reference
 * @param errorMessage the exception message to use if the check fails; will be converted to a
 *     string using {@link String#valueOf(Object)}
 * @return the non-null reference that was validated
 * @throws NullPointerException if {@code reference} is null
 */
public static <T> T checkNotNull(T reference, @Nullable Object errorMessage) {
    if (reference == null) {
        throw new NullPointerException(String.valueOf(errorMessage));
    }
    return reference;
}


/**
 * Returns the start position of the first occurrence of the specified {@code
 * target} within {@code array}, or {@code -1} if there is no such occurrence.
 *
 * <p>More formally, returns the lowest index {@code i} such that {@code
 * java.util.Arrays.copyOfRange(array, i, i + target.length)} contains exactly
 * the same elements as {@code target}.
 *
 * @param array the array to search for the sequence {@code target}
 * @param target the array to search for as a sub-sequence of {@code array}
 */
public static int indexOf(int[] array, int[] target) {
    checkNotNull(array, "array");
    checkNotNull(target, "target");
    if (target.length == 0) {
        return 0;
    }

    outer:
    for (int i = 0; i < array.length - target.length + 1; i++) {
        for (int j = 0; j < target.length; j++) {
            if (array[i + j] != target[j]) {
                continue outer;
            }
        }
        return i;
    }
    return -1;
}
Simba
  • 1
0

I would to do it in three ways:

  1. Using no imports i.e. using plain Java statements.

  2. Using JAVA core APIs - to some extent or to much extent.

  3. Using string pattern search algorithms like KMP etc. (Probably the most optimized one.)

1,2 and 3 are all shown above in the answers. Here is approach 2 from my side:

public static void findArray(int[] array, int[] subArray) {

        if (subArray.length > array.length) {
            return;
        }

        if (array == null || subArray == null) {
            return;
        }

        if (array.length == 0 || subArray.length == 0) {
            return;
        }

        //Solution 1
        List<Integer> master = Arrays.stream(array).boxed().collect(Collectors.toList());
        List<Integer> pattern = IntStream.of(subArray).boxed().collect(Collectors.toList());

        System.out.println(Collections.indexOfSubList(master, pattern));

        //Solution2
        for (int i = 0; i <= array.length - subArray.length; i++) {
            String s = Arrays.toString(Arrays.copyOfRange(array, i, i + subArray.length));

            if (s.equals(Arrays.toString(subArray))) {
                System.out.println("Found at:" + i);
                return;
            }
        }
        System.out.println("Not found.");
    }
UzumakiL
  • 373
  • 3
  • 9
0

Using java 8 and lambda expressions:

String[] smallArray = {"1","2","3"};
final String[] bigArray = {"0","1","2","3","4"};
boolean result = Arrays.stream(smallArray).allMatch(s -> Arrays.stream(bigArray).anyMatch(b -> b.equals(s)));

PS: is important to have finalString[] bigArray for enclosing space of lambda expression.

octavian09
  • 139
  • 1
  • 3
0

FYI: if the goal is simply to search wether an array y is a subset of an array x, we can use this:

val x = Array(1,2,3,4,5)
val y = Array(3,4,5)
val z = Array(3,4,8)
x.containsSlice(y) // true
x.containsSlice(z) // false
Boris
  • 1,093
  • 2
  • 14
  • 22