3

I am attempting to expand a function to find the number of integer matches through Binary Search by resetting the high variable, but it gets stuck in a loop. I am guessing a workaround would be to duplicate this function to obtain the last index to determine the number of matches, but I do not think this would be such an elegant solution.

From this:

public static Matches findMatches(int[] values, int query) {
    int firstMatchIndex = -1;
    int lastMatchIndex = -1;
    int numberOfMatches = 0;

    int low = 0;
    int mid = 0;
    int high = values[values.length - 1];
    boolean searchFirst = false;

    while (low <= high){
        mid = (low + high)/2;

        if (values[mid] == query && firstMatchIndex == -1){
            firstMatchIndex = mid;

            if (searchFirst){
                high = mid - 1;
                searchFirst = false;
            } else { 
                low = mid + 1;
            }

        } else if (query < values[mid]){
            high = mid - 1;
        } else {
            low = mid + 1;
        }           
    }

    if (firstMatchIndex != -1) { // First match index is set
        return new Matches(firstMatchIndex, numberOfMatches);
    }
    else { // First match index is not set
        return new Matches(-1, 0); 
    }
}

To something like this:

public static Matches findMatches(int[] values, int query) {
    int firstMatchIndex = -1;
    int lastMatchIndex = -1;
    int numberOfMatches = 0;

    int low = 0;
    int mid = 0;
    int high = values[values.length - 1];
    boolean searchFirst = false;

    while (low <= high){
        mid = (low + high)/2;

        if (values[mid] == query && firstMatchIndex == -1){
            firstMatchIndex = mid;

            if (searchFirst){
                high = values[values.length - 1]; // This is stuck in a loop
                searchFirst = false;
            } 
        } else if (values[mid] == query && lastMatchIndex == -1){
            lastMatchIndex = mid;

            if (!searchFirst){
                high = mid - 1;
            } else { 
                low = mid + 1;
            }
        } else if (query < values[mid]){
            high = mid - 1;
        } else {
            low = mid + 1;
        }

    }

    if (firstMatchIndex != -1) { // First match index is set
        return new Matches(firstMatchIndex, numberOfMatches);
    }
    else { // First match index is not set
        return new Matches(-1, 0); 
    }
}
imparante
  • 503
  • 9
  • 21
  • 1
    How about using binary search to find the index of a given number? Assuming -1 is returned if value is not found, you could use that index to find number of duplicates? E.g. Binary search returns index 5 when searching for the number '9', so I would search left & right of the index 5 for duplicates, and stop once there is no more duplicate. Number of matches would be `rightIndex - leftIndex + 1`, since the array of values is sorted. – almightyGOSU Jul 20 '15 at 03:39

5 Answers5

3

There is a problem with your code:

high = values[values.length - 1];

should be

high = values.length - 1;

Also you do not need variables like numberOfMatches and searchFirst, we can have rather simple solution.

Now coming to the problem,I understand what you want I think Binary Search is appropriate for such query.

The Best way to do the required is once a match is found you just go forward and backward from that index until a mismatch occurs and this would be both elegant and efficient in calculating the firstMatchIndex and numberOfMatches.

So your function should be:

public static Matches findMatches(int[] values, int query) 
{
 int firstMatchIndex = -1,lastMatchIndex=-1;
 int low = 0,mid = 0,high = values.length - 1;
 while (low <= high)
 {
      mid = (low + high)/2;

      if(values[mid]==query)
      {
          lastMatchIndex=mid;
          firstMatchIndex=mid;
          while(lastMatchIndex+1<values.length&&values[lastMatchIndex+1]==query)
           lastMatchIndex++;
          while(firstMatchIndex-1>=0&&values[firstMatchIndex-1]==query)
           firstMatchIndex--; 
          return new Matches(firstMatchIndex,lastMatchIndex-firstMatchIndex+1); 
      }
      else if(values[mid]>query)
       high=mid-1;
      else low=mid+1;
 }
 return new Matches(-1,0);
}          
Sumeet
  • 8,086
  • 3
  • 25
  • 45
  • Thanks for the catch Dante! I see your linear approach. It would work well in a few duplicates and small array. I continued with my approach and found a solution below. – imparante Jul 20 '15 at 05:39
0

Couldn't you just use something like a set to find duplicates?

Something like this:

package example;

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

public class DuplicatesExample {

    public static void main(String[] args) {
        String[] strings = { "one", "two", "two", "three", "four", "five", "six", "six" };
        List<String> dups = getDups(strings);
        System.out.println("DUPLICATES:");
        for(String str : dups) {
            System.out.println("\t" + str);
        }
    }

    private static List<String> getDups(String[] strings) {
        ArrayList<String> rtn = new ArrayList<String>();
        HashSet<String> set = new HashSet<>();
        for (String str : strings) {
            boolean added = set.add(str);
            if (added == false ) {
                rtn.add(str);
            }
        }
        return rtn;
    }

}

Output:

DUPLICATES:
    two
    six
John
  • 3,458
  • 4
  • 33
  • 54
  • (You could also return a set from the getDups method to get the distinct duplicates) – John Jul 20 '15 at 03:37
0

I have split your problems into two parts - using binary search to find a number and counting the number of matches. The first part is resolved by the search function while the second part is resolved by the findMatches function:

public static Matches findMatches(int[] values, int query) {

    int leftIndex = -1;
    int rightIndex = -1;
    int high = values.length - 1;

    int matchedIndex = search(values, 0, high, query);

    //if at least one match
    if (matchedIndex != -1) {

        //decrement upper bound of left array
        int leftHigh = matchedIndex - 1;
        //increment lower bound of right array
        int rightLow = matchedIndex + 1;

        //loop until no more duplicates in left array
        while (true) {

            int leftMatchedIndex = search(values, 0, leftHigh, query);

            //if duplicate found
            if (leftMatchedIndex != -1) {
                leftIndex = leftMatchedIndex;
                //decrement upper bound of left array
                leftHigh = leftMatchedIndex - 1;
            } else {
                break;
            }
        }

        //loop until no more duplicates in right array
        while(true){
            int rightMatchedIndex = search(values, rightLow, high, query);

            //if duplicate found
            if(rightMatchedIndex != -1){
                rightIndex = rightMatchedIndex;
                //increment lower bound of right array
                rightLow = rightMatchedIndex + 1;
            } else{
                break;
            }

        }

        return new Matches(matchedIndex, rightIndex - leftIndex + 1);

    }

    return new Matches(-1, 0);

}

private static int search(int[] values, int low, int high, int query) {

    while (low <= high) {
        int mid = (low + high) / 2;

        if (values[mid] == query) {
            return mid;
        } else if (query < values[mid]) {
            high = mid - 1;
        } else {
            low = mid + 1;
        }
    }

    return -1;

}
PythaLye
  • 314
  • 1
  • 8
  • @Gosu, did I correctly implement your algorithm? It seems complex and could be potentially improved. – PythaLye Jul 20 '15 at 06:53
0

I found a solution after correcting a mistake with resetting high variable that caused an infinite loop.

public static Matches findMatches(int[] values, int query) {
    int firstMatchIndex = -1;
    int lastMatchIndex = -1;
    int numberOfMatches = 0;

    int low = 0;
    int mid = 0;
    int high = values.length - 1;

    while (low <= high){
        mid = (low + high)/2;

        if (values[mid] == query && firstMatchIndex == -1){

            firstMatchIndex = mid;
            numberOfMatches++;
            high = values.length - 1;
            low = mid;

        } else if (values[mid] == query && (lastMatchIndex == -1 || lastMatchIndex != -1)){

            lastMatchIndex = mid;
            numberOfMatches++;

            if (query < values[mid]){
                high = mid - 1;
            } else { 
                low = mid + 1;
            }

        } else if (query < values[mid]){
            high = mid - 1;
        } else {
            low = mid + 1;
        }
    }

    if (firstMatchIndex != -1) { // First match index is set
        return new Matches(firstMatchIndex, numberOfMatches);
    }
    else { // First match index is not set
        return new Matches(-1, 0); 
    }
}
imparante
  • 503
  • 9
  • 21
  • The output of this program does not show the correct results. (lastMatchIndex == -1 || lastMatchIndex != -1) condition is always evaluated to be true. @jruser2120512 – PythaLye Jul 20 '15 at 06:52
  • @jruser2120512 your code is not showing the correct output my friend. – Sumeet Jul 20 '15 at 08:28
  • @Dante At that line, the lastMatchIndex will be checked in a loop after the firstMatchIndex has been found. I am passing these arguments: int[] values = {0, 1, 2, 3, 4, 4, 5, 6, 7, 8, 8, 8, 9}; // A pre-sorted array of integers. int query = 8; // The integer to search for. I am expect index 9 and 3 matches, which I get from my code. – imparante Jul 21 '15 at 06:24
  • 1
    @jruser2120512 Try this {1,2,3,3,3,4,5,5,5,5,6,7,7} with query 5, it is still giving wrong result my friend. – Sumeet Jul 21 '15 at 06:47
  • @Dante I see now. My approach with this continued binary search isn't going to work. Your solution with linear search seems to be best! – imparante Jul 22 '15 at 03:13
0

Not having any knowledge of the data other than sorted a priori is tough. See this: Binary Search O(log n) algorithm to find duplicate in sequential list?

This will find the first index of duplicates of k in a sorted array. Of course this is related to knowing the value of duplicate first but very useful when it is known.

    public static int searchFirstIndexOfK(int[] A, int k) {

     int left = 0, right = A.length - 1, result = -1;
     // [left : right] is the candidate set.
     while (left <= right) {
       int mid = left + ((right - left) >>> 1); // left + right >>> 1;
       if (A[mid] > k) {
         right = mid - 1;
       } else if (A[mid] == k) {
         result = mid;
         right = mid - 1; // Nothing to the right of mid can be
                                               // solution.
      } else { // A[mid] < k
      left = mid + 1;
      }
     }
     return result;
    }

This will find a dupe in log(n) time but is fragile in that the data must be sorted as well as increasing by 1 and in the range 1..n.

static int findeDupe(int[] array) {
int low = 0;
int high = array.length - 1;
while (low <= high) {
    int mid = (low + high) >>> 1;
    if (array[mid] == mid) {
    low = mid + 1;

    } else {
    high = mid - 1;

    }

}
System.out.println("returning" + high);
return high;

}
Community
  • 1
  • 1
Droid Teahouse
  • 893
  • 8
  • 15