1

I want to find a target value 4 firstly appeared place in a sequence [1, 1, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 6]. when I use java.util.Arrays.binaySearch, it returns index is 9, but I expected 7.

I look java.util.Arrays.binaySearch source code

and I found some comments:

If the array contains multiple elements with the specified value, there is no guarantee which one will be found.

So how to implement a lower_bound binary search algorithm in Java, which returns the target value firstly appeared place.

Note: The lower_bound concept comes from C++, but I don't understand C++ well.

douyu
  • 2,377
  • 2
  • 14
  • 27
  • 1
    To find the first 4, treat 4 as greater than the target value, and 3 as less than the target value. **Don't** terminate the algorithm early when a value in the array is 4. – user3386109 Aug 20 '19 at 17:09
  • 1
    Just iterate backwards until you find a different value, storing the previous index (which you return at the end of the algorithm). – Jacob G. Aug 20 '19 at 17:27
  • 4
    @JacobG. Worst case, that results in an O(n) algorithm, which defeats the purpose of a binary search. – user3386109 Aug 20 '19 at 17:32

3 Answers3

3

I think the implementation below will do the job correctly:

int firstOccurrence(int[] sequence, int x) {
    int min = 0;
    int max = sequence.length - 1;

    int result = -1;

    while (min <= max)
    {
        // find the mid value and compare it with x
        int mid = min + ((max - min) / 2);

        // if x is found, update result and search towards left
        if (x == sequence[mid]) {
            result = mid;
            max = mid - 1;
        } else if (x < sequence[mid]) {
            // discard right half
            max = mid - 1;
        } else {
            // discard left half
            min = mid + 1;
        }
    }

    // return the leftmost index equal to x or -1 if not found
    return result;
}

Edit:

Change the way to compute mid to avoid overflow with larger sums

// Previously, can overflow since we add two integer
int mid = (min + max) / 2;

// Now
int mid = min + ((max - min) / 2);

// Another way using the unsigned right shift operator
int mid = (low + high) >>> 1;
// The left operands value (low + high) is moved right
// by the number of bits specified (2 in this case) by the right operand and
// shifted values are filled up with zeros.
// The >>> treats the value as unsigned
Ben Souchet
  • 1,450
  • 6
  • 20
1

Building on this answer to another binary search question:

How can I simplify this working Binary Search code in C?

This is a search that is equivalent to lower_bound from C++. It returns the number of elements smaller than the value you want to find. That would be the index of the first occurrence, or where one would be inserted if there is no occurrence:

int numSmaller(int[] seq, int valueToFind)
{
    int pos=0;
    int limit=seq.length;
    while(pos<limit)
    {
        int testpos = pos+((limit-pos)>>1);

        if (seq[testpos]<valueToFind)
            pos=testpos+1;
        else
            limit=testpos;
    }
    return pos;
}

Note that we only need to do one comparison per iteration.

The linked answer highlights several advantages of writing a binary search this way.

Matt Timmermans
  • 53,709
  • 3
  • 46
  • 87
  • If all elements in the sequence is greater than valueToFind you will return 0, this is an error, and if all elements in the sequence is smaller than valueToFind you will return length + 1, not a good return value. – Ben Souchet Aug 21 '19 at 05:59
  • 1
    If all elements are greater, I will return zero. That is what it's supposed to do. If all elements are less, then I will return length, which is also what it's supposed to do. – Matt Timmermans Aug 21 '19 at 06:05
  • And shifting right to do the division is not faster since the compiler will optimize `/ 2` but cannot do much with bit shifting, and not all programmers will understand this operation. – Ben Souchet Aug 21 '19 at 06:08
  • Shifting to the right is occasionally faster, but /2 also works just fine, even though it cannot be optimized to a shift unless the compiler can figure out that the operand is always positive. The important part of that expression is to avoid overflow – Matt Timmermans Aug 21 '19 at 06:09
  • So if all values in the sequence are greater than valueToFind you return 0 and how can you tell the difference between a 0 return because your the first element in the sequence is valueToFind and when we didn’t find the value? In the OP’s post the goal is to find the index when the value firstly appeared. – Ben Souchet Aug 21 '19 at 06:25
  • According to this [post](https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html) and this [tutorial](https://docs.oracle.com/javase/tutorial/java/nutsandbolts/op3.html) here you should use `>>>` rather than using `>>` in java. – douyu Aug 21 '19 at 07:18
  • 1
    @BSO you just check, of course. The OP asked for an equivalent of C++ lower_bound, and that is what I provided. I also explained exactly what it does an gave it a name that makes it obvious. It turns out that that this contract is also the most convenient and efficient for cases in which you specifically want to search for the position of the *first* occurrence. Think of some realistic use cases and try it. – Matt Timmermans Aug 21 '19 at 12:11
  • @Aoerz, do you understand why `>>>` is recommended? My search is not broken, since `pos + ((limit-pos)>>1)` will never overflow. `(pos + limit) >>> 1` also works, even though it *does* overflow. I think it's a little too clever. – Matt Timmermans Aug 21 '19 at 12:18
0

It think it will help you

public static boolean binarysearch(int[] data, int target, int low, int high){
    if(low>high){
        System.out.println("Target not found");
        return false;}
        else{
                int mid=(low+high)/2;
                if(target==data[mid])
                    return true;
                else if(target<data[mid])
                    return binarysearch(data, target, low, high);
                else
                    return binarysearch(data, target, low, high);
                }
}
Ijlal H
  • 69
  • 1
  • 8
  • Do you think a person asking how to implement a lower bound in java would benefit from a recursive code instead of iterative one? – n j Jan 14 '22 at 19:08