0

After I moved from C++ to java, I have trouble writing correct code for matching std::upper_bound and std::lower_bound logic from C++ Vector.

So it causes problems in interviews if lower bound is being used in the coding questions.


Example

upper_bound gives the the max index of element in array.
Suppose Arr[]= {1,2,3,4,4} : Upper bound for 4 -> 5 index Similarly lower bound for 4 -> 4 index

For cases where element not found:
Similarly upper bound for element 6 in the above array will be index 5
Lower bound for element 0 in above array will be index 0


I just wanted to ask how to come up with the upper bound and lower bound code without hit and trial method. I just need a good explanation.

Upper bound:

It returns an iterator pointing to the first element in the range [first, last) that is greater than value, or last if no such element is found. The elements in the range shall already be sorted or at least partitioned with respect to val.

Lower Bound

The lower_bound() method in C++ is used to return an iterator pointing to the first element in the range [first, last) which has a value not less than val. This means that the function returns the index of the next smallest number just greater than or equal to that number. If there are multiple values that are equal to val, lower_bound() returns the index of the first such value.

My code:

static int lowerbound(int size, int arr[],int element){
    if(arr[size -1]<=element)
        return size-1;

    if(element<arr[0])
        return 0;

    int low=0;
    int high=size-1;
    while(low<high){
        int mid=(low+high)/2;
        if(arr[mid]<element) {
            low = mid+1;
        }else{
            high=mid;
        }
//        System.out.println(low+" "+mid+ " "+high);
    }

    return low;
}
public static void main (String[] args) {
    int arr[]={1,2,6,10,10,13};
    System.out.println(lowerbound(6,arr,14));//5
    System.out.println(lowerbound(6,arr,11));//5
    System.out.println(lowerbound(6,arr,10));//3
    System.out.println(lowerbound(6,arr,3));//2

    int arr2[]={1,2,6,10,10,10,13};
    System.out.println(lowerbound(7,arr2,14));//6
    System.out.println(lowerbound(7,arr2,11));//6
    System.out.println(lowerbound(7,arr2,10));//3
    System.out.println(lowerbound(7,arr2,3));//2
}

It took me too much time and debugging to arrive at the correct algorithm. Can anybody explain how to remember or figure out the logic easily?

Mark Rotteveel
  • 100,966
  • 191
  • 140
  • 197
Freez
  • 53
  • 9
  • What's an upper and lower bound in this case? I tried to interpret it as upper/lower bound on runtime complexity, of numerical types, and of type constraints, but I still don't know what you mean. – that other guy Jul 29 '21 at 22:23
  • Could you explain what is "upper" and "lower bound", please? – 0009laH Jul 29 '21 at 22:23
  • Updated the question @the other guy – Freez Jul 29 '21 at 22:29
  • Java has no such methods for finding this. How do we come up with code? Unit tests, most commonly, followed by trial and error – OneCricketeer Jul 29 '21 at 22:31
  • I am not able to understand the logic behind upper bound and lower bound. Though it is such a simple code in theory. Like any kind of pseudo code to understand it would be nice. This is the only thing i find complex across all the questions i have worked on. – Freez Jul 29 '21 at 22:35
  • The logic is easy. List out all the numbers along with their index. Given `n` to find, scan all numbers for `n`. If the number is not found, return 0 for lower and array-size for upper. If you find `n` and are looking for lower, return its index. If you are looking for upper keep scanning until the next number does not match and return the max found index. – OneCricketeer Jul 29 '21 at 22:38
  • 1
    It would be better if you [edit] your question with code you _did write_ and we can show you where you went wrong, or suggest improvements. Refer [help] on asking good questions – OneCricketeer Jul 29 '21 at 22:41
  • 1
    @OneCricketeer the O(N) solution is easy to write i was taliing about O(logn) solution – Freez Jul 29 '21 at 23:27
  • If you find yourself using trial and error in this kind of coding, a possible reason is that you're not mindful of your invariants. Guessing happens when you e.g. just think "lower" instead of "strictly lower" vs "lower or equal". For example, you shouldn't have to guess whether it's `low = mid` or `low = mid + 1` because you know that `arr[mid] < element` means `low` must be *strictly* greater than mid. – that other guy Jul 29 '21 at 23:41
  • https://stackoverflow.com/questions/39416560/how-can-i-simplify-this-working-binary-search-code-in-c/39417165#39417165 – Matt Timmermans Jul 30 '21 at 02:05
  • Sorry, your original post said nothing about run-time complexity – OneCricketeer Jul 30 '21 at 05:16

1 Answers1

0

Lower bound is not binary search, it uses binary search to find the index of the first element in the range [first, last) which has a value not less than val.

All code is untested and use at own risk.

static int lowerbound(int size, int arr[],int element) {
    if(arr[size -1]<=element)
        return size-1;

    if(element<arr[0])
        return 0;

    int low=0;
    int high=size-1;
    while(low<high){
        int mid=(low+high)/2;
        if(arr[mid]<element) {
            low = mid+1;
        }else{ // binary search could short cut here with extra check for equality
            high=mid;
        } 
    }

    return low;
}

First a couple of bugs.

return an iterator pointing to the first element in the range [first, last) which has a value not less than val.

    if(arr[size -1]<=element)
        return size-1;

with

    int arr1[]={1};
    System.out.println(lowerbound(1,arr,2));//1

This returns 0 (zero) where it should return 1, the caller should then check the result. The correct way so be this, if the last index is not greater or equal return the one beyond end index

    if(!(arr[size-1] >= element))
        return size;

now

    System.out.println(lowerbound(6,arr,14));//5->6
    System.out.println(lowerbound(7,arr2,14));//6->7

returns 6 and 7 as they should.

Potential overflow

        int mid=(low+high)/2;

if low + high > int max then the result overflows and you get a wrong result, you can use

        int mid=low+((high-low)/2);

The first two if's are not needed to understand the essence in the algorithm, by changing

    int low=0;
    int high=size-1;

to

    int low=0;
    int high=size; // the algorithm insures that arr[high] never will be accessed

You will get the same behaviour without the ifs.

static int lowerbound(int size, int arr[], int element) {
    int low = 0;
    int high = size;

    while (low<high) {
        int mid = low+((high-low)/2); // mid point rounded down, no overflow

        if (arr[mid]<element) {
            low = mid+1; // as mid was not the looked for value we can ignore it next time
        } else {
            high = mid;  // as mid is rounded down high will be lower than before
        }
    }

    return low;
}

As conclusion using the open ended array [low, high) the algorithm must do the following things when using binary search

  • don't overflow
  • reduce the search range by approximate half
  • reduce the search range by at least 1

To test it you can do the following

static int midpoint(int low, int high) {
    int mid = low+((high-low)/2); // mid point rounded down, no overflow
    return mid;
}

static int midpointtestoverflow() {
    int high = integer.MAX_VALUE;
    int low = high-1;

    assert low == midpoint(low, high);
}

Reducing the search range by approximately half is done by using the midpoint.

Reducing the search range by at least 1 is a somewhat lose prove, based on the rounding down of midpoint.

  • When the range size is zero we are at the correct index and can end.
  • When the range size (high-low) is larger than 1 there is always mid point within the range that reduce the size.
  • When the range size is 1 either low or high will be equal to mid and due to the rounding down it will be low.
  • Therefore when the lower part is discarded we will need to use mid+1 as low could be equal to mid.
  • High is never equal to mid so setting high = mid reduces the range.

Further tests that you can use to check the correctness

  • size zero array
  • size one array with element less, equal and greater than the one value
  • size two array with element between the two values (all other cases should be covered by the size one array)

The slightly paranoid will assume the users lie to the algorithm and will check the length of the array.

static int lowerbound(int size, int arr[], int element) {
    int low = 0;
    int high = size;
    assert size >= 0;
    assert size < arr.length;

or if the user is only allowed to search the whole array, you can dump the size and just assign high = arr.length.

Surt
  • 15,501
  • 3
  • 23
  • 39
  • Nice approach. For such a small code, it much have taken some time to solve this checking all the test cases.As you know c++ provides a function lower_bound for solving these types of questions but java doesn't. So it becomes complex for java developers to write code when a big problem is given and you have to code this small section too – Freez Aug 24 '21 at 19:36
  • @Freez I'm sure the actual tests in the std library are more comprehensive than this. – Surt Aug 25 '21 at 21:13