1

Lets say I have range of integers [l, r) and a function check(int idx) which satisfies the following condition: there is an index t (l <= t < r) such that for each i (l <= i <= t) check(i) == true and for each j (t < j < r) check(j) == false. Is there a standard way to find index t? Standard binary_search() needs comparator that takes two arguments, so it can't be applied here (correct me if I'm wrong).

ArthurBesse
  • 325
  • 1
  • 12
  • first of all, `std::binary_search` does not do what you need. It just checks whether a range contains the element - it does not return the index. https://stackoverflow.com/questions/446296/where-can-i-get-a-useful-c-binary-search-algorithm – Sergey Kolesnik Dec 13 '21 at 08:34
  • `t` is known, so even if `check` only takes one argument, you have 2 indexes to compare. – super Dec 13 '21 at 08:36
  • 7
    [`std::partition_point`](https://en.cppreference.com/w/cpp/algorithm/partition_point)? – Jarod42 Dec 13 '21 at 08:41
  • Along with using `std::distance` after getting the partition point. That will find you the index. So given your description, the solution is a one or two line function. – PaulMcKenzie Dec 13 '21 at 08:44
  • I’m not sure if std::partition works on a range of integers – vdavid Dec 13 '21 at 09:25
  • Can we assume that the range is *already* partitioned or that's something the function needs to check too? – Bob__ Dec 13 '21 at 09:30
  • @Bob__ yes, we can assume that. – ArthurBesse Dec 13 '21 at 11:05

1 Answers1

3

Assuming you are searching for a continuous range of integers (and not, for example, an indexed array) I would suggest a dichotomic search:

int find_t(int l, int r) {
    // Preconditions
    assert(check(l) == true);
    //assert(check(r) == false); // this precondition is not mandatory

    int max_idx_true = l; // highest known integer which satisfies check(idx) == true
    int min_idx_false = r; // lowest known integer which satisfies check(idx) == false

    while (max_idx_true+1 < min_idx_false) {
        
        int mid_idx = (max_idx_true+min_idx_false)/2;

        if (check(mid_idx)) max_idx_true = mid_idx;
        else min_idx_false = mid_idx;
    }

    int t = max_idx_true;

    // Postconditions
    assert(check(t) == true);
    assert(t+1 == r || check(t+1) == false);

    return t;
}

This algorithm narrows the closest integers where check(idx) is true and the next one is false. In your case, you are looking for t which corresponds to max_idx_true.

It should be noted that the following preconditions must be satisfied for this to work:

  • l < r
  • check(l) is true
  • for any idx, if check(idx) is true then check(idx-1) is always true
  • for any idx, if check(idx) is false then check(idx+1) is always false

Below is a source code example for testing the algorithm and output lines to better understand how it works. You can also try it out here.

#include <iostream>
#include <cassert>
using namespace std;

// Replace this function by your own check
bool check(int idx) {
    return idx <= 42;
}

int find_t(int l, int r) {
    assert(check(l) == true);
    //assert(check(r) == false); // this precondition is not mandatory

    int max_idx_true = l; // highest known integer which satisfies check(idx) == true
    int min_idx_false = r; // lowest known integer which satisfies check(idx) == false
    int n = 0; // Number of iterations, not needed but helps analyzing the complexity

    while (max_idx_true+1 < min_idx_false) {
        ++n;
        
        int mid_idx = (max_idx_true+min_idx_false)/2;

        // Analyze the algorithm in detail
        cerr << "Iteration #" << n;
        cerr << " in range [" << max_idx_true << ", " << min_idx_false << ")";
        cerr << " the midpoint " << mid_idx << " is " << boolalpha << check(mid_idx) << noboolalpha;
        cerr << endl;
    
        if (check(mid_idx)) max_idx_true = mid_idx;
        else min_idx_false = mid_idx;
    }

    int t = max_idx_true;

    assert(check(t) == true);
    assert(t+1 == r || check(t+1) == false);

    return t;
}

int main() {
    // Initial constants
    int l = 0;
    int r = 100;

    int t = find_t(l, r);

    cout << "The answer is " << t << endl;

    return 0;
}

The main advantage of the dichomotic search is that it finds your candidate with a complexity of only O(log2(N)).

For example if you initialize int l = -2000000000 and int r = 2000000000 (+/- 2 billions) you need to known the answer in about 4 billion numbers, yet the number of iterations will be 32 at worst.

vdavid
  • 2,434
  • 1
  • 14
  • 15