Efficient implementation Binary Search

Question

I have an algorithmic test about implementing of binary search that works in a maximum amount of time of 2 seconds.

First, I've implemented a recursive version of binary search, but it was taking almost 3.6 seconds to finish in some test cases. Then, I changed it to an iterative version, but it takes 2.6 seconds in the same test case. However, I think that using a while loop is a reason why it takes a lot of time.

My question is: What do I need to improve to make it take a maximum 2 seconds?

#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

int iterBinarySearch(vector<int> A, int low, int high, int key) {
    int mid;
    while (low <= high) {
        mid = low + ((high - low)/2);
        if (key < A[mid]) {
            high = mid -1;
        } else if (key > A[mid]) {
            low = mid +1;
        } else {
            return mid;
        }
    }
    return -1;
}

int main() {

    vector<int>dict;
    vector<int>keys;

    int dictSize;
    cin >> dictSize;
    while (dictSize--) {
        int val;
        cin >> val;
        dict.push_back(val);
    }

    int keysSize;
    cin >> keysSize;
    while (keysSize--) {
        int val;
        cin >> val;
        keys.push_back(val);
    }

    sort(dict.begin(), dict.end());
    int size = (int)dict.size() -1;
    for(int i = 0; i< keys.size(); ++i) {
        if ((dict[0] > keys[i]) || (dict[size] < keys[i])) {
            cout << "-1" << ' ';
        } else {
            int res = iterBinarySearch(dict, 0, size, keys[i]);
            cout << res << ' ';
        }
    }
    return 0;
}

Two things to try: turn on optimizations, and don't take the `vector` by value (take by `const&`) — Justin, Nov 01 '17 at 08:00
Also call `vector::reserve()` as soon as you have read `dictSize` and `keysSize`. — gudok, Nov 01 '17 at 08:04
Why don't you use [binary_search](http://www.cplusplus.com/reference/algorithm/binary_search/)? — Ahmad Khan, Nov 01 '17 at 08:05
@Justin do you mean to change the function parameter to `vector &A` — Ennabah, Nov 01 '17 at 08:11
@StoryTeller Thanks for clarification, just to make it clear I get it as I'm not that expertise in cpp, does it mean to do: `vector const& dict;`? If not, please show me an example of what is correct. and sorry for misunderstanding. — Ennabah, Nov 01 '17 at 08:20
@MEnnabah - No. You need to modify the function parameter. `vector const &A`. It will bind to `dict` just fine (no need to change anything else). And an optimizing compiler can do wonders on the assumption of `const`-ness. — StoryTeller - Unslander Monica, Nov 01 '17 at 08:22
How big are the vectors in the failing cases? Do you have the test cases available locally? Can you run them with a profiler to see where the time is going? — Useless, Nov 01 '17 at 08:24
Off-topic: Negative indices are invalid anyway - using `unsigned int` for `high` and `low` would reflect this in the API... — Aconcagua, Nov 01 '17 at 08:25
@Useless I don't have access to the test cases actually but I have access to the constraints: the vector size is at most `100K` elements. and the and each element is at most `10^9`. — Ennabah, Nov 01 '17 at 08:27
@Aconcagua please don't gratuitously advertise unsigned integers (see many other questions on this site for debates on the subject). — Marc Glisse, Nov 01 '17 at 09:24
@MarcGlisse `low` and `high` are indices - valid range of indices is non-negative numbers (well, subset of, at least). Do you consider reflecting this already in the data type a bad idea? — Aconcagua, Nov 01 '17 at 09:35
@Aconcagua please search this site for discussions on the subject. Reflecting non-negativeness in the data type would not be a bad idea, but the C/C++ unsigned types do not fit that bill as well as you believe. — Marc Glisse, Nov 02 '17 at 07:36
@MarcGlisse So I digged a little as you adviced, best I found was [here](https://stackoverflow.com/a/41999668/1312382) and [here](https://stackoverflow.com/a/3261019/1312382). The unsigned data types *do* fit the bill very well as far as *I* define it, which is just expressing "one shall not pass negative values", *not* trying to make it impossible... If public API, range checks would have to be added anyway. — Aconcagua, Nov 03 '17 at 11:03

score 3 · Accepted Answer · answered Nov 01 '17 at 08:40

Only two things are obviously wasteful:

int iterBinarySearch(vector<int> A, int low, int high, int key) copies the vector (which could have 100,000 elements from your comment), whereas

int iterBinarySearch(const vector<int> &A, int low, int high, int key) (or any of the other const-ref spellings) will search your original vector directly, with no copying
your initial push_back to the dict and key vectors is wasteful when you know the size in advance: because you didn't tell the vector how large it's going to be, it has to keep resizing and copying. Just add
```
    cin >> dictSize;
    dict.reserve(dictSize); // grow to the correct size just once
    while (dictSize--) {
      int val;
      cin >> val;
      dict.push_back(val);
    }
```
and the same for the keys.

Now, apart from these two things that jump out, you should ideally try to profile your code rather than just guessing where the slowness is.

@MEnnabah Did you retry the recursive approach? It should work just fine with the vector reference ;) — grek40, Nov 01 '17 at 09:03
@grek40 Actually, no, I didn't retry the recursive approach. But it would be great if it worked in the same duration for all test cases, I'll consider retrying it and check the amount of time it takes. — Ennabah, Nov 01 '17 at 10:35

Gor · Answer 2 · 2017-11-01T08:18:59.127

2

1. The main problem is when you pass dict argument as value.

Just pass it as const reference.

int iterBinarySearch(const vector<int> &A, int low, int high, int key) {
    // your code 
}

2. Also try to change this line

mid = low + ((high - low)/2);

to

mid = (low + high)/2;

NOTE: Make second change only if your vector size is not bigger than INT_MAX / 2.

edited Nov 01 '17 at 08:18

answered Nov 01 '17 at 08:04

Gor

2,808
6
25
46

1

Sometimes that helps prevent overflow. – Gaurav Sehgal Nov 01 '17 at 08:07
I understand, but if he cares about performance, it can help. – Gor Nov 01 '17 at 08:08
9

When dealing with performance, don't optimize by guessing. The calculation of `mid` is unlikely to be a major bottleneck. – StoryTeller - Unslander Monica Nov 01 '17 at 08:10
@Gor the second point is not a correct thing to do, since I'm changing the `low` and `high` values, I need to move the `mid`. If i did this, it will always calculate the `mid` with entire vector size, and this is not correct. – Ennabah Nov 01 '17 at 08:24
what is the point of changing the calculation of `mid`? If this is for performance, then it is not clear at all why your version is supposed to be faster – 463035818_is_not_an_ai Nov 01 '17 at 08:28
@tobi303 this is because when I don't find the key in the `mid` index I need to check if the key is in the upper half or in the lower half, and depending on this, I need to change the high and low values and calculate the `mid` point for the upper or lower halves. however, this answer really solved the problem with the first point only. and I need to accept it as the solving answer but the second point is not correct for my situation. – Ennabah Nov 01 '17 at 08:35
@Gor Thanks, the first point solved my problem. However, the second point is incorrect. Please remove it so I can accept the answer. -Check my previous comment for the reason- – Ennabah Nov 01 '17 at 08:37
@MEnnabah If there is no overflow the two should yield the exact same result, I just dont see how this is supposed to improve performance – 463035818_is_not_an_ai Nov 01 '17 at 08:41
@MEnnabah: I don't get you concerns about `mid` computation. The one problem I see in the proposed change is the (theoretical) occurrence of overflow and undefined behaviour: *if and when* `INT_MAX < (unsigned long)low + high`. (There used to be a time when `size_t` would have been the advisable type for `low` and `high`.) – greybeard Nov 01 '17 at 08:47
@MEnnabah Given in your own comment, size is at most 100k - so an overflow cannot occur, and thus the proposed calculation is correct for the given use case. However, this saves just one single subtraction which on modern hardware would cost just some nanoseconds, so - see StoryTeller's first comment... – Aconcagua Nov 01 '17 at 08:50
@Aconcagua I already discussed with tobi303 why this calculation is NOT correct. see my comment. However, if you need to discuss it, let's do it in a chat. – Ennabah Nov 01 '17 at 08:57
@MEnnabah `low + (high - low)/2` <=> `(2*low + high - low)/2` <=> `(low + high)/2`... With real (mathematical) numbers, always correct. With integers, only if no overflow can occur, i. e. `low + high` not getting greater than `[U]INT_MAX` (if using unsigned int, overflow would not be UB, but still incorrect in this case). – Aconcagua Nov 01 '17 at 09:10
@Aconcagua you need to take into account changing `low` and `high` values in each iteration. – Ennabah Nov 01 '17 at 09:16
They will change in each iteration - but not during calculation of mid! `low` will have the same value within this single calculation (unless if this operation could be preempted *and* the preempting process, signal- or interrupt-handler might change it -- but then `low` would most likely be some global *volatile* variable). – Aconcagua Nov 01 '17 at 09:21

Aconcagua · Answer 3 · 2017-11-01T09:03:33.743

Passing the vector as const reference as mentioned already is a major point, using reserve another one. Not allocating the keys at all can give you some further performance, too:

sort(dict.begin(), dict.end());

int keysSize;
cin >> keysSize;

// this is a constant loop constraint, so move it out, too...
int size = (int)dict.size() - 1;

while (keysSize--)
{
    int val;
    cin >> val;

    if (val < dict[0] || val > dict[size])
    {
        cout << "-1" << ' ';
    }
    else
    {
        int res = iterBinarySearch(dict, 0, size, keys[i]);
        cout << res << ' ';
    }
}
return 0;

You can safe one additional function call:

cout << "-1 ";

Sure, won't give you too much, but that's so simple so I mention it anyway...

Just a side note: When dealing with values that cannot get negative by their nature (sizes, indices to arrays, etc) I would prefer the unsigned counter parts of the signed data types (unsigned int in your case). This will not have any impact on performance at all, as with modern two's complement architectures, exactly the same operations will be used (apart from some comparisons perhaps), just shows more clearly the intent and (part of) the valid range of the variable right away from the data type (one exception to mention: imagine you need int64_t for signed, but could do with uint32_t, and you have a 32-bit architecture, e. g. a micro controller – then you really get some minimal performance gain...).

I was afraid of using `unsigned` for such cases, but after your comment on the post, I believe I need to rethink of that. — Ennabah, Nov 03 '17 at 12:00

Efficient implementation Binary Search

3 Answers3