Optimize Binary Search Algorithm

Question

In a binary search, we have two comparisons one for greater than and other for less than, otherwise its the mid value. How would you optimize so that we need to check only once?

bool binSearch(int array[], int key, int left, int right)
{

    mid = left + (right-left)/2;
    if (key < array[mid])
        return binSearch(array, key, left, mid-1);
    else if (key > array[mid])
        return binSearch(array, key, mid+1, right);
    else if (key == array[mid])
        return TRUE; // Found

    return FALSE; // Not Found
}

Is there a reason not to use bsearch()? – quinmars Mar 23 '09 at 15:34 — quinmars, Mar 23 '09 at 15:34

score 17 · Accepted Answer · edited Mar 23 '09 at 18:47

17

I would try the bsearch() standard function first. Chances are good that it performes better than your approach.

edited Mar 23 '09 at 18:47

Svante

50,694
11
78
122

answered Mar 23 '09 at 15:42

quinmars

11,175
8
32
41

glibc implementation of bsearch() looks unoptimized (ftp://ftp.gnu.org/gnu/glibc/) – sambowry Oct 06 '09 at 07:17
1

this looks quite decent to me, as an iterative implementation http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=stdlib/bsearch.c;hb=HEAD – Hasturkun Oct 06 '09 at 16:40
@sambowry, how would you optimize it? – quinmars Oct 06 '09 at 17:32
@quinmars, a good book describe the optimization process http://www.cs.bell-labs.com/cm/cs/pearls/ – sambowry Oct 06 '09 at 19:04
2

@sambowry, sorry, I don't have that book, and I won't buy it just to see how he is implementing a binary search. – quinmars Oct 06 '09 at 19:14
@quinmars: Use your local library. – jason Oct 07 '09 at 15:52
@Hasturkun: and more importantly, it's defined in `` so it can inline and optimize away the indirection through the callback function. (Modern compilers can inline away function pointers when the pointer is a compile-time constant.) This is a major advantage over stdlib `qsort` which does actually have to `call` even a trivial tiny compare function passing it pointers, not values, even for simple types. – Peter Cordes Mar 08 '19 at 22:18

score 8 · Answer 2 · edited Jun 20 '20 at 09:12

It's not advisable to try and optimize in the way you describe. From the Binary Search Algorithm article on Wikipedia:

About half the time, the first test will be true so that there will be only one comparison of a and b, but the other half of the time it will be false, and a second comparison forced. This is so grievous that some versions are recast so as not to make a second test at all thus not determining equality until the span has been reduced to zero, and thereby foregoing the possibility of early termination – remember that about half the time the search will happen on a matching value one iteration short of the limit.

It is quite easy to make this problem still worse (e.g. as in 3) by using an order such as

if a = b then action3
 else if a > b then action2
  else action1;

Rather than detecting equality early (as it might appear to), this will force two comparisons to be performed for all but the last iteration of a search.

Some languages, like Fortran, have a three-way comparison that allows this step to be done with one comparison that branches to three different sections (see the tenth line of the Three-way comparison example). If your language doesn't support a three-way test (most languages don't) then two comparisons is the best you can do.

I would advise you to check out the iterative implementation from the same article, though.

I did the equality test first and it was faster than equality test last (probably modern superscalar processor weirdness). You may want to check Knuth ex 6.2.1 23 where he states the one way comparison is only faster for N> 2**36 — paperhorse, Jul 01 '09 at 05:14

sambowry · Answer 3 · 2009-10-08T01:26:24.857

I tried to reconstruct the optimization steps of the binary search algorithm. I start with this iterative version:

int binsearch_1( arr_t array[], size_t size, arr_t key, size_t *index ){
  if( !array || !size ) return 0;
  arr_t *p=array;
  int found=0;
  while( size > 0 ){
    size_t w=size/2;
         if( p[w] <  key  ){ p+=w+1; size-=w+1; }
    else if( p[w] >  key  ){         size =w;   }
    else  /* p[w] == key */{ p+=w; found=1; break; }
  }
  *index=p-array; return found;
}

Eliminating comparisons from the loop:

int binsearch_2( arr_t array[], size_t size, arr_t key, size_t *index ){
  if( !array || !size ) return 0;
  arr_t *p=array;
  while( size > 0 ){
    size_t w=size/2;
    if( p[w+1] <= key ){ p+=w+1; size-=w+1; } else size =w;
  }
  *index=p-array; return p[0]==key;
}

Unrolling small sizes from the loop:

int binsearch_3( arr_t array[], size_t size, arr_t key, size_t *index ){
  if( !array || !size ) return 0;
  arr_t *p=array;
  while( size > 8 ){
    size_t w=size/2;
    if( p[w+1] <= key ){ p+=w+1; size-=w+1; } else  size =w;
  }
  if( size==8 ){ if( p[5] <=  key ){ p+=5; size=3; } else size=4; }
  if( size==7 ){ if( p[4] <=  key ){ p+=4; size=3; } else size=3; }
  if( size==6 ){ if( p[4] <=  key ){ p+=4; size=2; } else size=3; }
  if( size==5 ){ if( p[3] <=  key ){ p+=3; size=2; } else size=2; }
  if( size==4 ){ if( p[3] <=  key ){ p+=3; size=1; } else size=2; }
  if( size==3 ){ if( p[2] <=  key ){ p+=2; size=1; } else size=1; }
  if( size==2 ){ if( p[2] <=  key ){ p+=2; size=0; } else size=1; }
  if( size==1 ){ if( p[1] <=  key ){ p+=1;         }              }
  *index=p-array; return p[0]==key;
}

Reordering if statements, moving special cases [size==pow(2,N)-1] to the end:

int binsearch_4( arr_t array[], size_t size, arr_t key, size_t *index ){
  if( !array || !size ) return 0;
  arr_t *p=array;
  while( size > 8 ){
    size_t w=size/2;
    if( p[w+1] <= key ){ p+=w+1; size-=w+1; } else  size =w;
  }
  if( size==8 ){ if( p[5] <=  key ){ p+=5; size=3; } else size=4; }
  if( size==6 ){ if( p[4] <=  key ){ p+=4; size=2; } else size=3; }
  if( size==5 ){ if( p[3] <=  key ){ p+=3; size=2; } else size=2; }
  if( size==4 ){ if( p[3] <=  key ){ p+=3; size=1; } else size=2; }
  if( size==2 ){ if( p[2] <=  key ){ p+=2; size=0; } else size=1; }

  if( size==7 ){ if( p[4] <=  key ){ p+=4; size=3; } else size=3; }
  if( size==3 ){ if( p[2] <=  key ){ p+=2; size=1; } else size=1; }
  if( size==1 ){ if( p[1] <=  key ){ p+=1;         }              }
  *index=p-array; return p[0]==key;
}

Changing if statements to a switch statement:

int binsearch_5( arr_t array[], size_t size, arr_t key, size_t *index ){
  if( !array || !size ) return 0;
  arr_t *p=array;
  while( size > 8 ){
    size_t w=size/2;
    if( p[w+1] <= key ){ p+=w+1; size-=w+1; } else  size =w;
  }
  if( size==8 ){ if( p[5] <=  key ){ p+=5; size=3; } else size=4; }
  if( size==6 ){ if( p[4] <=  key ){ p+=4; size=2; } else size=3; }
  if( size==5 ){ if( p[3] <=  key ){ p+=3; size=2; } else size=2; }
  if( size==4 ){ if( p[3] <=  key ){ p+=3; size=1; } else size=2; }
  if( size==2 ){ if( p[2] <=  key ){ p+=2; size=0; } else size=1; }

  switch(size){
    case 7: if( p[4] <= key ) p+=4;
    case 3: if( p[2] <= key ) p+=2;
    case 1: if( p[1] <= key ) p+=1;
  }
  *index=p-array; return p[0]==key;
}

Extending switch statement to handle all of the special cases:

int binsearch_6( arr_t array[], size_t size, arr_t key, size_t *index ){
  if( !array || !size ) return 0;
  arr_t *p=array;
  switch( size ){
#define C(n) case ((size_t)1<<n)-1: if( p[1<<(n-1)]<=key ) p+=1<<(n-1); 
#if SIZE_MAX == UINT64_MAX 
                                              C(63) C(62) C(61)
    C(60) C(59) C(58) C(57) C(56) C(55) C(54) C(53) C(52) C(51)
    C(50) C(49) C(48) C(47) C(46) C(45) C(44) C(43) C(42) C(41)
    C(40) C(39) C(38) C(37) C(36) C(35) C(34) C(33) C(32)
#endif 
                                                          C(31)
    C(30) C(29) C(28) C(27) C(26) C(25) C(24) C(23) C(22) C(21)
    C(20) C(19) C(18) C(17) C(16) C(15) C(14) C(13) C(12) C(11)
    C(10) C( 9) C( 8) C( 7) C( 6) C( 5) C( 4) C( 3) C( 2) C( 1)
#undef C 
      break;
    default:
      while( size > 0 ){
        size_t w=size/2;
        if( p[w] < key ){ p+=w+1; size-=w+1; } else size=w;
      }
  }
  *index=p-array; return p[0]==key;
}

Eliminating the general case handling from the code: [ w is the smallest number: w==pow(2,N)-1; size<=2*(w+1) ]

int binsearch_7( arr_t array[], size_t size, arr_t key, size_t *index ){
  if( !array || !size ) return 0;
  arr_t *p=array;
  size_t w=(size-1)>>1; w|=w>>1; w|=w>>2; w|=w>>4; w|=w>>8; w|=w>>16;
#if SIZE_MAX == UINT64_MAX
  w|=w>>32;
#endif
  if( p[w]<key ) p+=size-w-1;
  switch( w ){
#define C(n) case ((size_t)1<<n)-1: if( p[1<<(n-1)]<=key ) p+=1<<(n-1);
#if SIZE_MAX == UINT64_MAX
                                              C(63) C(62) C(61)
    C(60) C(59) C(58) C(57) C(56) C(55) C(54) C(53) C(52) C(51)
    C(50) C(49) C(48) C(47) C(46) C(45) C(44) C(43) C(42) C(41)
    C(40) C(39) C(38) C(37) C(36) C(35) C(34) C(33) C(32)
#endif
                                                          C(31)
    C(30) C(29) C(28) C(27) C(26) C(25) C(24) C(23) C(22) C(21)
    C(20) C(19) C(18) C(17) C(16) C(15) C(14) C(13) C(12) C(11)
    C(10) C( 9) C( 8) C( 7) C( 6) C( 5) C( 4) C( 3) C( 2) C( 1)
#undef C
  }
  *index=p-array; return p[0]==key;
}

The last step what i made was the simplifying of the case labels [from: '((size_t)1<<n)-1' to: 'n'] but i found that the modified code was slower on my old PC than the previous version.

score 4 · Answer 4 · answered Mar 23 '09 at 15:26

4

If you want to optimize your binary search algorithm you must replace recursion with iteration. See examples at wikipedia.

Let me know if you need more details.

answered Mar 23 '09 at 15:26

mgamer

13,580
25
87
145

score 4 · Answer 5 · answered Mar 23 '09 at 15:37

In the code you posted, you can remove the last comparison. That is, if key is not less than array[mid] or greater than array[mid], then by definition it's equal. So your code becomes:

mid = left + (right-left)/2;
if (key < array[mid])
    return binSearch(array, key, left, mid-1);
else if (key > array[mid])
    return binSearch(array, key, mid+1, right);
else 
    return TRUE; // Found

Also note that I've removed the last line. In your original code, it's impossible for the return FALSE; ever to be executed. I would assume that you have a check at the beginning of your binSearch which checks to see if left <= right.

In C, there is no way to do a three-way comparison/branch based on less than, equal, greater than, so you have to do two comparisons to select among three possibilities.

score 3 · Answer 6 · answered Mar 07 '10 at 10:17

// Optimized Binary Search Implementation
int binsearch(int* data, int size, int val)
{
    int result = -1;
    int start, mid, end;
    start = 0;
    end = size - 1;
    mid = (start + end) >> 1;
    while (start <= end && data[mid] != val)
    {
        if (data[mid] > val)
            end = mid - 1;
        else
            start = mid + 1;
        mid = (start + end) >> 1;
    }
    if (val == data[mid])
        result = mid;
    return result;
}

score 2 · Answer 7 · answered Oct 05 '09 at 00:54

Ganesh M - If the key does not exist in the array, then your function will be stuck inside of a never ending loop. It can never return FALSE.

What is the optimal way to find the insertion point, if the key does not exist?

A conditional "if cascade" accounting for <, ==, and > will only return TRUE, or continue to compute forever.

I need the optimal condition to determine when a key has been isolated as not existing.

I know that this will slow down the search, but I want to slow it down by the least amount.

playing with log(N)/log(2) seems like a good idea, but when N is not a power of 2, things can go haywire.

Perhaps, I should choose an array with a size of power of 2, and use a simple while loop?

does checking if the midpoint == to either bound increase the number of comparisons too much?

score 2 · Answer 8 · answered Mar 23 '09 at 15:30

2

For integers it doesn't matter, don't do it.

For more expensive comparisons use -1, 0, 1 for <, =, > like in the C library function strcmp or Java's compareTo().

answered Mar 23 '09 at 15:30

starblue

55,348
14
97
151

actually, I think it is worth it for integers as it potentially removes two array accesses. You'd have to profile to be sure, of course. – Mar 23 '09 at 15:35
3

@Neil If the array were accessed more than once that would be a really lousy compiler. – starblue Mar 23 '09 at 15:37

score 1 · Answer 9 · answered May 29 '10 at 11:02

Its an exercise question in K&R second edition.

While(low <= high && arr[mid]!=num)
{
    if(arr[mid] > num)
    {
       low = mid+1;
    }
    else
    {
       high = mid-1;
    }
    mid = (low+high)/2;
}

if(arr[mid] == num)
{
    printf("found the ugly number");
    return mid;
}

jjxtra · Answer 10 · 2019-09-09T15:42:07.447

I've done some performance testing in C# with branchless version vs. normal branching version. The branching version always wins. I've tried to make the performance test produce random worst case results to hurt branching for found or not found results but it's possible it is not completely worst case scenario. This all makes sense, thanks to the great link provided by Peter Cordes at https://stackoverflow.com/a/54273248.

// choose on of these three...
#define BRANCHLESS_UNSAFE // branchless binary search
//#define BRANCHLESS_FROM_SOURCE // copy source from Array.BinarySearch
//#define BRANCHLESS_BUILTIN // Array.BinarySearch

// BRANCHLESS_UNSAFE is on average 4-5 ms slower than BRANCHLESS_BUILTIN on threadripper 1950x

using System;
using System.Diagnostics;
using System.Linq;

namespace BranchlessBinarySearchNamespace
{
    /// <summary>
    /// C# branchless binary search performance test
    /// The built in array binary search is currently faster in release mode, it's possible the compiler is smarter than me :)
    /// </summary>
    public static unsafe class BranchlessBinarySearch
    {
        // if index not found, returns ~(index of element to insert sorted) at.
        public static int BinarySearch<T>(T[] array, T value) where T : IComparable<T>
        {
            int cmp;
            int mid;
            int start = 0;
            int end = array.Length - 1;

#if BRANCHLESS_UNSAFE

            int** ptrArray = stackalloc int*[3];
            ptrArray[0] = &start;
            ptrArray[2] = &end;
            int g0;

#endif

            while (start <= end)
            {
                mid = start + ((end - start) >> 1);
                cmp = array[mid].CompareTo(value);
                if (cmp == 0)
                {
                    return mid;
                }

#if BRANCHLESS_UNSAFE

                // use -1 or 1 to access start or end in array of 3 ptr

                // compiler will turn into non branch most likely
                //g0 = cmp > 0 ? 1 : -1;

                // access sign bit, negate, multiply by 2 and add one. This gives us -1 if less than 0 or 1 if greater than 0.
                g0 = (-(int)((uint)cmp >> 31) * 2) + 1; 

                // index into ptrArr, if g0 is -1 (using end ptr) we access index 0 (1 + -1) which is where end ptr is
                // if g0 is 1 (using start ptr) we access index 2 (1 + 1) which is where start ptr is
                *(ptrArray[1 + g0]) = mid - g0;

#else

                // the above two lines equivelant to this:
                // compiler may be able to optimize this, but the above code is simply an illustration on how to do
                // the binary search logic without any branching logic at all
                else if (cmp > 0)
                {
                    end = mid - 1;
                }
                else
                {
                    start = mid + 1;
                }

#endif

            }
            return ~start;
        }

        private static unsafe void Test(int[] items, int[] find, bool output)
        {
            if (output)
            {
                const bool testBranchless =

#if BRANCHLESS_UNSAFE || BRANCHLESS_FROM_SOURCE

                true;

#else

                false;

#endif

                Console.Write("Testing {0} items, branchless: {1}... ", items.Length, testBranchless);
            }
            Stopwatch sw = Stopwatch.StartNew();

#if BRANCHLESS_UNSAFE || BRANCHLESS_FROM_SOURCE

            for (int i = 0; i < find.Length; i++)
            {
                BinarySearch(items, find[i]);
                //int idx = BinarySearch(items, find[i]);
                //if (idx != Array.BinarySearch(items, find[i]))
                {
                    //throw new ApplicationException("Binary search invalid result");
                }
            }

#else

            for (int i = 0; i < find.Length; i++)
            {
                Array.BinarySearch(items, find[i]);
                //int idx = Array.BinarySearch(items, find[i]);
                //if (idx != Array.BinarySearch(items, find[i])) // ensure same perf characteristic as branchless test
                {
                    //throw new ApplicationException("Binary search invalid result");
                }
            }

#endif

            if (output)
            {
                Console.WriteLine("Time: {0:0.00} ms", sw.Elapsed.TotalMilliseconds);
            }
        }

        private static unsafe void Test(int count, bool output = true)
        {
            Random r = new Random();
            int[] items = new int[count];
            for (int i = 0; i < items.Length; i++)
            {
                items[i] = r.Next(0, items.Length);
            }
            Array.Sort(items);
            int[] find = new int[1024];
            for (int i = 0; i < 1024; i++)
            {
                int found = r.Next(0, 11);
                if (found > 6)
                {
                    // found
                    find[i] = items[r.Next(0, items.Length)];
                }
                else
                {
                    // not found
                    find[i] = (r.Next(0, 2) == 0 ? r.Next(int.MinValue, 0) : r.Next(items.Length + 1, int.MaxValue));
                }
            }
            Test(items, find, output);
        }

        public static void Main()
        {
            for (int i = 0; i < 10; i++)
            {
                for (int j = 10; j <= 10000000; j *= 10)
                {
                    Test(j);
                }
            }
        }
    }
}

You could do `end = cmp>0 ? mid-1 : end` to make it branchless. That's not always best, though, if the data is cold in cache; speculative execution can effectively trigger prefetch and avoid a data dependency chain of cache misses. [About the branchless binary search](//stackoverflow.com/a/54273248) has more, and shows a way of writing it that only needs one `cmov` / other branchless ALU select. — Peter Cordes, Mar 08 '19 at 22:08
Updated with pointer array. It’s possible the compiler is smarter but this way absolutely ensures there is no inner branch at the cost of an array index and add instruction. — jjxtra, Sep 09 '19 at 05:17

score 0 · Answer 11 · answered Jun 07 '14 at 13:04

bool binSearch(int array[], int key, int len)
{
    int mid, low, high;

    low = 0;
    right = len - 1;
    mid = low + (high-low)/2;

    while ((low <= high) && (array[mid] != key)) {
         if (key < array[mid]) {
             high = mid - 1;
         } else {
             low = mid + 1;
         }
         mid = low + (high-low)/2;
    }

    if (array[mid] == key) {
        return TRUE;
    }
    return FALSE;
}

score -1 · Answer 12 · answered Jul 05 '12 at 18:52

-1

BinarySearch = function (array, value)  
{  
    var l = 0;  
    var r = array.length;  
    var mid;  
    while (l!=r)  
    {  
        mid = Math.round( (r+l)/2 )-1;  
        if(value > array[mid])  
            l=mid+1;  
        else  
            r=mid;  
    }  
    if(array[mid]==value)  
        return mid;  
    else  
    {  
        if( (mid < (array.length-1)) && (array[mid+1]==value) )  
            return mid+1;  
        else  
            return -1; // Not found  
    }  
}

source: http://calcs.appspot.com/docview?id=agVjYWxjc3IPCxIIRG9jdW1lbnQY0w8M

answered Jul 05 '12 at 18:52

Hardbug

15
1
4

What is that language - hardly looks like C to me... copy 'n' paste not good enough.... – t0mm13b Oct 10 '12 at 19:36
no such big prblem that is not C, but it doesn not show anything new? – Nick Jul 31 '15 at 20:55

Optimize Binary Search Algorithm

12 Answers12