0

I'm trying to run this implementation of binary search. I don't know why but it keeps giving me segfault error. I'm thinking the problem might be either the way I'm passing the array or there's something wrong with the recursive calls.

#include <stdio.h>

int hasBinarySearch(int *array, int low, int high, int element)
{

    int mid = (low + (high-low)) / 2;

  if (high>=low)
  {

    if (array[mid] == element)
    {
      return mid;
    }
    else if(array[mid]<element)
    {
        return hasBinarySearch(array, low, mid-1, element);
    }
    else
    {
      return hasBinarySearch(array, mid+1, high, element);
    }
  }

  return 0;
}



int main(void)
{
  int array[10] = {1,2,3,4,5,6,6,6,7,8};
  hasBinarySearch(array, 0, 9, 2);

  return 0;
}
Neo
  • 47
  • 6
  • 2
    your calculation `int mid = (low + (high-low)) / 2;` doesn't reliably return an integer value; mid could end up being `0` and calling with `mid-1` (`0-1`) would be unexpected behavior. – Claies Nov 04 '18 at 04:38
  • 1
    `(low + (high-low))` is just `high`. So your recursive call is doing the same thing over and over until you run out of memory. – Barmak Shemirani Nov 04 '18 at 04:46
  • 1
    `int mid = (low + (high-low)) / 2;` --> `int mid = low + (high-low) / 2;` – chux - Reinstate Monica Nov 04 '18 at 04:46

3 Answers3

3

I think that you have some misunderstanding about binary search. Read some article or book about it.

As @Claies commented, calculation of middle index is wrong. It should be low + (high - low) / 2. Just think about the internal division of two points in mathematics.

Also, you have to fix the parameters on recursive calls like the code below.

#include <stdio.h>

int hasBinarySearch(int *array, int low, int high, int element)
{

    int mid = low + (high - low) / 2; // changed
    printf("%d %d\n", high, low);
    if (high >= low)
    {
        if (array[mid] == element)
        {
            return mid;
        }
        else if (array[mid] < element)
        {
            return hasBinarySearch(array, mid + 1, high, element); // changed
        }
        else
        {
            return hasBinarySearch(array, low, mid - 1, element); // changed
        }
    }
    return 0;
}

int main(void)
{
    int array[10] = { 1,2,3,4,5,6,6,6,7,8 };
    hasBinarySearch(array, 0, 9, 2);
    return 0;
}
paganinist
  • 150
  • 1
  • 9
  • 1
    In fact, the `mid` calculation can be simply `mid = (low + high) / 2;` – ihavenoidea Nov 04 '18 at 05:02
  • 2
    @ihavenoidea: [Extra, Extra – Read All About It: Nearly All Binary Searches and Merge Sorts are Broken](https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html) – Jonathan Leffler Nov 04 '18 at 05:04
  • Look at that, I've been doing binary search wrong all my life. Thanks for the info @JonathanLeffler – ihavenoidea Nov 04 '18 at 05:06
  • @JonathanLeffler The sample code [there](https://ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html) indexes with `int` vs. `size_t`. If code is concerned about `int` overflow with extreme values, it should first use the right indexing type. And given unsigned math, `mid - 1` is then a concern. – chux - Reinstate Monica Nov 04 '18 at 05:40
  • @chux: yes, but even `size_t` can, in theory, overflow (though that's more likely with 32-bit than 64-bit). It's mainly pointing out that the `(low + (high - low) / 2)` calculation is not inherently bad. – Jonathan Leffler Nov 04 '18 at 05:43
  • @JonathanLeffler True your comment is good, yet the article's complaint about "Nearly All Binary Searches and Mergesorts are Broken", touts itself as a solution and yet itself still commits an error that is in the extreme range camp deserved being identified for its weakness. – chux - Reinstate Monica Nov 04 '18 at 05:48
2
int mid = (low + (high-low)) / 2; // wrong formula

@paganinist good answer points out the flaws in OP's search method and with a fix.


Yet to dig deeper.

Even though some compilers might be able to "un-recurse" code (Example), recursion is not needed here. A simple loop will suffice.

Array sizes can approach near maximum or exceed the range of int in extreme cases.
For sizes in the high int range, the following is better. @Jonathan Leffler

// int mid = (low + high)/2;  // could overflow
int mid = low + (high-low)/2; // better, will not overflow when low >= 0

To accommodate all array sizes, use size_t instead on int. This also handles sizes including those near and above INT_MAX.


Candidate solution that returns the address of the matching element or NULL if not found.

#include <stdlib.h>
#include <stdio.h>

int *BinarySearch_int(const int *array, size_t count, int key) {
  while (count > 0) {
    size_t mid = count / 2;
    if (key > array[mid]) {
      array += mid + 1;
      count -= mid + 1;
    } else if (key < array[mid]) {
      count = mid;
    } else {
      return (int *) &array[mid];
    }
  }
  return NULL;
}

Test code

bool BinarySearch_int_test(const int *array, size_t count, int key, bool target){
  int *p = BinarySearch_int(array, count, key);
  bool success = (p != NULL) == target && (p == NULL || *p == key);
  printf("f(Array:%p count:%zu, key:%2d) --> ptr:%p value:%2d success:%d\n",
      (void*) array, count, key, (void*) p, p ? *p : 0, success);
  return success;
}

int main(void) {
  int array[] = {10, 20, 30, 40, 50, 60};
  size_t n = sizeof array / sizeof array[0];
  for (size_t i = 0; i < n; i++) {
    BinarySearch_int_test(array, n, array[i], 1);
  }
  BinarySearch_int_test(array, n, 0, 0);
  for (size_t i = 0; i < n; i++) {
    BinarySearch_int_test(array, n, array[i] + 1, 0);
  }
}

Output

f(Array:0xffffcb90 count:6, key:10) --> ptr:0xffffcb90 value:10 success:1
...
f(Array:0xffffcb90 count:6, key:60) --> ptr:0xffffcba4 value:60 success:1
f(Array:0xffffcb90 count:6, key: 0) --> ptr:0x0 value: 0 success:1
...
f(Array:0xffffcb90 count:6, key:61) --> ptr:0x0 value: 0 success:1
chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
1

mid's calculation simplifies to high / 2 because you've added and then subtracted the lower bound out again. It looks like you meant to add half the difference to the lower bound, but the division occurs too late. It should be low + (high-low) / 2. (This is a bit more complicated than (low + high) / 2 but avoids the integer-math problem mentioned elsewhere.)

I think that segfault is happening when high goes below low and gets too small and you fall off the beginning of the array.

And @paganinist is right about the upper and lower cases being backwards.

gws
  • 459
  • 1
  • 7
  • 16