14

I have implemented a very simple binarySearch implementation in C# for finding integers in an integer array:

Binary Search

static int binarySearch(int[] arr, int i)
{
    int low = 0, high = arr.Length - 1, mid;

    while (low <= high)
    {
        mid = (low + high) / 2;

        if (i < arr[mid])
            high = mid - 1;

        else if (i > arr[mid])
            low = mid + 1;

        else
            return mid;
    }
    return -1;
}

When comparing it to C#'s native Array.BinarySearch() I can see that Array.BinarySearch() is more than twice as fast as my function, every single time.

MSDN on Array.BinarySearch:

Searches an entire one-dimensional sorted array for a specific element, using the IComparable generic interface implemented by each element of the Array and by the specified object.

What makes this approach so fast?

Test code

using System;
using System.Diagnostics;

class Program
{
    static void Main()
    {
        Random rnd = new Random();
        Stopwatch sw = new Stopwatch();

        const int ELEMENTS = 10000000;
        int temp;

        int[] arr = new int[ELEMENTS];

        for (int i = 0; i < ELEMENTS; i++)
            arr[i] = rnd.Next(int.MinValue,int.MaxValue);

        Array.Sort(arr);

        // Custom binarySearch

        sw.Restart();
        for (int i = 0; i < ELEMENTS; i++)
            temp = binarySearch(arr, i);
        sw.Stop();

        Console.WriteLine($"Elapsed time for custom binarySearch: {sw.ElapsedMilliseconds}ms");

        // C# Array.BinarySearch

        sw.Restart();
        for (int i = 0; i < ELEMENTS; i++)
            temp = Array.BinarySearch(arr,i);
        sw.Stop();

        Console.WriteLine($"Elapsed time for C# BinarySearch: {sw.ElapsedMilliseconds}ms");
    }

    static int binarySearch(int[] arr, int i)
    {
        int low = 0, high = arr.Length - 1, mid;

        while (low <= high)
        {
            mid = (low+high) / 2;

            if (i < arr[mid])
                high = mid - 1;

            else if (i > arr[mid])
                low = mid + 1;

            else
                return mid;
        }
        return -1;
    }
}

Test results

+------------+--------------+--------------------+
| Attempt No | binarySearch | Array.BinarySearch |
+------------+--------------+--------------------+
|          1 | 2700ms       | 1099ms             |
|          2 | 2696ms       | 1083ms             |
|          3 | 2675ms       | 1077ms             |
|          4 | 2690ms       | 1093ms             |
|          5 | 2700ms       | 1086ms             |
+------------+--------------+--------------------+
Daniel
  • 10,641
  • 12
  • 47
  • 85
  • 8
    You can [take a look](http://referencesource.microsoft.com/#mscorlib/system/array.cs,b92d187c91d4c9a9) at the source for hints. – Blorgbeard Aug 08 '16 at 20:44
  • 4
    Did you run your test in Release outside of VS? – Ivan Stoev Aug 08 '16 at 20:45
  • 6
    I just did, and on my machine your `binarySearch` is about 2.5 times **faster** than the `Array` one. – Ivan Stoev Aug 08 '16 at 20:57
  • Probably because they wrote better (i.e., more performant) code than yours. Analyze the code @Blorgbeard linked – GreatAndPowerfulOz Aug 08 '16 at 20:58
  • 2
    @IvanStoev me too, same result. I'm guessing hard-coding the data-type to `int` makes for much faster comparisons. – Blorgbeard Aug 08 '16 at 21:02
  • Note also that while theoretically a non-issue, using truly random input (rather than e.g. a constant seed value) can result in different results on different machines. Daniel, at a minimum you need to provide a good [mcve] that will reliably reproduce your results elsewhere. Even better, describe what you've already done to investigate and state what _specific_ aspect still confuses you. Use a profiler to determine what the time-consuming areas of each implementation are, for comparison. – Peter Duniho Aug 08 '16 at 21:10
  • Same result for me, release code is much faster. You could gain more performance by replacing `mid = (low + high) / 2` with `mid = (low + high) >> 1` – Bastian Thiede Aug 08 '16 at 21:55
  • @BastianThiede as a bonus, you can make it more correct than the original, by adding some free casts: `mid = (int)unchecked((uint)(low + high) >> 1)` – harold Aug 08 '16 at 22:34
  • You are comparing a generic BinarySearch against a custom BinarySearch specifically designed for `int` only. Isn't it a bit flawed by not having to rely on some `Comparer` ? – Sehnsucht Aug 08 '16 at 22:52
  • Why not checking the performance @Sehnsucht ? If he knows the actual type and performance is an issue, there is no need to rely on a generic `Comparer` – Bastian Thiede Aug 08 '16 at 22:57
  • @BastianThiede I didn't say he (or anyone) can't check performance. – Sehnsucht Aug 08 '16 at 23:03

1 Answers1

16

Your code is faster when run outside Visual Studio:

Yours vs Array's:

From VS - Debug mode: 3248 vs 1113
From VS - Release mode: 2932 vs 1100
Running exe - Debug mode: 3152 vs 1104
Running exe - Release mode: 559 vs 1104

Array's code might be already optimized in the framework but also does a lot more checking than your version (for instance, your version may overflow if arr.Length is greater than int.MaxValue / 2) and, as already said, is designed for a wide range of types, not just int[].

So, basically, it's slower only when you are debugging your code, because Array's code is always run in release and with less control behind the scenes.

Andrew
  • 7,602
  • 2
  • 34
  • 42
  • Can you comment the line: Array.Sort(arr) and post your result again. – M.Hassan Aug 08 '16 at 23:10
  • 2
    @M.Hassan binary search requires a sorted array as input. The sort is not being included in the timings. – Blorgbeard Aug 09 '16 at 01:59
  • 1
    Very interesting answer and I can reproduce the same results. Marked as accepted – Daniel Aug 09 '16 at 05:40
  • This answer contradicts itself. it says that "it's slower only when [not in release]", but the numbers posted show that *in Release Mode*, in VS, the performance is about as slow as in Debug Mode: so what accounts for *that*? – user2864740 Nov 04 '18 at 02:53
  • By "debugging" I was referring to either executing a binary in "debug mode" or running the code from within Visual Studio, which can still be considered "debugging" even using "release mode". – Andrew Nov 05 '18 at 00:34